Machine learning can elevate an IT organization’s performance monitoring strategy – and, given the wide array of mature algorithms and frameworks now available, it’s not as complicated to learn as it used to be.
Copyright by searchitoperations.techtarget.com
Machine learning plays a growing role in IT organizations.
Let’s review some of the key concepts related to
Most monitoring systems vacuum up logs, parse individual fields and then display them on a dashboard. But to predict or detect an outage, or anticipate a surge in demand, IT teams require metrics from a wide variety of systems (business, technical and external) fed into an algorithm.
However, each application and business is different, so there cannot be one single algorithm built into performance monitoring tools.
Instead, IT admins must write this code themselves. This process isn’t terribly complicated, but does require knowledge of
Think of
There are metrics that affect IT system performance, such as spikes in traffic volume, and metrics that reflect that performance, such as web page latency. Machine learning enables admins to use both, which is another improvement over the log-scraping-in-isolation approach to IT monitoring. Some metrics an IT admin could plug into that “black box” include:
- network traffic volume, by source and target IP address;
- memory;
- storage in use;
- end-to-end app latency;
- replication latency; and
- message queue length.
An admin with a spreadsheet of this data would have to guess which data points might be correlated. Data science eliminates the guesswork: it points out which items actually correlate, and provides the tools to flag anomalies and make predictions regarding system health and demand. And it can do this with hundreds of metrics, whereas a human with a spreadsheet can realistically compare only two or three at a time.
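To make the idea of finding correlations and flagging anomalies concrete, here is a minimal sketch in Python using only the standard library. The metric names, the sample values and the thresholds are all made up for illustration. Pearson correlation and a z-score test are simply two common, basic techniques chosen for this example, not necessarily what a production monitoring pipeline would use.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical per-minute samples; names and values are invented for illustration.
metrics = {
    "traffic_mbps":     [120, 135, 150, 170, 160, 420, 180, 165],
    "app_latency_ms":   [80, 85, 92, 101, 97, 260, 105, 99],
    "storage_used_pct": [61, 61, 61, 62, 62, 62, 62, 63],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlated_pairs(data, threshold=0.9):
    """Return (metric_a, metric_b, r) for every pair with |r| >= threshold."""
    names = sorted(data)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs

def anomalies(values, z_limit=2.0):
    """Return indexes of samples more than z_limit standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # a flat series has no outliers by this definition
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_limit]

print(correlated_pairs(metrics))
print(anomalies(metrics["traffic_mbps"]))
```

With this sample data, traffic volume and app latency come out as the only strongly correlated pair, and the traffic spike in the sixth sample is the only value flagged. The same loop works unchanged for hundreds of metrics, which is the scale at which a spreadsheet stops being practical.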
The challenge with labelled data
There are two primary kinds of
To support these predictive models, however, IT teams require classified or labelled data. This data captures relationships between a cause and an effect — such as a certain IT metric that results in a certain system status. A lack of comprehensive labelled data sets remains one of the biggest hurdles to the use of
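As a rough illustration of what labelled data looks like, the sketch below pairs each set of metric readings (the cause) with a known system status (the effect) and fits a very simple nearest-centroid classifier to it. The metrics chosen, the values, the status labels and the classifier itself are assumptions made for this example. A real model would need far more labelled samples, and the features would normally be scaled first.

```python
from math import dist
from statistics import mean

# Hypothetical labelled samples:
# (message_queue_length, replication_latency_ms) -> observed system status
labelled = [
    ((12, 40), "healthy"),
    ((15, 45), "healthy"),
    ((10, 38), "healthy"),
    ((220, 900), "degraded"),
    ((260, 1100), "degraded"),
    ((240, 950), "degraded"),
]

def train(samples):
    """Compute the mean feature vector (centroid) for each label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the new reading."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(labelled)
print(predict(centroids, (230, 1000)))
print(predict(centroids, (14, 50)))
```

Without the status column there would be nothing to train on, which is why a shortage of labelled data is such a practical obstacle.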
Let’s look at an example of how to use