Event Management has been a feature in ServiceNow for quite a long time. Starting with the Geneva release, a new feature called Service Analytics was introduced. Since the Helsinki release, Service Analytics has been available to all customers in the Event Management module. Before going into more detail, I would like to talk about machine learning and AI.
Machine learning has been a hot topic this decade, and companies across industries have shown great interest in it. According to SAS, “Machine learning is a method of data analysis that automates analytical model building. Using algorithms that iteratively learn from data, machine learning allows computers to find hidden insights without being explicitly programmed where to look.”
For many problems, simple models such as linear regression are enough. But when it comes to more complex supervised learning models, such as neural networks, things change. Is deductive reasoning sufficient, or is inductive reasoning needed?
There are two main areas of Service Analytics you need to be familiar with: correlated alert groups (an automated way to group alerts that are correlated in time) and root-cause CI analysis (an automated way to identify the root-cause CI for Service Mapping topologies).
Both techniques are useful, and they provide event analysis without human interaction. They rapidly identify which service impacts should be addressed first. This is just the beginning. Service Analytics is a very large area, and interest from developers is growing fast.
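To make the idea of time-correlated alert groups more concrete, here is a minimal sketch in Python. It is not ServiceNow's actual algorithm, which learns patterns from historical alerts; it simply assumes that alerts arriving within a fixed time window of each other belong together. The `Alert` class, the `group_by_time` function, and the 300-second window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record: the affected CI and when the alert fired."""
    ci: str
    timestamp: float  # seconds since some reference point


def group_by_time(alerts, window=300.0):
    """Group alerts so that each one is within `window` seconds of the
    previous alert in the same group (assumed heuristic, not the
    ServiceNow implementation)."""
    groups = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if groups and alert.timestamp - groups[-1][-1].timestamp <= window:
            # Close enough in time to the last alert: same group.
            groups[-1].append(alert)
        else:
            # Too far apart (or first alert): start a new group.
            groups.append([alert])
    return groups


if __name__ == "__main__":
    sample = [Alert("db01", 0), Alert("app01", 60), Alert("web01", 1000)]
    for group in group_by_time(sample):
        print([a.ci for a in group])
```

A real correlation engine would also look at how often the same alerts have appeared together in the past, rather than relying on a single time window.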
What does the future promise?
It's undeniable that we are heading very fast into a new era of managing and manipulating data. With the amount of data growing rapidly, there must be a way to process it quickly, and a machine learning system is one way to do so. Using machine learning algorithms in Service Analytics, we can filter through the noise of alerts and present appropriate groups, identifying patterns based on what was learned in the past.
The Istanbul release took another big step forward by introducing Operational Metrics. Curious? Let’s look…
Instead of taking and analysing events from monitoring tools, we are going to use raw metric data. This method allows us to apply machine learning algorithms to automatically recognize the threshold for a given metric at a given time. Let’s see an example: suppose we have metric data on DB transactions per second, with enough data points. We can begin to learn the data pattern and set the upper and lower bounds (thresholds) that the metric should stay within. In simple words, suppose DB transactions per second is around 1000: machine learning might suggest that the upper bound should be no more than 3000 and the lower bound no less than 100. So values are expected to stay between 100 and 3000, and when a value falls outside that range, an anomalous event with a score of 0 to 10 is generated depending on how far the value deviates from the model. That event can in turn raise an anomaly alert, which allows us to tie it all back to Event Management.
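The threshold idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model ServiceNow uses: the bounds are taken as the mean plus or minus `k` standard deviations of the historical samples, and the score rises linearly from 0 to 10 as the value moves one band-width beyond a bound. The function names `learn_bounds` and `anomaly_score`, the factor `k`, and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def learn_bounds(samples, k=3.0):
    """Learn (lower, upper) bounds from history as mean +/- k standard
    deviations (an assumed, simplified stand-in for the real model)."""
    mu = mean(samples)
    sigma = stdev(samples)
    # A rate such as transactions per second cannot be negative.
    return max(0.0, mu - k * sigma), mu + k * sigma


def anomaly_score(value, lower, upper, max_score=10.0):
    """Return 0 inside the bounds; outside, scale linearly with the
    distance from the nearest bound, capped at max_score."""
    if lower <= value <= upper:
        return 0.0
    width = upper - lower
    if width <= 0:
        # Degenerate band (no variation in history): any deviation is maximal.
        return max_score
    distance = lower - value if value < lower else value - upper
    return min(max_score, max_score * distance / width)


if __name__ == "__main__":
    history = [900, 950, 1000, 1050, 1100]  # made-up transactions per second
    lower, upper = learn_bounds(history)
    for value in (1000, 1300, 5000):
        print(value, round(anomaly_score(value, lower, upper), 2))
```

A production model would also account for time of day and seasonality, which is why the text speaks of the threshold "for a given metric at a given time"; this sketch uses a single static band for simplicity.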
Does this mean you will never suffer an outage again? Not quite. We all want to prevent outages rather than recover from them, and machine learning gets us closer to that goal. It will not solve the issue completely, and there is still a long journey ahead, but AI will be part of the solution and of the future of operations.
To give a better view of what this looks like, let's introduce a dashboard called Anomaly Map. This map shows the anomalies that different CIs (servers, applications, etc.) have experienced over time.
To get a better idea, we can go into the specific metric and see the data trend. We can see the raw metric value (blue line), the threshold (red dotted line), and the anomaly score (the green, blue, yellow lines on the bottom) across time.
With all these new capabilities, we’re beginning to see the possibilities opening up when it comes to managing the health of your business services. This is where we’re making a very active investment in operational intelligence: by ingesting different types of data from different sources and overlaying them on top of the CMDB, we can provide not only descriptive analytics but also predictive analytics, in order to prevent outages before they happen!
In the next article, we will give a more concrete example of predictive analytics and the possibilities it opens up, with more illustrated examples and ideas for what you can do with predictive and AI analytics.