How Event Management brings organizations to the next level of IT operations with ServiceNow
What is Event Management
As part of the ITOM suite, Event Management is an application built on the ServiceNow Now Platform. It collects events from a variety of monitoring systems into one place and then processes that data to produce actionable alerts and incidents.
Out of the box, it integrates with many monitoring tools and protocols that organizations often already have in place, such as:
- HP Operations Manager
- HP OMi
- IBM Netcool/OMNIbus
- Microsoft SCOM (also metric collection in Operational Metrics)
- Nagios XI
- OP5
- Opsview
- PRTG
- SolarWinds NPM and SAM
- VMware Hyperic
- VMware vRealize Operations
- Amazon Web Services
- BMC TrueSight
- Oracle Enterprise Manager
- REST (ideal for custom integrations)
- Web Services
- SNMP Traps (very useful for custom integrations)
- Email (integration method of last resort)
- Datadog
How Event Management works
Event Management uses the MID Server to collect events from the infrastructure through connectors to third-party monitoring tools. An event is a notification from a CI or a cloud resource that the IT team should be aware of. Events are categorized by severity:
- Critical: Immediate action is required. The resource is either not functional or critical problems are imminent.
- Major: Major functionality is severely impaired or performance has degraded.
- Minor: Partial, non-critical loss of functionality or performance degradation occurred.
- Warning: Attention is required, even though the resource is still functional.
- Info: An alert is created. The resource is still functional.
- Clear: No action is required. An alert is not created from this event. Existing alerts are closed.
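The severity levels above can be modeled as a small enumeration. This is a minimal sketch: the numeric codes and the helper names are assumptions made for illustration and should be checked against your own instance. It also simplifies by treating every severity except Clear as alert-producing.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Event severities from the list above.

    The numeric codes are an assumption made for this sketch.
    """
    CLEAR = 0
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4
    INFO = 5


def creates_alert(severity: Severity) -> bool:
    # Simplification: every severity except Clear opens an alert.
    return severity is not Severity.CLEAR


def closes_existing_alerts(severity: Severity) -> bool:
    # A Clear event closes alerts that earlier events opened.
    return severity is Severity.CLEAR
```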
Events can be collected in two ways:
The MID Server can use connectors to pull events from third-party monitoring tools, or the monitoring tools can push the data to the instance. When events are pushed, a listener has to be configured either on the MID Server or directly on the instance as a web service. Monitoring tools can also send events to the instance by email.
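To make the push model concrete, the sketch below builds (but does not send) an HTTP request that a monitoring tool might use to push events to the instance. The instance URL, the endpoint path, and the event field names are assumptions for illustration; verify them against the documentation for your release before relying on them.

```python
import json
from urllib import request

# Assumed values for illustration only.
INSTANCE_URL = "https://example.service-now.com"
EVENT_ENDPOINT = "/api/global/em/jsonv2"


def build_event(source, node, severity, description, resource=""):
    """Return one event record as a plain dictionary (assumed field names)."""
    return {
        "source": source,            # monitoring tool that raised the event
        "node": node,                # host or device the event refers to
        "resource": resource,        # component on the node, e.g. a disk
        "severity": str(severity),   # severity code as a string
        "description": description,
    }


def build_push_request(events, instance_url=INSTANCE_URL):
    """Wrap the events in a POST request without sending it."""
    body = json.dumps({"records": events}).encode("utf-8")
    return request.Request(
        instance_url + EVENT_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


event = build_event("Nagios XI", "web01", 1, "Disk usage above 95%", "C:")
push = build_push_request([event])
```

Sending the request would additionally require authentication and a call such as `urllib.request.urlopen(push)`, which is left out here so the sketch has no side effects.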
Event Management Monitoring via Now Platform
Besides events, metrics are also collected. A metric is a measure of an operating characteristic of a device over a certain time period, such as CPU or memory usage. The platform collects events and metrics to help the IT team monitor health, prevent outages, and resolve issues quickly with minimal impact on operations.
Collected events can trigger an alert that notifies the IT team about an issue. This is done by defining event rules that map event fields to alert fields and bind a specific CI to the alert. Event Management uses Operational Intelligence to filter out certain alerts by applying thresholds, to group alerts, and to identify the primary alert. Finally, Event Management uses alert rules to initiate remediation actions such as creating an incident, launching a remediation workflow, or recommending a knowledge article. Event Management also calculates the impact on CIs. Operational Intelligence normalizes metrics and identifies anomalies, which are sent to the instance for further processing.
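The flow from event to alert to remediation can be sketched in a few lines. This is not ServiceNow's implementation: the `EventRule` class, the `remediation_for` function, the CI lookup table, and the severity-to-action choices are all hypothetical and only illustrate the idea of mapping event fields, binding a CI, and then choosing an action.

```python
from dataclasses import dataclass


@dataclass
class EventRule:
    """Hypothetical event rule: filter, field mapping, and CI binding."""
    source: str        # only events from this source match the rule
    field_map: dict    # event field name -> alert field name
    ci_lookup: dict    # node name -> CI identifier (stand-in for a CMDB)

    def matches(self, event: dict) -> bool:
        return event.get("source") == self.source

    def to_alert(self, event: dict) -> dict:
        # Copy the mapped event fields into the alert.
        alert = {
            alert_field: event.get(event_field)
            for event_field, alert_field in self.field_map.items()
        }
        # Bind the alert to a CI, or None when the node is unknown.
        alert["ci"] = self.ci_lookup.get(event.get("node"))
        return alert


def remediation_for(alert: dict) -> str:
    """Hypothetical alert rule: pick an action from the alert severity."""
    if alert.get("severity") == 1:
        return "create_incident"
    if alert.get("severity") == 2:
        return "launch_remediation_workflow"
    return "recommend_knowledge_article"


rule = EventRule(
    source="Nagios XI",
    field_map={"severity": "severity", "description": "short_description"},
    ci_lookup={"web01": "CI-0001"},
)
event = {"source": "Nagios XI", "node": "web01", "severity": 1,
         "description": "Disk usage above 95%"}
alert = rule.to_alert(event) if rule.matches(event) else None
```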
Event Management Dashboard – and how it’s used
The Event Management dashboard is the main view for monitoring events. It can display business services or alert groups.
- The size of a tile represents the priority of a service or group.
- The color indicates the highest-severity alert in the service or group.
- Alerts related to a particular service or group are visible at the bottom of the dashboard.
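The coloring rule for a tile amounts to taking the most severe open alert in a service or group. A minimal sketch, assuming numeric severity codes where 1 is Critical, 5 is Info, and 0 is Clear (these codes and the function name are illustrative assumptions):

```python
# Assumed numeric codes: lower number means more severe, 0 means Clear.
SEVERITY_NAMES = {
    0: "Clear", 1: "Critical", 2: "Major",
    3: "Minor", 4: "Warning", 5: "Info",
}


def tile_severity(alert_severities):
    """Return the name of the highest severity among a tile's alerts."""
    # Ignore Clear entries; they do not represent an open problem.
    open_severities = [s for s in alert_severities if s != 0]
    if not open_severities:
        return "Clear"
    # The smallest code is the most severe alert.
    return SEVERITY_NAMES[min(open_severities)]
```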
Double-clicking a service on the dashboard opens its topology map. The topology map shows the impact tree and relationships, with related alerts listed below. The impact history scroll bar can be used to move back in time and view historical data.
The Event Management overview is another way to see the health status of the infrastructure.
Anomaly Events and Operational Intelligence
Operational Intelligence is also used to collect performance metrics from the system. It uses event rules to bind CIs to the metrics. Metric data is normalized by applying a statistical model, which establishes a normal range of values. All metric values are checked against the model, and values outside the range are identified as anomalies and sent to the instance as anomaly events. On the instance, event rules create anomaly alerts based on the anomaly events. An anomaly alert can initiate the creation of an incident, launch a remediation workflow, or recommend a knowledge article.
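The article does not describe the statistical model itself, so the sketch below substitutes a deliberately simple one: the normal range is the historical mean plus or minus `k` standard deviations, and any new value outside that range is flagged. The function names and the default `k = 3` are assumptions for illustration.

```python
from statistics import mean, stdev


def normal_range(history, k=3.0):
    """Return (lower, upper) bounds learned from historical metric values."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs 2+ values
    return mu - k * sigma, mu + k * sigma


def find_anomalies(history, new_values, k=3.0):
    """Return the new values that fall outside the normal range."""
    lower, upper = normal_range(history, k)
    return [v for v in new_values if v < lower or v > upper]


# CPU usage in percent: a stable history, then new samples to check.
cpu_history = [50, 52, 48, 51, 49, 50]
anomalies = find_anomalies(cpu_history, [51, 70, 49, 30])
```

With this history the range is roughly 45.8 to 54.2, so the samples 70 and 30 are reported as anomalies while 51 and 49 are not.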
Anomalies can be reviewed with the Anomaly Map module, which gives an overview of anomalies for individual CIs. The chart shows the anomaly history, and the color of each tile represents the highest anomaly score for the selected CI at that time.
Metric Explorer is a very useful module for reviewing and charting metrics. It can show metrics on a single pane, correlate multiple metrics, and display anomaly scores and anomaly bounds on the chart.
Power of Event Management and Operational Intelligence
Event Management together with Operational Intelligence helps shared services organizations centralize IT operations by delivering a single-pane-of-glass view of business service health, and solve problems much faster. The benefits are:
- reduced Mean Time To Repair,
- better understanding of root causes,
- increased value from existing tools,
- improved service availability.