ServiceNow Lightstep Incident Response Management Boosts Observability
ServiceNow expands its Lightstep observability platform with an incident response management service to help IT operations quickly identify and remediate issues.
ServiceNow has extended its Lightstep observability platform with a new incident response management service that became generally available on March 14.
ServiceNow has been steadily improving the platform since acquiring Lightstep in May 2021. With the new incident response service, ServiceNow now offers capabilities that help organizations not only observe IT operations issues but also identify them quickly and follow a path to remediation.
Incident response is a broad category in IT operations that can include response to cybersecurity, performance, and availability issues.
"The reliability of customer-facing applications is kind of do or die for most enterprises, and so there's a ton of strategic importance around reliability initiatives, and Lightstep's brand has always been tied to those things," Lightstep's co-founder and CEO Ben Sigelman told ITPro Today. "But as part of ServiceNow, we're able to move a lot faster."
How ServiceNow Lightstep Incident Response Brings Workflow to Observability
Lightstep's observability technology provides details on how a given service or application is running. What's often missing from observability is workflow: a set of integrated processes that enable IT operations professionals to act quickly on observability data.
"The workflow needs to be really precise and crisp, and that's why we're doing this announcement," Sigelman said. "Incident management is the lifeblood of reducing MTTR [mean time to resolution]."
Observability tools can provide information about logging and application performance metrics, according to Rohit Jainendra, vice president and general manager of emerging businesses at ServiceNow. Site reliability engineers (SREs) and IT operations professionals commonly use multiple observability tools and then have to manually put information together when an incident occurs to figure out how to fix the issue, Jainendra said.
Workflow automation is a foundational element of the ServiceNow platform, and ServiceNow is now applying that foundation to incident management.
"You can set up rules about who to notify and when to notify them," Jainendra told ITPro Today. "It brings out the core technologies that we have in ServiceNow."
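To make the idea of "who to notify and when" concrete, here is a minimal Python sketch of rule-based notification. It is not ServiceNow's or Lightstep's API; the `Alert` and `Rule` types, the service names, the on-call targets, and the severity scale (1 is most severe) are all hypothetical, chosen only to illustrate how such rules might be evaluated.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert shape; real tools use their own schemas."""
    service: str
    severity: int  # assumption: 1 is the most severe


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: who to notify, and how long to wait first."""
    service: str        # service name, or "*" to match any service
    max_severity: int   # rule applies when alert.severity <= max_severity
    notify: str         # illustrative target name, not a real endpoint
    delay_minutes: int  # 0 means notify immediately


# Illustrative rule set; every name here is made up for the example.
RULES = [
    Rule("checkout", 1, "checkout-oncall", 0),
    Rule("checkout", 3, "checkout-team-channel", 15),
    Rule("*", 1, "sre-oncall", 0),
]


def notifications_for(alert: Alert, rules: List[Rule]) -> List[Tuple[int, str]]:
    """Return (delay_minutes, target) pairs for every matching rule,
    ordered so the earliest notifications come first."""
    matches = [
        (rule.delay_minutes, rule.notify)
        for rule in rules
        if rule.service in ("*", alert.service)
        and alert.severity <= rule.max_severity
    ]
    return sorted(matches)


if __name__ == "__main__":
    for delay, target in notifications_for(Alert("checkout", 1), RULES):
        print(f"notify {target} after {delay} min")
```

In this sketch, a severity 1 alert on the hypothetical `checkout` service pages two on-call targets immediately and posts to a team channel after 15 minutes, while a lower-severity alert on an unlisted service matches no rule at all.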
Identifying Issues for Incident Response Management
The Lightstep Incident Response platform can ingest alerts and data from observability tools beyond Lightstep's own technology, including New Relic and Datadog.
"We ingest alerts from all these different tools because as we talked to customers, they were using multiple tools," Jainendra said.
With visibility into IT operations drawn from multiple observability tools, the Lightstep Incident Response service can build workflows for collaboration, incident investigation, and remediation, all within a single service.
Service availability is a main mission of ServiceNow, Jainendra said. Availability issues can result from poor performance or from an outage, in which a service isn't working at all. Security problems, such as an attack or a vulnerability, can also lead to service availability issues.
While observability tools can help identify a potential incident, so too can user-facing portals that an organization's customers interact with.
"The truth of the matter is a lot of times the person who notices that there's a problem is actually the customer, and then they call in and report that issue," Jainendra said.
"Having the right workflow around how to take that incoming customer issue, route it to the right team, and make sure that the right team is responding right away is part of the platform."
Routing a given issue to the right team or person is a core value of the Lightstep Incident Response platform, according to Sigelman.
"There's a huge amount of value in reducing the time to understanding an issue and frankly time to innocence, which is sort of a joke that we say internally," Sigelman said. "There are a lot of teams along the way that have nothing to do with an incident, but the issue passes through their services and if you can help them get back to their job, that's actually a major productivity win for everyone."