Nobl9 Advances State of Service-Level Objectives at SLOconf 2023
Site reliability engineering efforts get a boost with new SLO updates, including a preview of generative AI.
Nobl9 hosted its annual SLOconf event May 15-18, providing viewpoints on the current and future state of service-level objectives (SLO).
Alongside the conference, the company announced a series of product updates for its platform, which helps organizations with the whole process of using, managing, and measuring service-level objectives as a way to optimize operations. The concept of SLO is related to the growing practice of site reliability engineering (SRE) and to helping design and maintain resilient systems.
Nobl9 created its technology to assist organizations in achieving their service-level objectives (SLOs). SLOs establish the intended performance standards for IT operations and applications within a service.
At SLOconf, Nobl9 announced updates including the following:
Improved calculation precision
Query checker
Metric Health Notifier
Generative AI preview
How Nobl9 Is Advancing SLO
A foundational element of SLO is being able to measure the state of a service.
Kit Merker, chief growth officer at Nobl9, told ITPro Today that the math for SLOs is rich and complex, and there are many ways to get it wrong. He noted that SLOs might seem pretty straightforward to calculate on the surface, but there are a lot of nuances that come up in pragmatic, real-world situations involving running systems, transactions, and outages.
For example, Merker said that some SLOs are based on data that is sparse. That is, the data to calculate an SLO may arrive irregularly and from different sources. Another challenge can be with metrics systems that use constantly increasing counters that occasionally reset, often when the metrics system or the service it's monitoring restarts or is relocated on infrastructure. The resets can result in brief inaccuracies in the SLO.
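The counter-reset problem can be illustrated with a minimal Python sketch. This is not Nobl9's implementation; the function names and the rule "a drop in value means the counter restarted from zero" are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Total increase of an ever-growing counter, tolerating resets.

    Assumption: any drop between consecutive samples means the counter
    restarted from zero, so the new value is the growth since the reset.
    """
    increase = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # Counter went backwards: treat it as a reset to zero.
            increase += curr
    return increase


def good_ratio(good: Sequence[float], total: Sequence[float]) -> Optional[float]:
    """Fraction of good events in the window, or None if there was no traffic."""
    total_increase = counter_increase(total)
    if total_increase == 0:
        return None  # Sparse data: nothing to measure in this window.
    return counter_increase(good) / total_increase


# Hypothetical samples: the service restarts after the third scrape.
total_requests = [100, 150, 200, 20, 70]
good_requests = [99, 148, 197, 19, 68]
ratio = good_ratio(good_requests, total_requests)
```

With these made-up samples, subtracting the first reading from the last gives 70 - 100 = -30 requests, which is nonsensical, while the reset-aware sum gives 170. Events that occur between the last scrape before a restart and the restart itself are still lost, which is one reason a reset can leave a brief inaccuracy in the SLO.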
"We are working to detect and handle these situations in a better way," Merker said.
Nobl9 is also updating its platform with a Metric Health Notifier. Merker noted that outages and downtime are the types of things that SREs and ITOps professionals deal with all the time. "Not only can the services you are running go down, but also the metrics and observability systems you use to measure them can go down," he said. "When your telemetry goes down, your service-level objectives can't be calculated."
That's where the new Metric Health Notifier service comes into play. Merker explained that Nobl9's Metric Health Notifier gives SLO admins and users a way to be notified when their observability and telemetry systems go down, or if service-level indicators stop flowing into Nobl9, for any reason.
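The general idea of detecting stalled telemetry can be sketched as a simple staleness check. This is a hedged illustration, not a description of how Metric Health Notifier works; the function name, the `max_gap` threshold, and the five-minute value are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_sample_at: Optional[datetime],
             now: datetime,
             max_gap: timedelta) -> bool:
    """Return True when no indicator data has arrived within max_gap.

    Assumption: a source that has never reported (None) counts as stale.
    """
    if last_sample_at is None:
        return True
    return now - last_sample_at > max_gap


# Hypothetical usage: alert if an indicator has been silent for over 5 minutes.
now = datetime(2023, 5, 15, 12, 0, tzinfo=timezone.utc)
last_seen = now - timedelta(minutes=10)
should_notify = is_stale(last_seen, now, max_gap=timedelta(minutes=5))
```

A check like this separates "the service is unhealthy" from "the measurements stopped arriving," which are different failures that call for different responses.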
Generative AI Coming to SLO
Like nearly every other sector of enterprise IT, generative AI is coming to SRE and SLOs.
"Generative AI is a new and emerging technology, but we see the potential to use it in various ways," Merker said.
At SLOconf, Nobl9 previewed the slogpt.ai service, providing an early demonstration of how the same technology that enables ChatGPT can help IT operations. Merker said that a user can ask slogpt.ai "is my service reliable?" and, based on the prompt and the settings of the SLO, get a mostly correct answer.
The beta of slogpt.ai uses Google Vertex AI and the preview of PaLM 2 alongside Nobl9 Service Level Analyzer to generate an interactive SLO from a screenshot, he said.
"In the future, generative AI may help us improve query syntax validation, identify anomalies in source data, and help new users better understand how each SLO works and how to improve the reliability of their systems," Merker said.