As the volume of data produced by IT systems keeps growing, a capable metrics tool becomes essential. Traditional monitoring methods, however, sometimes struggle to provide sufficiently granular and responsive visibility. This is where Prometheus stands out as a cornerstone of modern observability.

In this article, we take a technical look at Prometheus, its strengths and its uses. We’ll also look at the advantages of combining it with Canopsis, our open-source hypervision solution.

Solution overview

Prometheus is an open-source monitoring solution focused on time-series metrics. Started in 2012 at SoundCloud, it subsequently became a key component of the cloud-native ecosystem. In 2016, the project joined the Cloud Native Computing Foundation (CNCF).

Its main architectural principles are:

Prometheus relies on a multitude of exporters to capture and monitor metrics from each layer:

In most cases, it also integrates with Grafana to visualize and build interactive dashboards for IT monitoring:

Prometheus technical and functional benefits

Here’s what Prometheus brings to the table, over and above its core monitoring capabilities.

Technical advantages

Labels allow metrics to be filtered, aggregated and broken down along various axes (instance, version, region, service type). This is very useful for distributed infrastructures!
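To make the role of labels concrete, here is a minimal Python sketch of aggregating labelled samples along one or more axes. It only mimics the idea behind a PromQL expression such as `sum by (region) (...)`; the sample data, the label values and the `sum_by` helper are all assumptions made for illustration, not part of Prometheus itself.

```python
from collections import defaultdict

# Hypothetical samples: each one is a set of labels plus a value,
# loosely mirroring how Prometheus identifies a time series.
samples = [
    ({"service": "api", "region": "eu", "instance": "api-1"}, 120.0),
    ({"service": "api", "region": "eu", "instance": "api-2"}, 80.0),
    ({"service": "api", "region": "us", "instance": "api-3"}, 50.0),
    ({"service": "web", "region": "eu", "instance": "web-1"}, 30.0),
]


def sum_by(samples, *label_names):
    """Sum sample values, grouped by the given label names."""
    totals = defaultdict(float)
    for labels, value in samples:
        # The grouping key keeps only the requested labels.
        key = tuple(labels.get(name, "") for name in label_names)
        totals[key] += value
    return dict(totals)


by_region = sum_by(samples, "region")
by_service_region = sum_by(samples, "service", "region")
print(by_region)
print(by_service_region)
```

The same samples can be broken down by region, by service, or by both, without changing how they were collected. That is the flexibility the label model is meant to provide.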

Prometheus is lightweight by nature. However, it can be scaled up via federated architectures, clusters, remote write or third-party ingestion components. Analysts, operators and engineers can therefore distribute loads according to geographical zones or functions.

Thanks to the combination of fast local storage, fine-tuned alert rules and low collection latency, Prometheus provides early, reliable alerts.
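One way to picture a "fine-tuned" alert rule is a threshold combined with a hold duration, similar in spirit to the `for` clause of a Prometheus alerting rule: the alert stays pending until the condition has been true long enough, which filters out short spikes. The sketch below is a simplified model under that assumption, not the Prometheus rule engine; the function name, the threshold and the sample values are hypothetical.

```python
def alert_states(samples, threshold, hold_seconds):
    """Return (timestamp, state) pairs for a threshold rule with a hold duration.

    samples: list of (timestamp_seconds, value), sorted by timestamp.
    """
    states = []
    active_since = None  # timestamp at which the condition first became true
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts
            # Fire only once the condition has held for the whole duration.
            state = "firing" if ts - active_since >= hold_seconds else "pending"
        else:
            # Any sample below the threshold resets the rule.
            active_since = None
            state = "inactive"
        states.append((ts, state))
    return states


# Hypothetical CPU usage ratio, sampled every 15 seconds.
cpu = [(0, 0.50), (15, 0.95), (30, 0.97), (45, 0.60),
       (60, 0.92), (75, 0.93), (90, 0.96)]
result = alert_states(cpu, threshold=0.90, hold_seconds=30)
for ts, state in result:
    print(ts, state)
```

In this run the first spike (15 s to 30 s) never fires because it ends before the 30-second hold elapses, while the second one fires at 90 s. A short scrape interval is what keeps that delay small.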

Many platforms integrate with Prometheus to ingest or export metrics. For example, Elastic offers a Prometheus integration that collects metrics via exporters, remote write or PromQL queries.

Functional advantages

For mission-critical operations, being able to observe CPU load, latency and application or infrastructure errors in near-real time is a strategic advantage!

Dashboards (Grafana or others) can also be used to analyze trends, forecast peak loads, detect anomalies, manage capacity or model performance for SLAs.

Open source means no license fees. The cost lies in sizing (storage, instances), and the tool's flexibility makes it possible to optimize that usage.

What Prometheus can do in combination with Canopsis

Now that we’ve seen what Prometheus is and where its strengths lie, let’s look at what we can achieve by combining it with Canopsis, our open-source hypervision solution.

Synergy between events and metrics

Canopsis aggregates events from monitoring tools, repositories, logs, CMDBs and more. Prometheus provides continuous metrics: latency, error rates and resource utilization.

By combining the two, we can not only detect alerts via Canopsis, but also have access to the metric context that generated them, enabling faster diagnosis.
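As an illustration of retrieving that metric context, the sketch below builds a range-query URL for the Prometheus HTTP API (`/api/v1/query_range`) from the labels carried by an alarm. The server address, the `node_load1` metric, the label values and the helper names are assumptions chosen for the example; how Canopsis actually links an alarm to its metrics is not described here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def escape_label_value(value):
    """Escape backslashes and double quotes for a PromQL string literal.

    Newlines are not handled in this simplified sketch.
    """
    return value.replace("\\", "\\\\").replace('"', '\\"')


def metric_context_url(base_url, metric, labels, start, end, step="30s"):
    """Build a Prometheus range-query URL for the series behind an alarm."""
    # Produces a selector such as metric{instance="...",job="..."}.
    matchers = ",".join(
        '{}="{}"'.format(name, escape_label_value(value))
        for name, value in sorted(labels.items())
    )
    query = "{}{{{}}}".format(metric, matchers)
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return "{}/api/v1/query_range?{}".format(base_url.rstrip("/"), params)


# Hypothetical alarm labels and a one-hour window (Unix timestamps).
alarm_labels = {"job": "node", "instance": "db-1:9100"}
url = metric_context_url(
    "http://prometheus.example:9090", "node_load1", alarm_labels,
    start=1700000000, end=1700003600,
)
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded_query)
```

Fetching this URL would return the series for the hour around the alarm, which an operator or a dashboard could then display next to the event.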

Prometheus connector for Canopsis

Prometheus, coupled with Alertmanager, generates alarms that are received by the dedicated Canopsis connector. The connector takes the form of an Alertmanager webhook receiver: the webhooks it receives are transformed into Canopsis events.
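The transformation step can be sketched as follows. The input follows the general shape of an Alertmanager webhook payload (a `receiver` plus a list of `alerts` with `status`, `labels` and `annotations`). The output fields and the severity-to-state mapping are assumptions for illustration, and the real Canopsis connector may map things differently.

```python
# Assumed mapping from a Prometheus "severity" label to a Canopsis-style state
# (0 = OK, 1 = minor, 2 = major, 3 = critical). Illustrative only.
SEVERITY_TO_STATE = {"info": 1, "warning": 2, "critical": 3}


def webhook_to_events(payload):
    """Turn one Alertmanager webhook payload into a list of event dicts."""
    events = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        if alert.get("status") == "resolved":
            state = 0  # a resolved alert closes the alarm
        else:
            # Unknown or missing severity defaults to "major" in this sketch.
            state = SEVERITY_TO_STATE.get(labels.get("severity"), 2)
        events.append({
            "connector": "prometheus",
            "connector_name": payload.get("receiver", "alertmanager"),
            "event_type": "check",
            "source_type": "resource",
            "component": labels.get("instance", "unknown"),
            "resource": labels.get("alertname", "unknown"),
            "state": state,
            "output": annotations.get("summary", ""),
        })
    return events


# Hypothetical payload with one firing and one resolved alert.
payload = {
    "version": "4",
    "status": "firing",
    "receiver": "canopsis",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighCPU", "instance": "db-1:9100",
                       "severity": "critical"},
            "annotations": {"summary": "CPU above 90% for 5 minutes"},
        },
        {
            "status": "resolved",
            "labels": {"alertname": "DiskFull", "instance": "db-2:9100",
                       "severity": "warning"},
            "annotations": {"summary": "Disk usage back to normal"},
        },
    ],
}
events = webhook_to_events(payload)
for event in events:
    print(event["component"], event["resource"], event["state"])
```

Each alert in the webhook becomes one event, so a grouped notification from Alertmanager can open or close several alarms at once on the hypervision side.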

Real-life use cases

Benefits for Canopsis users

To conclude on Prometheus monitoring… and Canopsis!

Prometheus is therefore an extremely powerful monitoring tool for collecting, storing and querying high-frequency metrics via exporters. With Alertmanager, its alarms can be forwarded to a hypervision tool like Canopsis, which reduces the “noise” and keeps users from drowning in irrelevant alarms.

From Prometheus metrics to Canopsis service weather: the Prometheus connector helps to calculate service weather business indicators, by combining the metrics received with other sources of information.

If you use Canopsis, Prometheus integration is a natural step for:

Want to find out more about what Prometheus and Canopsis can do for your information system? Come and talk to our open-source experts!