Monitoring CrateDB Cloud clusters

Matej · March 7, 2023, 9:34am

This tutorial demonstrates how you can monitor your CrateDB Cloud cluster using the exposed Prometheus metrics.

Prometheus scrapes the API endpoint that exposes the metrics, and the visualization tool Grafana displays them. The endpoint returns the metrics of all the clusters in the specified organization.

Prerequisites

Both Prometheus and Grafana run as Docker containers in this tutorial, so you need Docker installed on your system.

Cluster Deployment

The first step is to sign up in the Cloud Console if you haven’t done so yet. After that, you can deploy your cluster.

Prometheus

Prometheus is used to scrape the CrateDB Cloud API endpoints for available metrics and serve as a data source for Grafana.

First, save the following configuration as a .yml file (for example, prometheus.yml) on your system:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# here it's the CrateDB Cloud metrics API.
scrape_configs:
  - job_name: "cratedb"
    metrics_path: '/api/v2/organizations/{{ORGID}}/metrics/prometheus/'
    basic_auth:
      username: '{{APIKEY}}'
      password: '{{SECRET}}'
    static_configs:
      - targets: ["console.cratedb.cloud"]

Substitute {{ORGID}} with the ID of your organization. You can find it on the Settings page in the CrateDB Cloud Console.

Your API credentials can be found on the Account page. Make sure to store your API secret securely, as it’s shown only once when you create your API key.
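If you prefer to script the substitution, the following is a minimal sketch, assuming the configuration above is kept as a template with the `{{ORGID}}`, `{{APIKEY}}`, and `{{SECRET}}` placeholders intact. The function name `render_config` and the example values are hypothetical, not part of CrateDB Cloud or Prometheus.

```python
import re

# Matches placeholders of the form {{NAME}} as used in the configuration above.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_config(template: str, values: dict) -> str:
    """Replace every {{NAME}} placeholder in the template with values[NAME].

    Raises KeyError if a placeholder has no value, so a half-filled
    configuration is never produced silently.
    """
    def substitute(match):
        return values[match.group(1)]

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values for illustration only; use your own organization ID
# and API credentials.
template = (
    "metrics_path: '/api/v2/organizations/{{ORGID}}/metrics/prometheus/'\n"
    "basic_auth:\n"
    "  username: '{{APIKEY}}'\n"
    "  password: '{{SECRET}}'\n"
)
rendered = render_config(
    template,
    {"ORGID": "my-org-id", "APIKEY": "my-api-key", "SECRET": "my-api-secret"},
)
print(rendered)
```

Since the rendered file contains your API secret, keep it out of version control.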

Once you have added your Organization ID and API credentials, execute the following command to create a Prometheus instance:

docker run -d --name prometheus -v /Users/crate/prometheus.yml:/etc/prometheus/prometheus.yml -p 9090:9090 prom/prometheus

It’s important to use the full (absolute) path to the .yml file in the docker run command.
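As a small illustration of the full-path requirement, this sketch builds the same docker run arguments from a possibly relative file name. The helper name `prometheus_run_command` is hypothetical, and the sketch only prints the command instead of running it.

```python
import os
import shlex


def prometheus_run_command(config_path: str) -> list:
    """Build the docker run arguments, resolving config_path to a full path."""
    full_path = os.path.abspath(config_path)
    return [
        "docker", "run", "-d", "--name", "prometheus",
        "-v", f"{full_path}:/etc/prometheus/prometheus.yml",
        "-p", "9090:9090",
        "prom/prometheus",
    ]


# Resolves "prometheus.yml" against the current working directory.
command = prometheus_run_command("prometheus.yml")
print(shlex.join(command))
```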

This will start the Prometheus instance exposed on port 9090. You can verify it’s running correctly by visiting http://localhost:9090/. Open the Status -> Targets page in the top menu.

There should be an endpoint with your Organization ID with state UP. This means that Prometheus is able to connect to the API and is scraping the available metrics.
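The same check can be done without the browser, because Prometheus also serves the target list as JSON under `/api/v1/targets`. Below is a minimal sketch, assuming Prometheus is reachable on http://localhost:9090; the helper names and the sample payload (including the made-up organization ID) are illustrative only.

```python
import json
import urllib.request


def target_health(payload: dict, job: str) -> dict:
    """Map each active target's scrape URL to its health for the given job."""
    return {
        target["scrapeUrl"]: target["health"]
        for target in payload["data"]["activeTargets"]
        if target["labels"].get("job") == job
    }


def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Fetch the target list from a running Prometheus instance."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets") as response:
        return json.load(response)


# Trimmed sample of the response shape, so the parsing can be tried offline.
# The organization ID in the URL is a made-up placeholder.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"instance": "console.cratedb.cloud", "job": "cratedb"},
                "scrapeUrl": "http://console.cratedb.cloud/api/v2/organizations/my-org-id/metrics/prometheus/",
                "health": "up",
            }
        ]
    },
}
print(target_health(sample, "cratedb"))
```

Against a live instance, `target_health(fetch_targets(), "cratedb")` should report "up" for the CrateDB Cloud endpoint once the first scrape has succeeded.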

Available Metrics

Most metric semantics are self-explanatory. This list is not exhaustive, and new metrics can be added at any point in the future. All metrics are per node.

Metric	Type	Description
container_cpu_usage_seconds_total	Counter	CrateDB CPU usage, in seconds.
container_fs_reads_bytes_total	Counter	Number of bytes read per disk
container_fs_writes_bytes_total	Counter	Number of bytes written per disk
container_memory_usage_bytes	Gauge	Memory usage
container_network_receive_bytes_total	Counter	Network ingress traffic
container_network_transmit_bytes_total	Counter	Network egress traffic
crate_circuitbreakers	Gauge	Circuit breaker stats for crate per breaker
crate_cluster_state_version	Gauge	Info about the cluster’s state
crate_connections	Gauge	Number of connections per protocol
crate_node	Gauge	Shard statistics
crate_query_failed_count	Counter	Number of failed queries per type (e.g. Insert/Select/Update/…)
crate_query_sum_of_durations_millis	Counter	Sum of the durations of all queries per query type
crate_query_total_count	Counter	Total number of queries per type
crate_ready	Gauge	An indicator if this CrateDB node is up-and-running
crate_threadpools	Gauge	Thread pool statistics, per pool
jvm_*	Gauge	Various JVM statistics
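Counters only ever grow (and start again from zero when a node restarts), so they are usually read as differences between two scrapes rather than as absolute values. The sketch below illustrates this for the query counters; the function names and sample numbers are made up, and the reset handling is a simplification of what Prometheus does internally.

```python
from typing import Optional


def counter_increase(previous: float, current: float) -> float:
    """Increase of a counter between two samples.

    A counter that went down is treated as having restarted from zero.
    """
    return current if current < previous else current - previous


def average_query_duration_ms(
    durations_prev: float,
    durations_now: float,
    count_prev: float,
    count_now: float,
) -> Optional[float]:
    """Average duration of the queries executed between two scrapes.

    Uses the increase of crate_query_sum_of_durations_millis divided by
    the increase of crate_query_total_count. Returns None if no query ran.
    """
    queries = counter_increase(count_prev, count_now)
    if queries == 0:
        return None
    return counter_increase(durations_prev, durations_now) / queries


# Made-up samples taken 15 seconds apart (the scrape interval configured above).
average_ms = average_query_duration_ms(12000, 13500, 400, 430)
queries_per_second = counter_increase(400, 430) / 15
print(average_ms, queries_per_second)
```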

Grafana

Grafana doesn’t need any special configuration. You can run it either in a Docker container or as a local installation; it doesn’t matter for this use case. Follow the Grafana documentation and use your preferred method.

We used the Docker image to run Grafana:

docker run -d -p 3000:3000 grafana/grafana-oss

By default, Grafana is exposed on port 3000. Go to http://localhost:3000/ to access it.

Data source

Now you can add Prometheus as a data source in Grafana under Configuration -> Data sources. Choose Prometheus, use http://host.docker.internal:9090/ as the URL, and leave the rest as default.
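The data source can also be created through Grafana's HTTP API instead of the UI. This is a rough sketch, assuming a fresh Grafana container that still uses the default admin/admin login and accepts basic authentication on `/api/datasources`; the helper names are hypothetical, and nothing is sent unless you call `create_datasource` yourself.

```python
import base64
import json
import urllib.request


def datasource_payload(prometheus_url: str) -> dict:
    """Request body describing a Prometheus data source proxied by Grafana."""
    return {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",
    }


def create_datasource(grafana_url: str, user: str, password: str, payload: dict) -> int:
    """POST the payload to Grafana and return the HTTP status code."""
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


payload = datasource_payload("http://host.docker.internal:9090/")
print(json.dumps(payload, indent=2))
# To apply it, assuming the default login has not been changed:
# create_datasource("http://localhost:3000", "admin", "admin", payload)
```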

Dashboard

All that’s left is to create a dashboard or import the one that we prepared for you. Save the dashboard snippet as a .json file and import it under Dashboards -> New -> Import: click “Upload JSON file” and choose the file. The dashboard will be called “CrateDB Cluster Monitoring”.

The dashboard displays the following metrics. The values are aggregated from all the running clusters in your organization:

Global stats:
- Number of nodes
Clusters stats:
- Type and number of open connections to your clusters
- SELECT queries per second
- INSERT queries per second
- CPU usage (Cores)
- Memory usage
- File system writes
- File system reads
Query stats:
- Error rate along with the type of failed query
- Average query duration along with the type of query
- Queries per second along with the type of query
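If you build your own panels instead of importing the dashboard, expressions along the following lines are a reasonable starting point. They are a sketch based only on the metric names listed above, not the queries used in the prepared dashboard; the 5-minute window and the dictionary keys are assumptions. The helper builds a URL for Prometheus' `/api/v1/query` endpoint so an expression can be tried outside Grafana.

```python
from urllib.parse import urlencode

# Hypothetical PromQL expressions, aggregated over all nodes and clusters.
PANEL_QUERIES = {
    "queries_per_second": "sum(rate(crate_query_total_count[5m]))",
    "failed_queries_per_second": "sum(rate(crate_query_failed_count[5m]))",
    "average_query_duration_ms": (
        "sum(rate(crate_query_sum_of_durations_millis[5m]))"
        " / sum(rate(crate_query_total_count[5m]))"
    ),
    "cpu_cores": "sum(rate(container_cpu_usage_seconds_total[5m]))",
    "memory_bytes": "sum(container_memory_usage_bytes)",
}


def query_url(expression: str, base_url: str = "http://localhost:9090") -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expression})}"


for name, expression in PANEL_QUERIES.items():
    print(name, query_url(expression))
```

In Grafana itself, the expression string alone goes into the panel's query field with the Prometheus data source selected.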
