Monitoring CrateDB with Prometheus and Grafana

Note: This blog post is a bit old, and some sections may be out of date. We have published updated instructions for monitoring CrateDB with Prometheus and Grafana here.

Prometheus is an open-source systems monitoring and alerting toolkit. It is well suited to monitoring machine metrics and service-oriented architectures, and it supports multi-dimensional data collection and querying. Paired with Grafana, the open-source visualization tool that we introduced in this previous post, it lets you build rich monitoring dashboards.

In this blog post, I will show you how to:

  • Run CrateDB, Prometheus, and Grafana with docker-compose
  • Enable JMX monitoring in CrateDB
  • Set up a monitoring dashboard in Grafana, with the option of importing a complete pre-built dashboard

Let's dive into it!

Note: This blog post uses CrateDB 4.5.0, Prometheus 2.26.0 and Grafana 7.5.2.  

Starting Prometheus, Grafana, and CrateDB with Docker (and JMX monitoring)

To run our tools, we will use Docker. We will define the containers with docker-compose, a tool for defining and running multi-container Docker applications from a single file.

Docker Compose ships with Docker Desktop for macOS and Windows; on Linux, you may need to install it separately. If you don't have Docker installed, you can download it here and follow the installation instructions.

Once your installation of Docker is complete, create a working directory and navigate there with your terminal. There, create a docker-compose.yml file with the following content:

version: "3.9"
services:
  cratedb:
    image: "crate"
    volumes:
      - ./crate-jmx-exporter-1.0.0.jar:/jmxdir/crate-jmx-exporter-1.0.0.jar
    ports:
      - "4200:4200"
      - "7071:7071"
    environment:
      CRATE_JAVA_OPTS: "-javaagent:/jmxdir/crate-jmx-exporter-1.0.0.jar=7071 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false"
  prometheus:
    image: "prom/prometheus"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: "grafana/grafana"
    ports:
      - "3000:3000"

(You can create the .yml file with any text editor. On macOS, for example, you can use a terminal editor such as nano and save the file with the .yml extension.)

Now, let's move on to the next step. As we briefly mentioned in the introduction, to scrape CrateDB we're making use of the Java Management Extensions (JMX) and the CrateDB JMX monitoring feature.

To set it up, download the latest JMX monitoring .jar here. Click on /1.0.0; in this blog post, I’m using the file called crate-jmx-exporter-1.0.0.jar. Then, move the .jar file into the working directory you created earlier.

Note: let's take a closer look at what's happening here. With the docker-compose.yml file we defined earlier, we start three containers at the same time (CrateDB, Prometheus, and Grafana) and expose their relevant ports. We also mount the JMX exporter .jar into the CrateDB container with the volumes directive. Then, we use the -javaagent option to enable the JMX exporter and configure it to listen on port 7071. The other arguments in CRATE_JAVA_OPTS are needed to fully enable JMX monitoring.
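
To make the role of port 7071 more concrete, here is a minimal Python sketch of what a scrape amounts to: the exporter serves plain-text metrics over HTTP, one sample per line, and a client reads and parses them. The URL assumes the port mapping from docker-compose.yml and the /metrics path used in prometheus.yml. The helper names `fetch_metrics` and `parse_metrics` are hypothetical, and the parser is deliberately simplified (it assumes no trailing timestamps).

```python
import urllib.request


def fetch_metrics(url="http://localhost:7071/metrics"):
    """Download the raw metrics text from the JMX exporter (assumed URL)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse Prometheus text-format lines into (name_with_labels, value) pairs.

    Simplified sketch: skips comments and blank lines and assumes that
    no sample carries a trailing timestamp.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Everything before the last space is the metric name plus labels
        name, _, value = line.rpartition(" ")
        if not name:
            continue
        samples.append((name, float(value)))
    return samples
```

With the containers running, `parse_metrics(fetch_metrics())` should return the samples that Prometheus collects on every scrape.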

One last configuration item: before we can start Prometheus, it needs a configuration file. Create a new file named prometheus.yml in your working folder and paste the following content into it:

global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
scrape_configs:
- job_name: prometheus
  honor_timestamps: true
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - cratedb:7071

Now, we are ready to start all the containers. Navigate to your working folder with the terminal and run the following command:

docker-compose up

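Wait a few seconds for the containers to start (the command keeps running in the foreground and prints their logs). Once they are up, you can access CrateDB at http://localhost:4200, Prometheus at http://localhost:9090, and Grafana at http://localhost:3000.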
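
If you prefer to check from a script that everything is up, a sketch along these lines can help. The ports come from docker-compose.yml. The paths are assumptions about each service: the CrateDB HTTP root, the exporter's /metrics path, the Prometheus /-/ready endpoint, and the Grafana /api/health endpoint. The names `SERVICE_URLS`, `is_up`, and `check_all` are hypothetical.

```python
import urllib.error
import urllib.request

# Host ports as published in docker-compose.yml; paths are assumed endpoints
SERVICE_URLS = {
    "cratedb": "http://localhost:4200/",
    "jmx-exporter": "http://localhost:7071/metrics",
    "prometheus": "http://localhost:9090/-/ready",
    "grafana": "http://localhost:3000/api/health",
}


def is_up(url, timeout=3):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def check_all(probe=is_up):
    """Probe every service and return a name -> bool mapping."""
    return {name: probe(url) for name, url in SERVICE_URLS.items()}
```

Calling `check_all()` right after `docker-compose up` may report False for a few seconds while the containers are still starting.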

In Prometheus, if you click on “Status -> Targets”, you will see that CrateDB is already set up as an endpoint:
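
The same information is available programmatically. The sketch below reads the Prometheus HTTP API endpoint /api/v1/targets and reduces the response to a scrape-URL-to-health mapping. The base URL assumes the port mapping from docker-compose.yml, and the function names `fetch_targets` and `target_health` are hypothetical.

```python
import json
import urllib.request


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the target list from the Prometheus HTTP API (assumed base URL)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def target_health(payload):
    """Map each active target's scrape URL to its reported health string."""
    targets = payload["data"]["activeTargets"]
    return {target["scrapeUrl"]: target["health"] for target in targets}
```

With the configuration from this post, `target_health(fetch_targets())` should contain a single entry for the CrateDB exporter once the first scrape has succeeded.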

Note: if this is your first time using Grafana, enter “admin” in both the username and password fields. You can set your own credentials on the next screen.

Setting up a pre-built monitoring dashboard in Grafana

Now that we have all our tools ready, let’s set up a dashboard to monitor our cluster in Grafana using Prometheus as the data source.

In Grafana, go to “Configuration -> Data sources”:

Now, click on “Add data source”:

And select “Prometheus”:

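A configuration page will show up. Fill in the data source fields. In particular, the URL must point to the Prometheus container; with the docker-compose.yml above, Grafana can reach it at http://prometheus:9090 (the service name and port from that file).

You can leave all the other fields at their defaults.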

When you’re done, scroll to the end of the page, and click on “Save & Test”. If everything goes well, you’ll see a message saying “Data source is working”.
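
If you would rather script this step than click through the UI, Grafana also exposes an HTTP API for data sources. The following is a minimal sketch, not the method used in this post. It assumes Grafana on localhost:3000, the initial admin/admin credentials (which stop working once you change the password), and that Grafana reaches Prometheus at http://prometheus:9090 over the Compose network. The function names are hypothetical.

```python
import base64
import json
import urllib.request


def datasource_payload(name="Prometheus", url="http://prometheus:9090"):
    """Build the JSON body for a Prometheus data source (assumed values)."""
    return {"name": name, "type": "prometheus", "url": url, "access": "proxy"}


def create_datasource(grafana_url="http://localhost:3000", user="admin", password="admin"):
    """POST the data source to Grafana using HTTP basic authentication."""
    body = json.dumps(datasource_payload()).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as resp:
        return json.load(resp)
```

The "proxy" access mode corresponds to the server-side access option in the UI, where the Grafana backend, not your browser, talks to Prometheus.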

Now, let's set up our dashboard. On the left menu, click on "Create -> Dashboard":

You will see the new dashboard screen. Click on "Add new panel". (Panels are the building blocks of Grafana's dashboards.)

The configuration screen for your new panel will open up. Here, you can define all the elements of your panel, like its name, queries, type of visualization, and so on.

To learn about all the possibilities that Grafana offers, check out their docs. 

You can experiment by building your own panels. However, if you want to speed up the process, you can import a pre-built monitoring dashboard instead.

In order to do so, first download this JSON file.

Then, on the Grafana home page, click on “Create -> Import”:

Now, click on “Upload JSON file”, and select the JSON file you just downloaded.

To finish, click on “Import”.

Voilà! You now have a complete dashboard monitoring CrateDB in real time. Your panels show:

  • Queries per second
  • Average query duration over the last minute
  • Query error rate
  • GC rates
  • Number of shards
  • Circuit breaker memory in use
  • CPU usage (seconds)
  • JVM memory in use
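
Several of these panels, such as queries per second, are rates derived from ever-increasing counters. As a rough illustration of the arithmetic, here is a simplified sketch of a per-second rate between two counter samples. It is an assumption that the pre-built dashboard computes its rates in a comparable way, and the function name `per_second_rate` is hypothetical.

```python
def per_second_rate(earlier, later):
    """Per-second increase between two (unix_seconds, counter_value) samples.

    Simplified sketch: if the counter went down, it is treated as a reset
    (for example after a node restart) and counted from zero.
    """
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    delta = v1 - v0
    if delta < 0:
        # Counter reset: only the value accumulated since the restart counts
        delta = v1
    return delta / (t1 - t0)
```

For example, a counter that grows from 100 to 250 between two scrapes taken 15 seconds apart (the scrape_interval used in prometheus.yml) corresponds to 10 queries per second.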

PS: This dashboard is similar to what we use to monitor real clusters in production. This is what it looks like for one of our customers:
