Live Stream on Jan 23rd: Unlocking Real Time Insights in the Renewable Energy Sector with CrateDB

Register now
Skip to content
Resources > Academy > Advanced Time Series

Visualizing Time Series Data

Login or sign up for free CrateDB Academy: CrateDB Fundamentals

When working with time series data in CrateDB, visualization is a key aspect that aids in understanding the patterns and trends in your data. One of the most popular tools for this purpose is Grafana, which provides a powerful and intuitive platform for creating and sharing dashboards. In combination with CrateDB's robust handling of time series data, Grafana allows you to create dynamic visualizations that can provide valuable insights, making it easier to analyze and interpret your data. 

To visualize your data in Grafana, you'll first need to set up a data source. Select the PostgreSQL type as CrateDB is compatible with the PostgreSQL wire protocol. Next, enter the necessary parameters for your database. This includes information such as the host, database name, user, and password. This process connects Grafana to your database, allowing it to retrieve and display your data. 

 Now, let’s see how it works in practice: 

  1. Open Grafana and navigate to Connections, then 'Data Sources’. 
  2. This lists all data sources. The connection to CrateDB is established via the PostgreSQL wire protocol.
  3. The connection details contains the Host URL, Database name, Username, and Password. The parameters depend on whether you are connecting to a CrateDB Cloud instance or local instance.
  4. The TLS/SSL Mode is set to verify-full for a CrateDB Cloud cluster.
  5. Finally, we can test the connection to ensure that Grafana can communicate with CrateDB and save the connection configuration. 

Now, we are ready to explore an example dashboard and the queries used to create it. 

This is the example dashboard for the weather data. The first panels tell us how many entries are in the dataset, how many cities are covered, and when the first and last measurements took place. If we go to inspect, query, and move to the query tab we can see which query is used to generate this information. In the first panel, we show the number of measurements as a count of all timestamps and the number of cities as a count of distinct locations. 

The table on the dashboard tells us for each location what was the highest recorded temperature and when this temperature was recorded. Check the query that generates this data. In the SELECT statement, we use the max aggregation function to find the highest temperature and max_by to find a timestamp for this value. The results are also grouped by location. 

The graph next to the table tells us how the interpolated temperature compares to the real values. It is a common problem when dealing with timeseries data to handle missing values. Such occurrences are common, perhaps due to a sensor malfunction or disconnection. To address this, the missing values need to be filled in. CrateDB can employ another useful tool: window functions paired with the IGNORE NULLS feature. Within a Common Table Expression, we utilize window functions to spot the next and prior non-null temperature recordings, and then compute the arithmetic mean to bridge the gap. You can see how the query looks in the query window. 

The next graphs show rolling averages. They are crucial in time-series analysis because they help smooth out short-term fluctuations and reveal underlying trends by averaging data points over a specified period. If you again check the query window for this graph you can see the query for computing rolling averages. Rolling averages are computed for all metrics: temperature, humidity, and wind speed by using the window of the last 10 and 20 readings. 

Last, but not least, in CrateDB you can effectively query geospatial data and this is shown in the last piece of this dashboard. The map shows all weather stations that are within a radius of 1000km from Berlin. Let’s check the query itself that produces the data: to show points on the map we need latitude and longitude information from each point, which is easy to get due to existing scalar functions in CrateDB. In the WHERE clause, we use the distance function to filter only those points that satisfy our condition. The result illustrates four locations.  

With this, we are closing our overview on time series visualization. You have learned how to use Grafana with CrateDB to query, visualize and further analyze your time series data. 

Take this course for free