The Guide for Time Series Data Projects is out.

Download now
Skip to content
Product > Data models

Time Series Data

CrateDB allows to query complex time series data in milliseconds, while leveraging the simplicity of SQL. You can easily enrich your time series data with other types of data, handle high ingest rates, and store years of history.
 

Time series data querying with SQL

Hyper-fast. Results in milliseconds.

 

        

/* Based on device data, this query returns the average
 * of the battery level for every hour for each device_id
 */
WITH avg_metrics AS (
    SELECT device_id,
       DATE_BIN('1 hour'::INTERVAL, time, 0) AS period,
       AVG(battery_level) AS avg_battery_level
    FROM devices.readings
    GROUP BY 1, 2 
    ORDER BY 1, 2
)
SELECT period,
       t.device_id,
       manufacturer,
       avg_battery_level  
FROM avg_metrics t, devices.info i
WHERE t.device_id = i.device_id 
      AND model = 'mustang'
LIMIT 10;
        

+---------------+------------+--------------+-------------------+
|    period     |  device_id | manufacturer | avg_battery_level |
+---------------+------------+--------------+-------------------+
| 1480802400000 | demo000001 |    iobeam    | 49.25757575757576 |
| 1480806000000 | demo000001 |    iobeam    | 47.375            |
| 1480802400000 | demo000007 |    iobeam    | 25.53030303030303 |
| 1480806000000 | demo000007 |    iobeam    | 58.5              |
| 1480802400000 | demo000010 |    iobeam    | 34.90909090909091 |
| 1480806000000 | demo000010 |    iobeam    | 32.4              |
| 1480802400000 | demo000016 |    iobeam    | 36.06060606060606 |
| 1480806000000 | demo000016 |    iobeam    | 35.45             |
| 1480802400000 | demo000025 |    iobeam    | 12                |
| 1480806000000 | demo000025 |    iobeam    | 16.475            |
+---------------+------------+--------------+-------------------+
        

/* To identify gaps on the readings, the following queries generates a series
 * and by joining it with the original data, you can spot any gap */
with avg_battery AS (
  SELECT battery_level, time 
  FROM devices.readings
  WHERE device_id = 'demo000007' 
  AND time > 1480118400000 
  AND time < 1480301200000
  ORDER BY 2
),

all_hours AS (
  SELECT generate_series(1480118430000,1480301200000,'30 second'::interval) AS generated_hours
)

SELECT time, generated_hours, battery_level
FROM all_hours 
LEFT JOIN avg_battery ON generated_hours = time
ORDER BY 2
LIMIT 20;
        

+---------------+---------------+---------------+
|          time |         hours | battery_level |
+---------------+---------------+---------------+
| 1480118430000 | 1480118430000 |            67 |
| 1480118460000 | 1480118460000 |            66 |
| 1480118490000 | 1480118490000 |            66 |
| 1480118520000 | 1480118520000 |            66 |
| 1480118550000 | 1480118550000 |            66 |
| 1480118580000 | 1480118580000 |            66 |
| 1480118610000 | 1480118610000 |            65 |
| 1480118640000 | 1480118640000 |          NULL |
| 1480118670000 | 1480118670000 |            65 |
| 1480118700000 | 1480118700000 |            65 |
| 1480118730000 | 1480118730000 |            65 |
| 1480118760000 | 1480118760000 |            65 |
| 1480118790000 | 1480118790000 |            65 |
| 1480118820000 | 1480118820000 |            65 |
| 1480118850000 | 1480118850000 |            65 |
| 1480118880000 | 1480118880000 |            65 |
| 1480118910000 | 1480118910000 |            65 |
| 1480118940000 | 1480118940000 |            65 |
| 1480118970000 | 1480118970000 |          NULL |
| 1480119000000 | 1480119000000 |          NULL |
+---------------+---------------+---------------+
        

/* Based on device data, this query returns the number of battery charges 
 * per day for a given device_id */
WITH aux_charging AS (
    SELECT time, 
      DATE_BIN('P1D'::INTERVAL,time,0) AS day, 
      battery_status, 
      LAG(battery_status) OVER (PARTITION BY device_id ORDER BY time) AS prev_battery_status
    FROM devices.readings
    WHERE device_id = 'demo000001'
    ORDER BY time
  ),

count_start_charging AS (
    SELECT day, (case when battery_status <> prev_battery_status then 1 else 0 end) AS start_charging
    FROM aux_charging
    ORDER BY 1
  )

SELECT day, sum(start_charging) as charges_number
FROM count_start_charging
GROUP BY 1
ORDER BY 1;
        

+---------------+---------------+
| count_charges |           day |
+---------------+---------------+
|             2 | 1479168000000 |
|             4 | 1479254400000 |
|             2 | 1479340800000 |
|            10 | 1479427200000 |
|             7 | 1479600000000 |
|             8 | 1479686400000 |
|             6 | 1479772800000 |
|            11 | 1479859200000 |
|             5 | 1480032000000 |
|             7 | 1480118400000 |
|             6 | 1480204800000 |
|            10 | 1480291200000 |
|             3 | 1480464000000 |
|             3 | 1480550400000 |
|             7 | 1480636800000 |
|             2 | 1480723200000 |
+---------------+---------------+
        

/* Based on device data, this query returns the average of the battery temperature
 * for each OS version */
SELECT device_info['os_name'], avg(battery_temperature)
FROM "devices"."readings"
GROUP BY 1
LIMIT 100;
        
 
+---------+--------------------------+
| os_name | avg(battery_temperature) |
+---------+--------------------------+
| 4.4.4   |        90.85937893039049 |
| 5.1.0   |        90.86754559738132 |
| 6.0.1   |        90.84230101265824 |
| 5.0.0   |        90.8574802739726  |
+---------+--------------------------+

Columnar storage

CrateDB can easily accommodate hundreds of columns in a single table. It uses a columnar storage format, which is highly efficient for time series data storage. Columnar data storage allows for faster query performance, especially when aggregating data over long periods or subsets of the data.

High cardinality

CrateDB offers robust support for time partitioning. It enables data to be stored long-term without any need for aggregation or down-sampling. This is crucial because the original, un-aggregated data often contains granular details that may be lost during aggregation. By preserving this level of detail, you get enhanced flexibility to revisit historical data for new insights, or conduct precise forecasting, which is crucial to strategic decision-making processes.

With CrateDB, you can also move your old partitions to slow but cheap spinning disks, while keeping the most recent data on fast SSDs, all while retaining fast query speed for most recent data, and not loosing any details in older data.

Time series functionality

CrateDB includes built-in time-series functionality, such as window functions and time-based indexes. These features make it easier to query and analyze the data, and can improve query performance.

  • LEAD and LAG functions, with IGNORE NULLS option, to fill and extrapolate missing data.
  • DATE_BIN function to resample the data and use the same intervals on the time axis.
  • WINDOW function.
  • JOIN operator to easily combine time series data in one table and corresponding metadata in another table. This avoids pushing too much data in your application.

Much more than a time series database

CrateDB offers a much broader scope; it is built for multiple types of data and you can combine them all into the same database: time series, JSON, vector, full-text, geospatial, BLOB and relational. This way, you can easily cover most of your needs, without investing in some new costly technology that needs complex maintenance and data synchronization.

SQL interface

CrateDB is a database supporting native SQL, which is the most familiar and powerful query language for many developers. This makes it easy to query and analyze without needing to learn a new query language or tool. Getting instant analytics is very straightforward. CrateDB also provides a HTTP endpoint to submit SQL queries.

Integrations

CrateDB integrates seamlessly with popular modern data visualization tools like Grafana, or libraries like Matplotlib. It also offers compatibility with the Java- and Python-based data ecosystem and corresponding libraries and frameworks such as pandas, Dask, or Spark, to facilitate efficient analysis and visualization of time series data. 

View a sample list of integrations >

Distributed architecture

CrateDB is a distributed database that can scale horizontally across multiple nodes. This makes it an ideal fit for time series workloads, which often involve handling large volumes of data from multiple different sources (sensors, IoT gateways, CRM, ERP...) that need to be ingested, enriched and processed on the fly to serve many simultaneous data consumers in real-time.

Open source

CrateDB open source licensing model leverages the power of an active community and brings your licensing costs down. Whether you want complete peace of mind with the SaaS model or deploy the product yourself, we have the right option for you if you decide to go for a fast and scalable open source time series database.

Time Series Video Tutorials

Tutorial #1

Time series data is everywhere

In this video, we explain what time series data is, how it is ordered and how it reveals patterns such as trends and irregularities. Watch to understand how CrateDB enables real-time complex queries, which is crucial for time series data analysis. 

Tutorial #2

Time series data in CrateDB

In this video, we explore how to query time series data with CrateDB and get results on large datasets in milliseconds, how to leverage the simplicity of SQL, combine multiple data types, handle high ingest rates, and store years of historical data.

Tutorial #3

Real-life challenges with time series

In this video, we explore the real-life challenges of time series data: volume, velocity, variety, and veracity. We also discuss how CrateDB effectively addresses these challenges with the proper data architecture.

Tutorial #4

Time series data modeling

In this video, we talk about time series data modeling. We see that CrateDB is not just a time series database, it also supports JSON, relational, geospatial, vector, and full-text data. Then, we showcase the use of standard SQL and how to query different data types.

Tutorial #5

Sharding and partitioning for time series

In this video, we demonstrate how partitioning subdivides time series data into smaller pieces, improving read and write performance. We also discuss sharding, a strategy that distributes data horizontally across multiple nodes to enhance the speed of queries and updates.

Guide for Time Series Data Projects

This comprehensive guide covers the different key aspects of time series data projects. It is divided in 3 distinct white papers. Part 1: data modeling, storage, lifecycle; Part 2: ingestion, indexing, analysis, and optimization; Part 3: visualization, and advanced analysis through machine learning.

Demo: Querying Heterogeneous Time-Series Data with SQL

In this video, discover how to effortlessly create a table and seamlessly import weather data into CrateDB. Witness the power of CrateDB's time-series query capabilities in action with the weather dataset, showcasing the dynamic schema flexibility. Dive deeper into CrateDB's multi-modal features with demonstrations on handling JSON, geospatial data, and conducting full-text searches.

Top 8 Most Asked Questions

Discover some of the common questions around time-series databases, including their advantages over traditional databases, best practices for managing them, and the industries that can benefit the most from their adoption. This white paper contains 8 essential things you need to know about time-series databases. 

How to Get the Most Out of Your Time Series Data

Additional time series topics

Documentation and tutorials

Successful companies using CrateDB as a modern time series database

"CrateDB is a highly scalable database for time series and event data with a very fast query engine using standard SQL."
Alexander Mann Digital Core TGW Logistics
cr-quote-img-white
"Thanks to CrateDB's great indexing, dedicated data types, and subsequent great performance, we could execute an event and data-driven architecture, with the performance and scalability necessary for storing time-series data over time."
Kristoffer Axelsson Principal Solution Architect Thomas Concrete Group Learn more
cr-quote-img-white
"We use CrateDB to process a lot of system metrics data in real time and to support streaming log analytics, which required search capability. I'm glad we can make queries with SQL. If we had to use Elasticsearch to answer new questions, we wouldn't be nearly as responsive to new requirements."
Joe Hacobian Infrastructure Engineer Digital Domain Learn more
cr-quote-img-white

101 for Time-Series databases

  • What is a time series database?
  • Key criteria for selecting a time series database.