Real-time Analytics Database
CrateDB is an open-source, multi-model, distributed database that offers high performance, scalability, and flexibility. It ingests, stores, and analyzes large amounts of data in real time, so you can make data-driven decisions and respond quickly to changing trends.
Open source real-time analytics database, all with SQL
Hyper-fast. Queries in milliseconds.
/* Based on IoT device reports, this query returns the voltage variation over time
   for a given meter_id */
WITH avg_voltage_all AS (
    SELECT meter_id,
           avg("Voltage") AS avg_voltage,
           date_bin('1 hour'::INTERVAL, ts, 0) AS time
    FROM iot.power_consumption
    WHERE meter_id = '840072572S'
    GROUP BY 1, 3
    ORDER BY 3
)
SELECT time,
       (avg_voltage - lag(avg_voltage) OVER (PARTITION BY meter_id ORDER BY time)) AS var_voltage
FROM avg_voltage_all
LIMIT 10;
+---------------+-----------------------+
|          time |           var_voltage |
+---------------+-----------------------+
| 1166338800000 |                  NULL |
| 1166479200000 |     -2.30999755859375 |
| 1166529600000 |      4.17999267578125 |
| 1166576400000 |      -0.3699951171875 |
| 1166734800000 |   -3.7100067138671875 |
| 1166785200000 |   -1.5399932861328125 |
| 1166893200000 |    -3.839996337890625 |
| 1166997600000 |                  9.25 |
| 1167044400000 |    0.4499969482421875 |
| 1167174000000 |     3.220001220703125 |
+---------------+-----------------------+
/* Based on IoT device reports, this query returns the voltage corresponding to
   the maximum global active power for each meter_id */
SELECT meter_id,
       max_by("Voltage", "Global_active_power") AS voltage_max_global_power
FROM iot.power_consumption
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
+------------+--------------------------+
| meter_id   | voltage_max_global_power |
+------------+--------------------------+
| 840070437W |                   246.77 |
| 840073628P |                   246.69 |
| 840074265G |                   246.54 |
| 840070238E |                   246.35 |
| 840070335K |                   246.34 |
| 840075190M |                   245.15 |
| 840072876X |                   244.81 |
| 840070636M |                   242.98 |
| 84007B113A |                   242.93 |
| 840073250D |                   242.28 |
+------------+--------------------------+
/* Based on device data, this query returns the average
 * of the battery level for every hour for each device_id
 */
WITH avg_metrics AS (
    SELECT device_id,
           DATE_BIN('1 hour'::INTERVAL, time, 0) AS period,
           AVG(battery_level) AS avg_battery_level
    FROM devices.readings
    GROUP BY 1, 2
    ORDER BY 1, 2
)
SELECT period,
       t.device_id,
       manufacturer,
       avg_battery_level
FROM avg_metrics t, devices.info i
WHERE t.device_id = i.device_id
  AND model = 'mustang'
LIMIT 10;
+---------------+------------+--------------+-------------------+
|        period | device_id  | manufacturer | avg_battery_level |
+---------------+------------+--------------+-------------------+
| 1480802400000 | demo000001 | iobeam       | 49.25757575757576 |
| 1480806000000 | demo000001 | iobeam       |            47.375 |
| 1480802400000 | demo000007 | iobeam       | 25.53030303030303 |
| 1480806000000 | demo000007 | iobeam       |              58.5 |
| 1480802400000 | demo000010 | iobeam       | 34.90909090909091 |
| 1480806000000 | demo000010 | iobeam       |              32.4 |
| 1480802400000 | demo000016 | iobeam       | 36.06060606060606 |
| 1480806000000 | demo000016 | iobeam       |             35.45 |
| 1480802400000 | demo000025 | iobeam       |                12 |
| 1480806000000 | demo000025 | iobeam       |            16.475 |
+---------------+------------+--------------+-------------------+
/* To identify gaps in the readings, the following query generates a series;
 * joining it with the original data reveals any gaps */
WITH avg_battery AS (
    SELECT battery_level, time
    FROM devices.readings
    WHERE device_id = 'demo000007'
      AND time > 1480118400000
      AND time < 1480301200000
    ORDER BY 2
),
all_hours AS (
    SELECT generate_series(1480118430000, 1480301200000, '30 second'::INTERVAL) AS generated_hours
)
SELECT time, generated_hours, battery_level
FROM all_hours
LEFT JOIN avg_battery ON generated_hours = time
ORDER BY 2
LIMIT 20;
+---------------+-----------------+---------------+
|          time | generated_hours | battery_level |
+---------------+-----------------+---------------+
| 1480118430000 |   1480118430000 |            67 |
| 1480118460000 |   1480118460000 |            66 |
| 1480118490000 |   1480118490000 |            66 |
| 1480118520000 |   1480118520000 |            66 |
| 1480118550000 |   1480118550000 |            66 |
| 1480118580000 |   1480118580000 |            66 |
| 1480118610000 |   1480118610000 |            65 |
| 1480118640000 |   1480118640000 |          NULL |
| 1480118670000 |   1480118670000 |            65 |
| 1480118700000 |   1480118700000 |            65 |
| 1480118730000 |   1480118730000 |            65 |
| 1480118760000 |   1480118760000 |            65 |
| 1480118790000 |   1480118790000 |            65 |
| 1480118820000 |   1480118820000 |            65 |
| 1480118850000 |   1480118850000 |            65 |
| 1480118880000 |   1480118880000 |            65 |
| 1480118910000 |   1480118910000 |            65 |
| 1480118940000 |   1480118940000 |            65 |
| 1480118970000 |   1480118970000 |          NULL |
| 1480119000000 |   1480119000000 |          NULL |
+---------------+-----------------+---------------+
/* Based on device data, this query returns the number of battery charges
 * per day for a given device_id */
WITH aux_charging AS (
    SELECT time,
           DATE_BIN('P1D'::INTERVAL, time, 0) AS day,
           battery_status,
           LAG(battery_status) OVER (PARTITION BY device_id ORDER BY time) AS prev_battery_status
    FROM devices.readings
    WHERE device_id = 'demo000001'
    ORDER BY time
),
count_start_charging AS (
    SELECT day,
           (CASE WHEN battery_status <> prev_battery_status THEN 1 ELSE 0 END) AS start_charging
    FROM aux_charging
    ORDER BY 1
)
SELECT SUM(start_charging) AS count_charges,
       day
FROM count_start_charging
GROUP BY 2
ORDER BY 2;
+---------------+---------------+
| count_charges |           day |
+---------------+---------------+
|             2 | 1479168000000 |
|             4 | 1479254400000 |
|             2 | 1479340800000 |
|            10 | 1479427200000 |
|             7 | 1479600000000 |
|             8 | 1479686400000 |
|             6 | 1479772800000 |
|            11 | 1479859200000 |
|             5 | 1480032000000 |
|             7 | 1480118400000 |
|             6 | 1480204800000 |
|            10 | 1480291200000 |
|             3 | 1480464000000 |
|             3 | 1480550400000 |
|             7 | 1480636800000 |
|             2 | 1480723200000 |
+---------------+---------------+
Real-time query performance
CrateDB combines parallel query processing with distributed, in-memory columnar field caches to deliver fast SQL query performance. Faster query execution means data can be retrieved in time for immediate analysis and decision-making.
Horizontal scalability
Scale CrateDB horizontally to handle a continuous influx of massive data volumes from diverse sources. Adding nodes to the cluster expands capacity without interrupting operations, so real-time analytics performance stays consistent as data grows.
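One way to see how data is spread across a cluster is to query CrateDB's system tables. The sketch below assumes a running multi-node cluster; the result depends entirely on your own tables and nodes, so no output is shown.

```sql
-- Nodes currently in the cluster
SELECT name, hostname
FROM sys.nodes
ORDER BY name;

-- Primary shards and record counts per node and table
SELECT node['name'] AS node_name,
       schema_name,
       table_name,
       count(*) AS primary_shards,
       sum(num_docs) AS records
FROM sys.shards
WHERE "primary" = true
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;
```

After a node joins, re-running the second query should show the same shards distributed over more nodes once rebalancing has finished.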
Distributed shared-nothing architecture
Real-time indexing
Data is indexed on ingest and becomes available to queries right away, which supports real-time monitoring, faster decision-making, and greater operational efficiency.
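How quickly newly written rows become visible to search queries can be tuned per table. The following is a minimal sketch, assuming a hypothetical table `demo.events`; its name, schema, and values are illustrative and are not part of the datasets queried above. It relies on the `refresh_interval` table parameter (in milliseconds) and the `REFRESH TABLE` statement.

```sql
-- Hypothetical table; refresh every 500 ms instead of the default interval
CREATE TABLE IF NOT EXISTS demo.events (
    id TEXT PRIMARY KEY,
    ts TIMESTAMP WITH TIME ZONE,
    payload OBJECT(DYNAMIC)
) WITH (refresh_interval = 500);

INSERT INTO demo.events (id, ts, payload)
VALUES ('evt-1', now(), {status = 'ok'});

-- Force a refresh so the row is visible to the next search query
REFRESH TABLE demo.events;

SELECT id, ts, payload['status'] AS status
FROM demo.events
WHERE payload['status'] = 'ok';
```

An explicit `REFRESH TABLE` is usually only needed when a query must see a row written moments earlier; otherwise the periodic refresh is enough.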
Fast data ingestion and processing
Ingest, store, and process millions of data points per second thanks to distributed processing, data partitioning, a multithreaded design, and a shared-nothing distributed architecture with masterless clustering.
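As a rough illustration of the partitioning and sharding mentioned above, the sketch below defines a hypothetical table and loads a few rows in a single statement. The table name `demo.meter_readings`, its columns, the shard count, and the sample values are all assumptions chosen for illustration; they are not the schema behind the queries on this page.

```sql
-- Hypothetical table: rows are routed to shards by meter_id
-- and split into one partition per month
CREATE TABLE IF NOT EXISTS demo.meter_readings (
    ts TIMESTAMP WITH TIME ZONE NOT NULL,
    meter_id TEXT NOT NULL,
    voltage DOUBLE PRECISION,
    -- Generated column used as the partition key
    month TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', ts)
)
PARTITIONED BY (month)
CLUSTERED BY (meter_id) INTO 6 SHARDS;

-- Multi-row insert: several readings in one round trip (made-up values)
INSERT INTO demo.meter_readings (ts, meter_id, voltage)
VALUES ('2024-01-01T00:00:00Z', 'meter-a', 239.8),
       ('2024-01-01T00:00:30Z', 'meter-a', 240.1),
       ('2024-01-01T00:00:00Z', 'meter-b', 241.3);
```

The shard count and partition granularity here are arbitrary; suitable values depend on data volume and cluster size.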
Built-in high availability
Automatic replication and self-healing keep the cluster reliable and running without interruption. Ingested data is immediately available for querying, and complex ad-hoc queries across very large datasets return in milliseconds, providing real-time insights.
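Replication is configured per table. A minimal sketch, assuming a hypothetical table `demo.alerts` on a cluster with at least two nodes (the table and its columns are illustrative only):

```sql
-- Hypothetical table; '0-1' keeps one replica of each shard
-- whenever a second node is available to hold it
CREATE TABLE IF NOT EXISTS demo.alerts (
    id TEXT PRIMARY KEY,
    message TEXT
) WITH (number_of_replicas = '0-1');

-- Change the replica count later without recreating the table
ALTER TABLE demo.alerts SET (number_of_replicas = 1);

-- Check whether any table has missing or under-replicated shards
SELECT table_schema, table_name, health, missing_shards, underreplicated_shards
FROM sys.health
ORDER BY severity DESC;
```

If a node fails, a table with at least one replica should stay queryable while the cluster recreates the lost shard copies on the remaining nodes.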
Columnar storage for data aggregations
Real-time monitoring for smart buildings
Learn how O-CELL's real-time monitoring solution helps reduce the environmental impact of infrastructure with CrateDB.
Real-time analytics for video streaming
Learn how Bitmovin improves the streaming experience with real-time analytics.