
IoT Database for Real-Time Analytics at Scale

Ingest, store, and analyze massive volumes of IoT data in real time using distributed SQL, without sacrificing flexibility or performance.

CrateDB is a distributed IoT database built for real-time analytics on high-volume, high-cardinality device data. It enables teams to query fresh and historical IoT data using SQL, at scale, without pre-aggregation or complex data pipelines.

What Is an IoT Database?

An IoT database is a database system built to handle continuous data streams generated by connected devices. Unlike traditional databases, it is optimized for workloads where data arrives constantly, queries must run on fresh data, and analytics must scale across millions of devices and dimensions.

An IoT database typically supports:

  • Continuous, high-throughput data ingestion

  • Time-series and event-based data models

  • High-cardinality attributes such as device IDs, locations, and firmware versions

  • Real-time and historical analytics in the same system

IoT databases power operational dashboards, monitoring platforms, and data-driven applications where insight must be available seconds after data is produced.


Why Traditional Databases Are Not Designed for IoT

IoT workloads introduce characteristics that many general-purpose databases were not built to handle.

  • Transactional databases prioritize individual record updates, not sustained ingestion and large analytical scans.

  • Time-series databases focus on append-only metrics and often struggle with high-cardinality, multi-dimensional queries.

  • Data warehouses rely on batch ingestion pipelines that introduce latency and operational overhead.

As IoT systems scale, these limitations make it difficult to analyze fresh device data quickly and consistently, especially as analytical requirements evolve.

An IoT database differs from time-series databases and data warehouses by combining high-throughput ingestion with real-time, SQL-based analytics in a single system.


Core Requirements of an IoT Database

A production-grade IoT database must support the full lifecycle of IoT analytics without constant tuning or architectural workarounds.

Key requirements include:

  • Continuous high-throughput data ingestion

  • Real-time queries on fresh and historical data

  • Efficient handling of high-cardinality dimensions

  • Horizontal scalability as device fleets grow

  • SQL-based analytics for flexibility and accessibility

  • Built-in resilience and fault tolerance

Meeting these requirements in a single system is critical as IoT analytics moves closer to operational and customer-facing applications.

IoT database: reference architecture

 


High-Velocity IoT Ingestion with Immediate Query Access

IoT systems generate continuous streams of sensor values, logs, and device events. An IoT database must ingest this data at high speed and make it available for querying immediately.

CrateDB captures IoT data as it arrives and indexes it automatically, enabling real-time queries without manual tuning or batch processing. Data remains query-ready from the moment it is ingested, allowing applications and dashboards to operate on live data with minimal latency.

This ingestion model supports modern IoT architectures built on streaming platforms, connectors, and direct application integrations.
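As a rough illustration of high-throughput ingestion, the sketch below batches several readings into one request using the `bulk_args` field of CrateDB's HTTP `/_sql` endpoint. The `iot_data` table, its columns, the helper names, and the `localhost:4200` address are assumptions for illustration only.

```python
import json
import urllib.request

CRATEDB_URL = "http://localhost:4200/_sql"  # assumed local node

INSERT_STMT = (
    "INSERT INTO iot_data (device_id, temperature, humidity) "
    "VALUES (?, ?, ?)"
)


def build_bulk_insert(readings):
    """Turn a list of reading dicts into one bulk request payload."""
    rows = [
        [r["device_id"], r["temperature"], r["humidity"]]
        for r in readings
    ]
    return {"stmt": INSERT_STMT, "bulk_args": rows}


def send(payload, url=CRATEDB_URL):
    """POST the payload as JSON; requires a reachable CrateDB node."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


readings = [
    {"device_id": "device-1", "temperature": 21.4, "humidity": 55.0},
    {"device_id": "device-2", "temperature": 22.9, "humidity": 51.2},
]
payload = build_bulk_insert(readings)
print(json.dumps(payload, indent=2))
# send(payload)  # uncomment when a CrateDB node is running
```

Sending one statement with many parameter rows avoids a round trip per reading, which is usually the first thing to try when a single-row insert loop becomes the bottleneck.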


Distributed SQL for Real-Time and Historical IoT Analytics

IoT analytics rarely stops at simple metrics. Teams need to combine time-series data with device metadata, customer context, and operational attributes.

CrateDB distributes SQL queries across the cluster, enabling fast analytics on large volumes of IoT data. Using SQL, teams can:

  • Join telemetry with device and asset metadata

  • Run window functions and time-based analysis

  • Perform real-time aggregations on fresh data

  • Analyze long-term historical datasets

  • Execute ad hoc queries without pre-aggregation

This unified analytical model supports dashboards, anomaly detection workflows, alerting systems, and predictive analytics directly on IoT data.


Multi-Model Data Support for Complex IoT Payloads

IoT payloads are rarely uniform. Devices may emit structured metrics, nested JSON objects, location data, logs, or AI features, often within the same system.

An IoT database must handle this diversity without forcing teams to manage multiple specialized engines.

CrateDB supports multiple data models within a single database, covering structured metrics, nested JSON objects, location data, logs, and AI features.

All data types are accessible through SQL, enabling flexible analytics across heterogeneous IoT datasets.
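To make this concrete, here is a minimal sketch of how one heterogeneous device message might map onto a single table that mixes scalar columns, a geo point, a dynamic object, and a full-text indexed column. The `iot_events` schema and the message field names are hypothetical; the geo point is passed as a `[longitude, latitude]` array.

```python
# Hypothetical table mixing several data models in one place.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS iot_events (
    device_id TEXT,
    ts TIMESTAMP WITH TIME ZONE,
    location GEO_POINT,
    payload OBJECT(DYNAMIC),
    log_message TEXT INDEX USING FULLTEXT
)
"""

INSERT_EVENT = (
    "INSERT INTO iot_events (device_id, ts, location, payload, log_message) "
    "VALUES (?, ?, ?, ?, ?)"
)

# Keys that map to dedicated columns; everything else goes into `payload`.
FIXED_KEYS = {"device_id", "ts", "lat", "lon", "log"}


def to_row(message):
    """Split a raw device message into column values for INSERT_EVENT."""
    nested = {k: v for k, v in message.items() if k not in FIXED_KEYS}
    return [
        message["device_id"],
        message["ts"],
        [message["lon"], message["lat"]],  # geo point order: longitude first
        nested,                            # stored as a nested object
        message.get("log"),                # None when the device sent no log
    ]


message = {
    "device_id": "pump-17",
    "ts": 1700000000000,  # epoch milliseconds
    "lat": 47.41,
    "lon": 9.74,
    "log": "pressure threshold exceeded",
    "pressure": {"value": 8.2, "unit": "bar"},
    "firmware": "2.4.1",
}
row = to_row(message)
print(row)
```

Once stored, nested values can be addressed in SQL with subscript notation, for example `payload['pressure']['value']`.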

Designed for Industrial, Hybrid, and Edge Environments

IoT systems operate across diverse environments, from centralized cloud platforms to factories, vehicles, and remote locations.

CrateDB adapts to these architectures by supporting cloud, on-premise, hybrid, and edge deployments with the same core capabilities. Lightweight edge nodes enable local data processing, while synchronization with central clusters supports global analytics and coordination.

This flexibility makes CrateDB well suited for industrial IoT, fleet operations, utilities, energy systems, and distributed sensor networks.


Geospatial and Mobility Analytics at Scale

Many IoT platforms track mobile or location-aware devices. Geospatial analytics is therefore a core requirement, not an add-on.

CrateDB includes native geospatial support that enables real-time and large-scale location analytics, including:

  • Geofencing and proximity analysis

  • Location-based aggregations

  • Fleet and asset movement analysis

  • Operational zone and coverage monitoring

These capabilities allow teams to analyze mobility patterns and spatial behavior directly within the IoT database.
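A proximity check of this kind might look like the following sketch. It uses CrateDB's `distance` scalar function, which returns the distance between two geo points in meters. The `vehicles` table, its `location` column, and the helper name are assumptions, and the statement has not been verified against a specific CrateDB version.

```python
# Find devices within a radius of a center point, nearest first.
GEOFENCE_QUERY = (
    "SELECT device_id, distance(location, ?) AS meters "
    "FROM vehicles "
    "WHERE distance(location, ?) <= ? "
    "ORDER BY meters "
    "LIMIT 100"
)


def geofence_request(lon, lat, radius_m):
    """Build an HTTP /_sql payload for a circular geofence lookup."""
    # WKT points list longitude first, then latitude.
    center = f"POINT ({lon} {lat})"
    # The center is passed twice: once for the SELECT, once for the WHERE.
    return {"stmt": GEOFENCE_QUERY, "args": [center, center, radius_m]}


request = geofence_request(4.9041, 52.3676, 500)
print(request["args"])
```

The payload can be posted to the `/_sql` endpoint in the same way as any other parameterized statement.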


Search and AI-Ready Analytics for IoT Data

IoT data often includes logs, metadata, and signals that benefit from search or AI-powered analysis. Adding separate search or vector engines increases complexity and latency.

CrateDB integrates full-text search and vector search directly into SQL, enabling:

  • Text search on logs and metadata

  • Similarity search on vector embeddings

  • Real-time feature extraction

  • Unified storage for sensors, logs, and AI features

This makes it possible to build predictive maintenance, anomaly detection, and digital twin systems without introducing additional data platforms.
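The two search styles can be sketched as parameterized statements. `MATCH` is CrateDB's full-text predicate and `KNN_MATCH` its k-nearest-neighbor predicate for `FLOAT_VECTOR` columns; the `device_logs` and `signal_embeddings` tables, their columns, and the helper functions are hypothetical, and passing the search term and vector as parameters is an assumption to check against your CrateDB version.

```python
# Full-text search over a column that has a FULLTEXT index.
LOG_SEARCH = (
    "SELECT device_id, log_message, _score "
    "FROM device_logs "
    "WHERE MATCH(log_message, ?) "
    "ORDER BY _score DESC LIMIT 20"
)

# k-nearest-neighbor search over a FLOAT_VECTOR column.
SIMILAR_SIGNALS = (
    "SELECT device_id, _score "
    "FROM signal_embeddings "
    "WHERE KNN_MATCH(embedding, ?, ?) "
    "ORDER BY _score DESC"
)


def log_search_request(term):
    """Payload for a full-text search on log messages."""
    return {"stmt": LOG_SEARCH, "args": [term]}


def similarity_request(embedding, k=5):
    """Payload for finding the k stored vectors closest to `embedding`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {"stmt": SIMILAR_SIGNALS, "args": [list(embedding), k]}


request = similarity_request((0.12, 0.80, 0.33, 0.05), k=3)
print(request["args"])
```

The query vector must have the same number of dimensions as the column it is compared against.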


Reliability and Security for Connected Systems

IoT platforms require continuous operation and strong protection for device and operational data.

CrateDB is built on a shared-nothing architecture with automatic replication and fault tolerance. It supports encryption at rest, TLS for data in transit, and role-based access control. CrateDB Cloud is certified under ISO 27001 and SOC 2 standards.

These capabilities support both production reliability and compliance requirements in connected environments.


Digital Twins and Contextualized IoT Analytics

Digital twins require more than raw telemetry. They rely on metadata that describes physical entities, data sources, quality constraints, and modeling assumptions.

CrateDB provides a unified repository for storing and retrieving digital twin metadata alongside time-series data. This enables real-time contextualization of telemetry and seamless transitions between technical and business views.

By integrating analytics and AI technologies directly on this data, teams can run complex algorithms, machine learning models, and statistical analysis without moving data between systems.


Open Source IoT Database with Flexible Deployment Options

CrateDB is open source, which helps reduce infrastructure costs, and it benefits from an active community.

Whether you choose a fully managed cloud service or a self-managed deployment, CrateDB provides a flexible foundation for building fast, scalable IoT analytics systems.


When to Use CrateDB as Your IoT Database

CrateDB is a strong fit when your IoT workloads involve:

  • High volumes of continuously ingested device data

  • High-cardinality identifiers and dimensions

  • Real-time analytics requirements

  • Operational dashboards and customer-facing applications

  • The need to simplify data architecture while scaling

By unifying ingestion, analytics, search, and AI readiness in a single system, CrateDB enables faster insights with lower architectural complexity.


Sensor data queries with SQL

Hyper-fast. Results in milliseconds.

 

        

/* Based on IoT devices reports, this query returns the voltage variation over time
for a given meter_id */
WITH avg_voltage_all AS (
    SELECT meter_id,
           avg("Voltage") AS avg_voltage,
           date_bin('1 hour'::INTERVAL, ts, 0) AS time
    FROM iot.power_consumption
    WHERE meter_id = '840072572S'
    GROUP BY 1, 3
    ORDER BY 3
)
SELECT time,
       (avg_voltage - lag(avg_voltage) over (PARTITION BY meter_id ORDER BY time)) AS var_voltage
FROM avg_voltage_all
LIMIT 10;
        

+---------------+-----------------------+
|          time |           var_voltage |
+---------------+-----------------------+
| 1166338800000 | NULL                  |
| 1166479200000 |   -2.30999755859375   |
| 1166529600000 |    4.17999267578125   |
| 1166576400000 |   -0.3699951171875    |
| 1166734800000 |   -3.7100067138671875 |
| 1166785200000 |   -1.5399932861328125 |
| 1166893200000 |   -3.839996337890625  |
| 1166997600000 |    9.25               |
| 1167044400000 |    0.4499969482421875 |
| 1167174000000 |    3.220001220703125  |
+---------------+-----------------------+
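For readers less familiar with window functions, the logic of the query above can be mirrored in plain Python: bucket readings by hour, average each bucket, then subtract the previous bucket's average. This is only an illustrative sketch with made-up readings, not how CrateDB executes the query.

```python
from collections import defaultdict

HOUR_MS = 60 * 60 * 1000


def hourly_voltage_variation(readings):
    """readings: iterable of (ts_ms, voltage) pairs for a single meter."""
    buckets = defaultdict(list)
    for ts_ms, voltage in readings:
        # Same idea as date_bin('1 hour', ts, 0): floor to the hour.
        buckets[ts_ms - ts_ms % HOUR_MS].append(voltage)

    result = []
    previous_avg = None
    for bucket_start in sorted(buckets):
        values = buckets[bucket_start]
        avg = sum(values) / len(values)
        # lag() yields NULL for the first row, so the difference is None.
        delta = None if previous_avg is None else avg - previous_avg
        result.append((bucket_start, delta))
        previous_avg = avg
    return result


readings = [
    (0, 240.0),
    (1_800_000, 242.0),  # same hour as the first reading
    (3_600_000, 238.0),  # second hour
    (7_200_000, 243.5),  # third hour
]
print(hourly_voltage_variation(readings))
```

The first bucket has no predecessor, which is why the first row of the SQL result shows NULL.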
        

/* Based on IoT devices reports, this query returns the voltage corresponding to
the maximum global active power for each meter_id */
SELECT meter_id,
       max_by("Voltage", "Global_active_power") AS voltage_max_global_power
FROM iot.power_consumption
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
        

+------------+--------------------------+
| meter_id   | voltage_max_global_power |
+------------+--------------------------+
| 840070437W |                   246.77 |
| 840073628P |                   246.69 |
| 840074265G |                   246.54 |
| 840070238E |                   246.35 |
| 840070335K |                   246.34 |
| 840075190M |                   245.15 |
| 840072876X |                   244.81 |
| 840070636M |                   242.98 |
| 84007B113A |                   242.93 |
| 840073250D |                   242.28 |
+------------+--------------------------+

IoT Device Integration with Python

        

# Send IoT data to CrateDB with a simple HTTP request

import requests

def send_iot_data(device_id, temperature, humidity):
    # CrateDB's HTTP endpoint; adjust host and port for your cluster
    url = "http://localhost:4200/_sql"
    payload = {
        # Use ? placeholders with "args" instead of formatting values into the SQL string
        "stmt": "INSERT INTO iot_data (device_id, temperature, humidity) VALUES (?, ?, ?)",
        "args": [device_id, temperature, humidity],
    }

    # json= serializes the payload and sets the Content-Type header
    response = requests.post(url, json=payload, timeout=10)

    if response.status_code == 200:
        print("Data sent successfully!")
    else:
        print(f"Error sending data. Status code: {response.status_code}")

# Example usage

send_iot_data('device-3', 22.5, 58.3)
        
        
        
Talk
Smart Transport: How IoT Platforms Contribute for Real-Time E-Scooters Fleet Management

This talk, given at the AI & Big Data Expo Amsterdam 2024, looks at specific problems in managing e-scooter ride-sharing systems in major cities and demonstrates how CrateDB, through its IoT platform, tackles these challenges.

Talk
ABB: AI and Analytics applied to Industrial Data

In this talk, Marko Sommarberg, Lead Digital Strategy and Business Development at ABB, explains how ABB Ability™ Genix applies AI and analytics to unlock the value of industrial data using CrateDB.

Talk
TGW Logistics: Not All Time-Series are Equal

This talk at the IoT Tech Expo 2023 explores the complexities of industrial big data, characterized by its high variety, unstructured features, and different data frequencies. It also analyzes how these attributes influence data storage, retention, and integration when dealing with an IoT database.

Webinar
Stefan Asanin from CrateDB talking about AI strategies and technologies for IoT
Harnessing AI for IoT: Strategies & Technologies for Data Empowerment

Discover how a modern and sustainable approach to AI and IoT, built on principles of context, unification, and flexibility, can empower your business. Learn how companies are leveraging cutting-edge technologies to manage and analyze massive IoT datasets in real time, unlocking intelligent decision-making and operational efficiencies. 

Meetup
Notts IoT: Raspberry Pi, Sensors and CrateDB

In this video, CrateDB's Senior Product Evangelist, Simon Prickett, shows different types of sensors sending data to CrateDB from MicroPython, along with showing aggregations and dashboards. 

Webinar
How SPGo! builds apps for monitoring and predictive maintenance

With more than 23 years of experience in the mining and oil industry, SPGo! By PETROMIN has developed a system that monitors mining conveyor belts in real time, using more than 40,000 sensors and handling 760 million records per day.

Tutorial
Setup HiveMQ using CrateDB as consumer

This blog post gives you an overview of how to set up HiveMQ using CrateDB as a consumer.

White Paper
TGW Logistics redefines warehouse intelligence using CrateDB

TGW simplifies aggregating massive volumes of diverse data with CrateDB, gaining valuable insights to improve customer experience and competitive advantage.

White Paper
Common misconceptions about Industry 4.0 that manufacturers still believe

While IIoT adoption does require a new approach to managing and analyzing data collected in real time, this isn’t as difficult as many manufacturers believe.


Additional resources

FAQ

IoT data types primarily include structured data (e.g. sensor readings and timestamps), semi-structured data (e.g. JSON payloads), and unstructured data (e.g. images, video files, and audio recordings). The diversity of these data types requires a flexible database capable of efficiently managing and processing various formats to support different IoT applications. In CrateDB, you can store any type of data—structured, semi-structured, and unstructured—and leverage native SQL to query the data, as well as advanced time-series and search functionalities. CrateDB offers data collection and storage for any type of IoT data source: sensor data, historical data, geospatial data, operational parameters, and environmental conditions.

IoT data is stored in databases designed to handle large-scale, real-time data. Smart partitioning strategies are key to managing data across multiple time periods efficiently. Depending on the specific application and requirements, these databases can be on-premises, cloud-based, or edge-based. CrateDB is highly flexible and can be deployed on private or public cloud, on-premises, or at the edge to meet your organization's unique needs. It also supports hybrid scenarios out of the box.
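One common partitioning approach is to split a table by month using a generated column. The sketch below shows a possible table definition and a small Python helper that mirrors what `date_trunc('month', ts)` computes for a UTC timestamp; the `sensor_readings` table, its columns, and the helper name are assumptions for illustration.

```python
from datetime import datetime, timezone

# Hypothetical table partitioned by month via a generated column.
CREATE_PARTITIONED = """
CREATE TABLE IF NOT EXISTS sensor_readings (
    device_id TEXT,
    ts TIMESTAMP WITH TIME ZONE,
    value DOUBLE PRECISION,
    month TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', ts)
) PARTITIONED BY (month)
"""


def month_partition(ts_ms):
    """Return the start of the UTC month (epoch ms) that ts_ms falls into."""
    moment = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    start = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp() * 1000)


sample = int(datetime(2024, 3, 15, 12, 30, tzinfo=timezone.utc).timestamp() * 1000)
print(month_partition(sample))
```

With a layout like this, queries that filter on a time range only need to touch the matching partitions, and old months can be removed as whole partitions.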

An IoT database architecture typically includes several key components: IoT devices that generate data, a network for transmitting this data, a storage system for data accumulation, and an analytics component for data processing. The architecture is designed to manage data flow from devices to storage and ultimately to analysis platforms, ensuring timely and accurate insights. CrateDB addresses these needs by providing easy and seamless integration with popular IoT stack software such as Kafka, Grafana, and Node-RED, leveraging native SQL and the PostgreSQL Wire Protocol.

An IoT database needs to manage large volumes of data, support a variety of data types, and offer real-time data processing capabilities. It's crucial for it to be highly scalable in order to handle the increasing number of devices and data streams. CrateDB meets these requirements by providing instant query availability after data ingestion, offering millisecond response times even for complex queries across billions of records. This ensures real-time insights and responsiveness, making CrateDB the right choice for IoT applications.

An IoT database must process continuous streams of sensor values, logs, and device events without delays. It typically uses distributed ingestion, batching, and optimized write paths to accept large volumes of data per second while keeping latency low. CrateDB ingests data at high speed and makes it queryable immediately thanks to automatic indexing and parallel processing.

Yes. A modern IoT database provides real-time query capabilities on fresh data, allowing teams to run dashboards, alerts, anomaly detection, and monitoring workloads without waiting for batch jobs. CrateDB executes SQL queries on recent and historical data in milliseconds, even at large scale.

Many IoT systems require a mix of local processing at the edge and centralized analytics in the cloud. An IoT database should support both environments and allow data to synchronize across them. CrateDB runs at the edge, on-premises, and in the cloud with the same SQL interface and multi-model capabilities.

Connected devices often generate coordinates or movement patterns. An IoT database should store and query geospatial data natively, including points, shapes, and location-based operations. CrateDB supports geo_point and geo_shape data types, along with functions for distance, geofencing, routing, and spatial analysis.

Yes. IoT platforms increasingly rely on AI for anomaly detection, forecasting, and predictive maintenance. An IoT database should store the features, embeddings, and time-series inputs used by models. CrateDB supports vector data types, similarity search, and real-time feature extraction through SQL, which enables AI-driven IoT applications.