Is InfluxDB Getting Slower as Your Device Fleet Grows?
If you've hit InfluxDB's cardinality wall, you're not alone.
You started with a clean InfluxDB deployment. A handful of sensor types, a few thousand devices, queries running fast. Then your deployment grew: more device models, more geographic regions, more metadata dimensions. And something changed: queries slowed, memory usage climbed, your ops team started getting paged.
That's cardinality blowup. It's one of the most well-documented pain points in the time-series database community, and it's not a configuration problem. It's an architectural one.
This page is an honest comparison of CrateDB and InfluxDB for IoT and industrial sensor data workloads. We'll explain where each database excels, where each one struggles, and how to decide which is right for your situation.
What is high cardinality, and why does it matter for IoT?
In time-series databases, cardinality refers to the number of unique tag combinations used to identify your data streams. In InfluxDB's data model, every unique combination of measurement and tag values creates a new series. This is fine at small scale, but IoT deployments are cardinality multipliers by nature:
- Each device ID is a tag value
- Each sensor type, firmware version, location, and customer tenant is a tag value
- Cross those dimensions together and your series count explodes
A deployment with 50,000 devices × 10 sensor types × 200 customer tenants creates 100 million potential tag combinations.
InfluxDB v1 and v2 index every unique series and keep much of that index in memory. That index is what causes memory pressure, query degradation, and eventual out-of-memory failures as your deployment scales.
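The arithmetic above can be reproduced in a few lines. This is a minimal sketch, not a model of InfluxDB's internals: the dimension names and the `potential_series` helper are hypothetical, and the product is an upper bound, since real fleets rarely populate every combination.

```python
from math import prod


def potential_series(dimensions: dict[str, int]) -> int:
    """Upper bound on series count: the product of distinct values per tag."""
    return prod(dimensions.values())


# Hypothetical fleet matching the example in the text.
fleet = {"device_id": 50_000, "sensor_type": 10, "tenant": 200}
print(f"{potential_series(fleet):,}")  # 100,000,000

# Adding one more tag with only five distinct values multiplies the bound by five.
fleet["firmware_version"] = 5
print(f"{potential_series(fleet):,}")  # 500,000,000
```

The point of the second call is that each new tag multiplies the bound rather than adding to it, which is why metadata growth hurts more than device growth alone.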
CrateDB is built on a fundamentally different architecture. It uses a columnar store with automatic indexing on every field, with no in-memory series index. Adding a new dimension — a new device type, a new metadata field — doesn't change the performance profile. You can query freely across any combination of dimensions without pre-planning your tag schema.
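To make "any combination of dimensions" concrete, here is a hedged sketch that builds an hourly rollup query for whichever dimensions you pass in and runs it over the PostgreSQL wire protocol. The table name `sensor_readings`, its columns (`ts`, `value`, and the dimension columns), the connection defaults, and both helper functions are assumptions made for illustration. `psycopg2` is a third-party driver that is only imported when the query is actually run.

```python
# Hypothetical table and column names; adjust them to your own schema.
TABLE = "sensor_readings"
ALLOWED_DIMENSIONS = {"device_id", "sensor_type", "tenant", "region", "firmware_version"}


def build_rollup_query(dimensions):
    """Return SQL that averages `value` per hour, grouped by the given dimensions."""
    unknown = [d for d in dimensions if d not in ALLOWED_DIMENSIONS]
    if unknown:
        # Column names cannot be bound as parameters, so check them against a whitelist.
        raise ValueError(f"unknown dimension(s): {unknown}")
    prefix = "".join(f"{d}, " for d in dimensions)
    return (
        f"SELECT {prefix}date_trunc('hour', ts) AS hour, avg(value) AS avg_value "
        f"FROM {TABLE} "
        f"WHERE ts >= %s "
        f"GROUP BY {prefix}date_trunc('hour', ts) "
        f"ORDER BY hour"
    )


def run_rollup(dimensions, since, host="localhost", port=5432, user="crate"):
    """Run the rollup against a cluster reachable over the PostgreSQL wire protocol."""
    import psycopg2  # third-party driver, imported lazily so the builder works without it

    conn = psycopg2.connect(host=host, port=port, user=user)
    try:
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute(build_rollup_query(dimensions), (since,))
            return cur.fetchall()
    finally:
        conn.close()


print(build_rollup_query(["tenant", "sensor_type"]))
```

Grouping by a different set of dimensions only changes the argument, for example `build_rollup_query(["region", "firmware_version"])`. Nothing in the schema has to be redesigned first.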
Side-by-side comparison
| | CrateDB | InfluxDB OSS / Cloud |
| Query language | Standard SQL | InfluxQL (v1), Flux (v2); InfluxDB 3.x: SQL via Flight SQL |
| High-cardinality handling | No cardinality limits. Columnar architecture | Cardinality limits are a hard architectural constraint in v1/v2 |
| Multi-model support | Time-series, JSON, relational, vector, full-text, and geospatial in one engine | Primarily time-series; JSON/document support limited |
| Schema flexibility | Dynamic schema: add fields without DDL or downtime | Tags and fields can be added on write, but the tag-vs-field split must be planned up front and is costly to change later |
| Horizontal scalability | Shared-nothing distributed cluster, automatic sharding and replication | OSS: single-node only. |
| BI & analytics tool compatibility | PostgreSQL wire protocol. Works with Grafana, Tableau, Metabase, DBeaver, etc. | Native connectors required; broader tool support in v3 |
| Vector/AI workloads | Native vector search. No separate vector DB needed | Not supported |
| Open source | Yes (Apache 2.0 core) | OSS v1/v2 available; v3 source-available with restrictions |
| Self-managed deployment | Docker, Kubernetes, bare metal, cloud VMs, edge | Docker, Kubernetes |
| Managed cloud | CrateDB Cloud (free tier available) | InfluxDB Cloud |
Where InfluxDB wins
InfluxDB is a mature, purpose-built time-series database, and for many workloads it remains an excellent choice:
- Simple, single-dimensional monitoring workloads. If you're storing metrics for a fixed set of servers or services with a stable, low-cardinality tag schema, InfluxDB v1/v2 is fast and well-understood.
- Deep Telegraf ecosystem. InfluxDB's Telegraf agent has hundreds of input plugins and is the default choice for infrastructure and DevOps monitoring pipelines.
- InfluxQL familiarity. Teams already fluent in InfluxQL and Flux have existing skills and tooling that don't transfer easily.
Where CrateDB wins
CrateDB was designed from the ground up for the kind of data that breaks InfluxDB:
- Unbounded device fleets. Whether you're managing 10,000 or 10 million devices, CrateDB's columnar architecture doesn't degrade as device count grows. Cardinality is not a concept CrateDB needs to manage.
- Rich, evolving sensor metadata. Industrial IoT data rarely stays flat: firmware versions, calibration coefficients, GPS coordinates, nested JSON payloads from edge devices. CrateDB stores and queries all of it without a separate database or ETL step.
- Ad-hoc analytics alongside ingestion. CrateDB handles high-write throughput and complex analytical queries simultaneously. You can run cross-device aggregations, multi-dimensional slice-and-dice queries, and dashboard queries on live data without moving data to a separate analytics system.
- SQL across your entire data model. Your data engineers, analysts, and BI tools already know SQL. CrateDB speaks PostgreSQL wire protocol, which means Grafana, Metabase, Tableau, Superset, and virtually any BI tool connects out of the box: no proprietary query language to learn.
- One system instead of three. Teams running InfluxDB often end up pairing it with a document store for device metadata and a search engine for full-text queries. CrateDB replaces all three.
Real workloads where customers switched
TGW Logistics manages over 900,000 sensors across distribution centers worldwide, processing 30,000 messages per second. Their previous stack couldn't handle the combination of high-frequency sensor data and the multi-dimensional analytics required for warehouse intelligence. CrateDB now serves as their single database for ingestion, analytics, and operational queries.
SPGo! Business Intelligence captures data from 30,000 sensors per mine, representing 750 million records per day. They rely on CrateDB for real-time predictive maintenance analytics across a schema that grows as new sensor types are added, without schema migrations or performance degradation.
Gantner Instruments uses CrateDB to analyze data from hundreds of thousands of industrial sensors in real time, with sub-millisecond frontend query latency for power, temperature, pressure, speed, and torque measurements.
Which should you choose?
Choose InfluxDB if:
- Your workload is primarily infrastructure/DevOps monitoring with a stable, low-cardinality schema
- You're heavily invested in the Telegraf ecosystem
- Your team is fluent in InfluxQL or Flux and prefers InfluxDB's purpose-built time-series model
- You have no plans to query sensor data alongside JSON metadata, full-text, or vector embeddings
Choose CrateDB if:
- You're managing large, growing device fleets where cardinality is or will become a problem
- Your sensor data has rich, evolving metadata that lives alongside the time-series readings
- You need analysts and BI tools to query the same data your ingestion pipeline writes to
- You're building toward AI-driven applications (anomaly detection, predictive maintenance) and don't want to maintain a separate vector store
- You want standard SQL and PostgreSQL compatibility across your entire data platform
Also evaluating ClickHouse? If you're outgrowing Postgres and InfluxDB is just one option on your shortlist, read our CrateDB vs. ClickHouse comparison to see how the two databases compare for analytics-heavy workloads.
Try CrateDB on your data
The fastest way to see if CrateDB handles your workload is to run it yourself. Start with a free CrateDB Cloud instance — no credit card required — and run your existing queries against your actual data.