The InfluxDB Cardinality Problem: Why High-Cardinality Industrial Data Breaks It

Your Telegraf pipeline into InfluxDB worked fine at 50 sensor types. At 200, dashboard queries started slowing. At 500 sensor types across multiple production sites, adjusting shard duration and tuning compaction settings stopped making a measurable difference. Teams evaluating InfluxDB alternatives at industrial scale often reach this point before they understand why.

The reason is series cardinality. At industrial scale, it is structural, not a configuration issue.

What InfluxDB's TSM model does

InfluxDB organizes data using measurements, tags, and fields. Tags are the metadata attached to each write: sensor type, asset ID, plant location, shift code. Fields are the values you record: temperature, pressure, vibration, flow rate.

The detail that defines the cardinality problem: InfluxDB identifies a series by its measurement plus a unique combination of tag values, and the TSM (Time-Structured Merge Tree) storage engine and its index track each series separately. Cardinality is driven by tag values, not tag keys: the cost is not "sensor_type is a dimension" but "every unique value of sensor_type, crossed with every unique value of asset_id, crossed with every unique value of plant_id."
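To make the series definition concrete, here is a minimal Python sketch. It is not InfluxDB's implementation. It assumes a series is identified by the measurement name plus its sorted tag pairs, loosely mirroring the line protocol layout. The `series_key` helper, the `readings` measurement, and the sample tag values are hypothetical.

```python
def series_key(measurement, tags):
    """Build an illustrative series key: measurement plus sorted tag pairs."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_part}" if tag_part else measurement


# Hypothetical writes as (measurement, tags). Field values are omitted because
# they do not affect which series a point belongs to.
points = [
    ("readings", {"sensor_type": "vibration", "asset_id": "press-01", "plant_id": "plant-07"}),
    ("readings", {"sensor_type": "vibration", "asset_id": "press-02", "plant_id": "plant-07"}),
    ("readings", {"sensor_type": "temperature", "asset_id": "press-01", "plant_id": "plant-07"}),
    # Same tag values as the first point (in a different order): no new series.
    ("readings", {"plant_id": "plant-07", "asset_id": "press-01", "sensor_type": "vibration"}),
]

unique_series = {series_key(m, t) for m, t in points}
print(len(unique_series))  # 3 distinct series from 4 writes
```

Changing any single tag value produces a new key, which is why every additional tag multiplies the potential series count.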

In small deployments, this model performs well. In industrial deployments, the math changes everything.

The math at industrial scale

Consider a representative manufacturing dataset:

  • 900 sensor types across a packaging or automotive facility
  • 50 assets: machines, lines, production cells
  • 10 production plants

Worst-case series count, if every combination reports data: 900 × 50 × 10 = 450,000 internal series before adding any operational context.

Add shift codes (3 values): 1.35 million series.

Add production line variants (5 values): 6.75 million series.
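The arithmetic above can be reproduced with a short Python sketch. The tag names `shift_code` and `line_variant` are placeholders chosen for illustration. The result is an upper bound that assumes every tag value combination actually occurs in the data.

```python
from itertools import accumulate
from operator import mul

# Distinct-value counts per tag, taken from the example above.
# Tag names are illustrative placeholders.
tag_cardinalities = [
    ("sensor_type", 900),
    ("asset_id", 50),
    ("plant_id", 10),
    ("shift_code", 3),
    ("line_variant", 5),
]


def cumulative_series_counts(cardinalities):
    """Return (tag, upper-bound series count) after each tag is added."""
    names = [name for name, _ in cardinalities]
    running = accumulate((count for _, count in cardinalities), mul)
    return list(zip(names, running))


for tag, total in cumulative_series_counts(tag_cardinalities):
    print(f"after adding {tag:<12} {total:>10,} series")
```

Each new tag multiplies the bound rather than adding to it, so a dimension with only three values still triples the series count.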

InfluxDB's in-memory index maintains a lookup from every tag value combination to its corresponding series key. As series count climbs past hundreds of thousands, memory usage grows in proportion to the number of series. Query latency scales with the number of series a query must resolve, not only with the number of data points in those series.

A query asking for average vibration by sensor type and asset over the last 4 hours must locate and aggregate results across every relevant series in the index before returning an answer. At 450,000 series, that resolution step has weight. At 6.75 million, it dominates query time.

Why configuration changes do not fix this

Teams hitting this ceiling typically try increasing RAM, adjusting shard duration, reducing retention windows, and tuning compaction aggressiveness. These measures can delay the onset of degradation. They do not change the root cause.

InfluxDB assigns series at write time, the moment data first arrives with a new tag value combination. There is no configuration option that changes how series are created. The only way to reduce series count is to remove tag dimensions from the schema, which means losing the ability to query along those dimensions.

The diagnostic test: if your query latency scales with the number of unique sensor type or asset combinations in your dataset, and not with the number of rows those queries scan, you have reached the cardinality ceiling. The storage model is doing exactly what it was designed to do. It was designed for a scale of metadata dimensions that industrial sensor data exceeds.
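One rough way to apply this test to your own benchmark results is sketched below in Python. It compares how strongly latency correlates with the number of series a query touches and with the number of rows it scans. The function names are hypothetical and the sample numbers are made up purely to show the shape of the check. How you collect series and row counts depends on your own tooling.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def looks_cardinality_bound(latency_ms, series_touched, rows_scanned):
    """True when latency tracks series count more closely than rows scanned."""
    return pearson(latency_ms, series_touched) > pearson(latency_ms, rows_scanned)


# Made-up illustrative measurements, one entry per benchmark query.
latency_ms = [120, 480, 950, 1900]
series_touched = [10_000, 40_000, 80_000, 160_000]
rows_scanned = [2_000_000, 1_900_000, 2_100_000, 2_000_000]

print(looks_cardinality_bound(latency_ms, series_touched, rows_scanned))
```

A correlation over a handful of queries is only a hint. It is most useful when the benchmark queries vary series count and row count independently.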

What InfluxDB 3 changed

InfluxDB 3 replaced the TSM engine with a columnar storage engine that persists data as Apache Parquet files. Columnar storage handles high-cardinality data significantly better because tag values are stored as column data rather than as entries in a per-series index. The series ceiling that defines InfluxDB 1.x and 2.x does not exist in the same form in InfluxDB 3. This is a genuine improvement.

Two things matter if you are currently on InfluxDB 1.x or 2.x. InfluxDB 3 does not support an in-place upgrade from earlier versions. Data must be re-ingested via ETL. The migration cost is real regardless of destination.

The query language history is also relevant. InfluxDB 2.x made Flux, a scripting language specific to InfluxDB, its primary query language in place of InfluxQL. InfluxDB 3 then deprecated Flux and returned to SQL and InfluxQL. Teams that rebuilt queries in Flux rebuilt them again. CrateDB has supported standard SQL throughout.

If you are running InfluxDB 1.x or 2.x, the question is not whether to migrate. The question is where. That decision should be based on cross-system join requirements, deployment model, and the next three years of sensor data growth, not only on solving today's cardinality ceiling.

How CrateDB stores the same data

CrateDB stores all sensor readings in a single table, regardless of sensor type. There is no series model. There is no tag-combination index.

A table that holds 900 sensor types, 50 assets, and 10 production plants:

-- Single table for all sensor types across all assets and plants
-- New sensor types are absorbed at ingest without a schema change
CREATE TABLE sensor_readings (
    ts          TIMESTAMP WITH TIME ZONE,
    sensor_type TEXT,
    asset_id    TEXT,
    plant_id    TEXT,
    value       DOUBLE PRECISION,
    metadata    OBJECT(DYNAMIC)
);

CrateDB indexes every column by default at ingestion and uses a shared-nothing architecture to distribute data across cluster nodes. A query across all sensor types in a single plant runs as a column-filtered scan against one table. Because there is no per-series index to resolve, query latency depends on the rows scanned, not on how many sensor types exist.

The query that requires InfluxDB to resolve 450,000 series looks like this in CrateDB:

-- Average and peak readings by sensor type, one plant, last 4 hours
SELECT
    sensor_type,
    DATE_TRUNC('hour', ts)  AS hour,
    AVG(value)              AS avg_value,
    MAX(value)              AS peak_value
FROM sensor_readings
WHERE
    plant_id = 'plant-07'
    AND ts >= NOW() - INTERVAL '4 hours'
GROUP BY sensor_type, hour
ORDER BY hour DESC, sensor_type;

One table. One scan. One result.

Cross-system joins in standard SQL

Industrial analytics rarely stays within sensor data alone. OEE root-cause analysis correlates vibration readings with ERP downtime codes. Shift-context queries join production output with operator assignments. InfluxDB does not support these joins natively. Correlating sensor data with records from a separate ERP or MES system requires an application-layer merge, usually scheduled, always stale.

With the downtime records loaded into a downtime_events table, CrateDB handles the join in a single SQL query:

-- Average vibration per asset during logged downtime events, last 24 hours
SELECT
    r.asset_id,
    r.sensor_type,
    AVG(r.value)       AS avg_vibration,
    d.downtime_code,
    d.description
FROM sensor_readings r
JOIN downtime_events d
    ON  r.asset_id = d.asset_id
    AND r.ts BETWEEN d.started_at AND d.ended_at
WHERE
    r.sensor_type = 'vibration'
    AND r.ts >= NOW() - INTERVAL '24 hours'
GROUP BY r.asset_id, r.sensor_type, d.downtime_code, d.description
ORDER BY avg_vibration DESC;

This query runs against data that arrived seconds ago. No export job. No application-layer merge.

Adding sensor types without a schema migration

New sensor types in InfluxDB introduce new tag value combinations and increase series count. At high cardinality, each new metadata dimension carries a measurable index cost from the first point written with it.

In CrateDB, a new sensor type is just a new value in the sensor_type column, and the OBJECT(DYNAMIC) column absorbs new key-value pairs at ingest without a schema change. New sensor types appear in query results the first time they write data. No ALTER TABLE. No pipeline downtime.

ALPLA, a global packaging manufacturer with 181 production facilities, runs 900 sensor types in a single CrateDB table. Dashboard queries that previously took 3 to 5 minutes now return in milliseconds. New sensor types come online without a schema redesign.

The migration path from InfluxDB

If you are running Telegraf into InfluxDB today, the migration path to CrateDB for new data requires a configuration change to the Telegraf output plugin. Your OT sources, Telegraf inputs, sensor tags, and ingestion pipeline stay in place. CrateDB accepts connections via the PostgreSQL wire protocol, and Telegraf supports this output natively. Historical data still has to be re-ingested separately.

A full migration guide with Telegraf configuration examples, including the output plugin diff and what to expect at each step, is in Migrating from InfluxDB to CrateDB: A Telegraf Output Plugin Swap Guide.

What to read next

For teams evaluating InfluxDB alternatives and comparing on specific capabilities (ingestion architecture, SQL support, cross-system join capability, and deployment options), the CrateDB vs. InfluxDB comparison page covers the full decision view. This post is the architecture explainer. That page is the decision tool.

For the broader industrial IoT database pattern (schema flexibility, multi-site visibility, OT protocol connectivity, and where different database architectures break down), Why Industrial IoT Data Breaks Traditional Databases covers the structural constraints that reach beyond any single vendor's storage model.

To run the query model against live sensor data before making any infrastructure decision, the Real-Time Weather Monitoring Guided Path runs in one Docker command. No account required. A live Grafana dashboard in under 30 minutes.