Skip to content
Blog

How to Add a New Sensor Type to Your Industrial Database Without Pipeline Downtime

A new piece of equipment arrives on the factory floor. Before a single reading reaches your dashboard, your data team must plan a schema migration. That migration means either a maintenance window, a dual-write workaround, or a backlog of unstructured readings waiting for a schema that hasn't been updated yet.

Industrial environments change faster than schemas do

Manufacturing, logistics, and energy operations add equipment continuously. New PLCs. Firmware upgrades that expose additional sensor fields. A new conveyor line that measures vibration where the old one only measured temperature. Each addition is a business event first and a data engineering problem second, but the data engineering problem always arrives on the same day.

Traditional relational databases and most time-series stores assume a fixed schema. You define columns at table creation time, and any field that doesn't fit a defined column either gets rejected, lands in an untyped blob, or triggers an ALTER TABLE statement. In an environment where sensor configurations change with every equipment refresh, that assumption breaks constantly.

The result is a choice between two bad options. You schedule a maintenance window for the schema migration and production data stops flowing while the migration runs. Or you maintain a dual-write pipeline: the old schema for existing sensors, a new schema for the new sensor type, and routing logic to direct each reading to the right table until you can consolidate. Dual-write spreads complexity across ingestion, query, and dashboard layers. It compounds with every new sensor type.

Why the problem compounds at scale

In a single distribution center, you might have thousands of sensors across dozens of asset types: temperature, vibration, current draw, pressure, cycle count, door state, conveyor speed. Not all of them emit the same fields. Firmware version 3.1 on a motor controller might emit torque. Version 2.8 didn't. A new batch of sensors from a different vendor uses a slightly different field naming convention for the same measurement.

A schema designed to accommodate every field across every asset type and firmware version becomes wide (hundreds of nullable columns, most of which are empty for any given sensor). A schema designed to be lean requires constant migration as new fields appear. Neither approach scales across multiple facilities with different equipment profiles.

The data team becomes a gatekeeper between the factory floor and the analytics layer. Every new sensor type lands on someone's backlog before it lands in a dashboard. The time between "equipment commissioned" and "data visible in production" is determined by migration scheduling, not by the data itself.

How CrateDB handles this with dynamic columns

CrateDB supports dynamic columns through its OBJECT(DYNAMIC) type. When you declare a column as a dynamic object, any field that arrives in that column is automatically added to the table schema at ingest: no ALTER TABLE, no migration window, no downtime.

The schema evolves with the data. New fields become queryable immediately. Existing data is unaffected.

This is different from storing sensor readings as an opaque JSON blob. CrateDB indexes the fields inside a dynamic object, which means you can filter, aggregate, and join on them the same way you would on a statically defined column. The flexibility is at the write side; the query capability is the same as a fixed schema.

Before and after: what the schema change looks like

Before: fixed schema approach

A fixed schema for sensor readings might look like this:

CREATE TABLE sensor_readings (
  sensor_id    TEXT,
  asset_type   TEXT,
  ts           TIMESTAMP WITH TIME ZONE,
  temperature  DOUBLE,
  vibration    DOUBLE,
  pressure     DOUBLE,
  current_draw DOUBLE
);

Adding a new sensor type that emits a torque field requires:

ALTER TABLE sensor_readings ADD COLUMN torque DOUBLE;

In traditional databases, an ALTER TABLE operation may lock the table during execution, depending on the table size, the specific operation, and the database configuration. In CrateDB, schema changes are designed to be performed online, so they do not require locking the table for reads and writes during execution. On a table receiving continuous writes from thousands of sensors, that is a production event requiring a planned window.

After: dynamic columns in CrateDB

With CrateDB, the same table uses a dynamic object column for the sensor payload:

CREATE TABLE sensor_readings (
  sensor_id  TEXT,
  asset_type TEXT,
  ts         TIMESTAMP WITH TIME ZONE,
  readings   OBJECT(DYNAMIC)
);

A temperature sensor writes:

INSERT INTO sensor_readings (sensor_id, asset_type, ts, readings)
VALUES (
  'motor-001',
  'conveyor-motor',
  NOW(),
  {"temperature": 72.4, "vibration": 0.003, "current_draw": 14.2}
);

When new motor controller firmware adds a torque field, the same table accepts it without any schema change:

INSERT INTO sensor_readings (sensor_id, asset_type, ts, readings)
VALUES (
  'motor-042',
  'conveyor-motor',
  NOW(),
  {"temperature": 71.1, "vibration": 0.004, "current_draw": 13.9, "torque": 198.5}
);

torque is now a sub-column of readings. It is indexed. You can query it immediately:

SELECT sensor_id, ts, readings['torque']
FROM sensor_readings
WHERE asset_type = 'conveyor-motor'
  AND readings['torque'] > 200
  AND ts > NOW() - INTERVAL '1 hour';

No migration window. No dual-write pipeline. No backlog.

What type does the new column get?
CrateDB infers the data type from the first record that includes the new field. Because torque: 198.5 is a floating-point number, CrateDB stores it as DOUBLE. That inferred type becomes a permanent part of the table schema, not a runtime guess on each read, but a committed column definition applied from the first insert onward.

You can confirm the updated schema with SHOW CREATE TABLE:

SHOW CREATE TABLE sensor_readings;

After the first insert containing torque, the output includes it as a typed sub-column inside the readings object:

CREATE TABLE "doc"."sensor_readings" (
   "sensor_id" TEXT,
   "asset_type" TEXT,
   "ts" TIMESTAMP WITH TIME ZONE,
   "readings" OBJECT(DYNAMIC) AS (
      "temperature" DOUBLE,
      "vibration" DOUBLE,
      "current_draw" DOUBLE,
      "torque" DOUBLE
   )
)

Type inference is a one-way commitment. Once torque is registered as DOUBLE, a subsequent insert that sends torque as a string will fail type validation. You get the flexibility of schema evolution at write time, and the type safety of a fixed schema on all subsequent writes.

What you keep when the schema is dynamic

Dynamic columns do not sacrifice query capability.

You can aggregate across dynamic fields the same way you aggregate across static columns. You can JOIN sensor_readings to a production schedule or an asset registry to calculate OEE metrics against live sensor data (the approach covered in depth in the OEE analytics post). You can apply GROUP BY, window functions, and time-series aggregations to dynamically added fields without rewriting queries or modifying schemas.

The dynamic column behavior applies at the object level, not the table level. You can mix static and dynamic columns in the same table. Fields that are known at design time (sensor_id, ts, asset_type) are defined statically for performance and clarity. Only the sensor payload, where variance is expected and ongoing, is dynamic.

The tradeoff is intentional: you decide at table design time which parts of the schema are stable and which parts should evolve. That decision maps directly to how industrial data actually behaves: stable identifiers and timestamps on one side, variable measurement payloads on the other.

Schema flexibility as an operational property

Schema migration is not a one-time event in industrial environments. Equipment refreshes, firmware updates, new vendor integrations, and edge deployments all produce new fields on a continuous basis. A data architecture that treats schema changes as exceptional events is constantly out of step with the normal rhythm of the factory floor.

CrateDB's dynamic columns treat schema evolution as an operational property of the database rather than an administrative task that requires planning, coordination, and downtime. The data team stops being a gatekeeper between new hardware and production dashboards.

For teams managing sensor data across multiple facilities (with different equipment profiles, different firmware versions, and different vendor conventions) that shift has a direct effect on how quickly new asset types move from commissioning to queryable data.

 

Query your sensor stream against a live industrial dataset, no installation required. Run queries on live data on cratedb.com/explore.

For the broader picture of what industrial IoT workloads require from a database, Why Industrial IoT Data Breaks Traditional Databases covers schema flexibility alongside the other constraints that industrial time-series data imposes. The Modern Data Historian overview covers how CrateDB fits into the OT/IT architecture where these sensor streams originate.