Your production floor generates sensor data every millisecond. Legal says it cannot leave the facility. Most analytics databases built in the last decade were designed for that data to transit a cloud endpoint. They were not designed for you.
Three compliance layers, not one
For DACH manufacturers, data residency is not a single checkbox. It arrives from three directions at once.
GDPR sets the baseline: it governs how personal data about people in the EU is processed and when it may be transferred outside the EU. For machine and sensor data, though, it is rarely the binding constraint on its own. Germany's IT Security Act (IT-Sicherheitsgesetz 2.0) adds obligations for operators of critical infrastructure in sectors such as energy, food, and transport, and for companies of special public interest, with stricter requirements for data handling and system security. Austria and Switzerland impose comparable national requirements in regulated sectors.
The third layer is OT network architecture. Industrial cybersecurity frameworks, including IEC 62443, BSI IT-Grundschutz for industrial systems, and the internal security policies that most DACH manufacturers derive from them, require the production network to be isolated from the public internet. Outbound connections from the OT segment to cloud endpoints are prohibited, not discouraged. This is an architectural enforcement point, not a configuration preference.
Together, these three constraints produce a hard requirement: your analytics database has to run inside the facility, on your network, without a cloud intermediary in the data path.
Why the compliance gap shows up before the proof of concept
Most cloud-first analytics databases address data residency by pointing to the region where their cloud storage runs. This satisfies GDPR data residency for the storage location, but it does not solve the problem when the production network cannot transmit data to a cloud ingestion endpoint in the first place.
The issue surfaces early in evaluations. The OT security team reviews the data path. They see traffic leaving the production network segment toward a cloud endpoint, regardless of where that endpoint is geographically located. That path is blocked by network policy. The evaluation stops before the proof of concept starts.
This is the compliance gap: tools with the right features running in the wrong architecture. No amount of configuration resolves it when the constraint is at the network perimeter.
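The rule the OT security team applies can be sketched in a few lines of Python. This is an illustration only: the address ranges and the `egress_allowed` function are hypothetical, and real enforcement happens in firewalls and network zoning, not in application code. What it shows is that the check looks at where the destination sits relative to the facility network, not at which country hosts it.

```python
import ipaddress

# Hypothetical address ranges for the facility network; replace with real ones.
FACILITY_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/16"),  # assumed OT production segment
    ipaddress.ip_network("10.30.0.0/16"),  # assumed on-site analytics segment
]


def egress_allowed(destination: str) -> bool:
    """Return True only if the destination lies inside a facility network."""
    addr = ipaddress.ip_address(destination)
    return any(addr in net for net in FACILITY_NETWORKS)


# A database node on the on-site analytics segment passes the check.
print(egress_allowed("10.30.4.12"))
# A public address fails it, wherever the endpoint is hosted.
print(egress_allowed("203.0.113.10"))
```

Here `203.0.113.10` stands in for any public ingestion endpoint. Hosting that endpoint in an EU region would not change the result.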
What a facility-native deployment actually requires
Running analytics on production sensor data inside the facility means the database has to handle distribution, replication, and multi-site coordination without cloud infrastructure doing the coordination work.
Distributed clustering with no external coordination. CrateDB uses a shared-nothing architecture. Every node is equal. The cluster handles sharding, replication, and rebalancing internally, with no external metadata service or cloud-based control plane. A CrateDB cluster in a Bavarian plant operates independently of any public endpoint. For the OT connectivity layer that feeds into this architecture, see Modern Data Historian: Real-Time Industrial and IoT Data Platform.
Standard SQL from local applications. CrateDB speaks the PostgreSQL wire protocol. Grafana dashboards, BI tools, and custom applications connect with the same drivers and client libraries they use for PostgreSQL, entirely within the facility network. Engineers query production data using standard SQL: no proprietary language, no cloud console, no internet connection required.
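As a rough sketch of what a local client connection might look like, the Python below builds connection parameters for a CrateDB node on the facility network and runs one query through a PostgreSQL driver. The host address, the `sensor_readings` table, and its columns are hypothetical. Port 5432 and the `crate` user are CrateDB defaults that a hardened installation would likely change. It assumes the third-party `psycopg2` driver is installed; the import is lazy so the parameter helper works without it.

```python
def connection_params(host: str, port: int = 5432, user: str = "crate") -> dict:
    """Build keyword arguments for a PostgreSQL-protocol connection."""
    return {"host": host, "port": port, "user": user}


def latest_readings(host: str, limit: int = 10) -> list:
    """Fetch the newest rows from a hypothetical sensor_readings table."""
    import psycopg2  # assumed to be installed; not part of the standard library

    conn = psycopg2.connect(**connection_params(host))
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT ts, sensor_id, value FROM sensor_readings "
                "ORDER BY ts DESC LIMIT %s",
                (limit,),
            )
            return cur.fetchall()
    finally:
        conn.close()
```

Calling `latest_readings("10.30.4.12")` would open a connection that never leaves the facility network, provided that address belongs to a local CrateDB node.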
Multi-site analytics through selective replication. CrateDB does not run federated queries across facility nodes in real time. Instead, each facility node replicates selected tables to a central CrateDB cluster over private WAN. You choose which tables replicate. Operational metrics go to the central cluster for cross-plant analysis. Tables containing commercially sensitive process data or worker-linked records stay on the local node and never leave the facility. Cross-plant queries run on the central cluster against the replicated tables only.
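The per-table decision described above can be written down as an explicit allowlist. The sketch below is one way to model it in Python. It assumes, which the text does not confirm, that the replication mechanism is CrateDB's logical replication, where the facility side declares a publication with `CREATE PUBLICATION ... FOR TABLE ...`. The table names other than `production_metrics`, the `TablePolicy` class, and the publication name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TablePolicy:
    name: str
    replicate_to_central: bool
    reason: str


# Hypothetical policy for one facility; only production_metrics leaves the site.
POLICIES = [
    TablePolicy("production_metrics", True, "operational metrics for cross-plant analysis"),
    TablePolicy("process_recipes", False, "commercially sensitive process data"),
    TablePolicy("operator_shift_log", False, "worker-linked records"),
]


def tables_to_replicate(policies: list) -> list:
    """Return the names of tables explicitly approved for replication."""
    return [p.name for p in policies if p.replicate_to_central]


def publication_statement(publication: str, policies: list) -> str:
    """Build the publication statement for the approved tables only."""
    tables = tables_to_replicate(policies)
    if not tables:
        raise ValueError("no tables are approved for replication")
    return f"CREATE PUBLICATION {publication} FOR TABLE {', '.join(tables)}"


print(publication_statement("plant_to_central", POLICIES))
```

Keeping the policy in one reviewable structure means a table is replicated only when someone has marked it as approved. Anything not listed as approved stays local by default.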
Once data from all facilities is replicated to the central cluster, a cross-plant availability query looks like this:
SELECT plant_id,
       AVG(availability_pct) AS avg_availability,
       SUM(downtime_minutes) AS total_downtime_min
FROM production_metrics
WHERE ts > NOW() - INTERVAL '24 hours'
GROUP BY plant_id
ORDER BY avg_availability DESC;
That query runs on the central cluster against replicated data. No facility node is queried directly. For teams already running real-time OEE dashboards, the same SQL patterns apply on-premises without modification. See OEE Analytics on Live Data: How to Move from Nightly Exports to Real-Time Dashboards.
Rauch Group: 400 data records per second, inside the facility
Rauch Group, an Austrian food manufacturing company, runs real-time production monitoring on CrateDB at 400 data records per second. Sensor data from the production floor is ingested, stored, and queryable in real time, without a cloud intermediary in the data path.
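To put the ingest rate in perspective, the short calculation below converts 400 records per second into daily and yearly row counts. It assumes the rate is sustained around the clock, which the figures above do not state, so treat the results as an upper-bound illustration rather than a measured volume.

```python
RECORDS_PER_SECOND = 400  # rate quoted above
SECONDS_PER_DAY = 24 * 60 * 60


def records_per_day(rate_per_second: int) -> int:
    """Rows written in one day at a constant rate."""
    return rate_per_second * SECONDS_PER_DAY


def records_per_year(rate_per_second: int, days: int = 365) -> int:
    """Rows written in one year at a constant rate."""
    return records_per_day(rate_per_second) * days


print(records_per_day(RECORDS_PER_SECOND))   # 34560000
print(records_per_year(RECORDS_PER_SECOND))  # 12614400000
```

At that constant rate, the local cluster would absorb roughly 34.6 million rows per day.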
The architecture satisfies two requirements at once: sub-second query latency on live production data, and full data residency within the facility. For a food manufacturer operating under EU food safety regulations and Austrian data handling requirements, neither is negotiable. A single analytics database running inside the facility handles both.
When on-premises is the right path
CrateDB Enterprise on-premises deployment is the right choice when any of the following apply:
- Your OT network prohibits outbound internet connectivity from the production segment
- Legal or compliance review has blocked cloud-first tools at the data residency step
- You need cross-plant analytics and can replicate facility data to a central on-premises cluster without routing replication through a cloud service
- Your security policy requires data to remain within a specific country, facility, or network perimeter
- A sector-specific regulation in energy, critical infrastructure, or food safety imposes data handling requirements that cloud deployment cannot meet
CrateDB OSS and CrateDB Cloud are the right paths for development environments, proofs of concept on anonymized data, and operational analytics workloads where cloud ingestion is permitted. Hybrid architectures are common in practice: CrateDB Cloud for development and non-sensitive analytics, CrateDB Enterprise on-premises for production data. Both run the same CrateDB binary and the same SQL. Data can replicate between environments where compliance permits.
Deployment: edge to cluster
For factory-edge deployments, a single CrateDB node per production line or per facility aggregates data before feeding into a facility-level cluster. The same binary runs on edge hardware with constrained resources. Data historians that feed CrateDB over OPC-UA or MQTT send their data to a local CrateDB node, so the operational analytics stack runs entirely inside the facility boundary. For the OPC-UA and MQTT ingestion setup, see How to Ingest OPC-UA and MQTT Data into SQL with Telegraf and CrateDB.
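For a sense of what a batch written to a local node might look like, the sketch below builds a bulk-insert request body for CrateDB's HTTP `_sql` endpoint, which accepts a statement plus a list of `bulk_args` rows. This is a hand-rolled illustration, not the Telegraf pipeline referenced above. The `sensor_readings` table, its columns, and the sample values are hypothetical.

```python
import json

# Hypothetical table and columns; not taken from the article.
INSERT_STMT = (
    "INSERT INTO sensor_readings (ts, line_id, sensor_id, value) "
    "VALUES (?, ?, ?, ?)"
)


def build_bulk_request(readings: list) -> str:
    """Serialize a batch of readings as one bulk-insert request body."""
    bulk_args = [
        [r["ts"], r["line_id"], r["sensor_id"], r["value"]] for r in readings
    ]
    return json.dumps({"stmt": INSERT_STMT, "bulk_args": bulk_args})


# Two sample readings with timestamps in epoch milliseconds.
batch = [
    {"ts": 1700000000000, "line_id": "line-3", "sensor_id": "temp-01", "value": 71.4},
    {"ts": 1700000000001, "line_id": "line-3", "sensor_id": "temp-02", "value": 69.8},
]
body = build_bulk_request(batch)
```

The resulting body would be sent with an HTTP POST to the `_sql` endpoint of the local node (port 4200 by default), so the request never leaves the facility network.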
CrateDB Enterprise adds the security certifications, role-based access controls, and enterprise support SLAs that regulated manufacturing environments require for production deployment. For the full edge architecture picture, including AI and vector workloads at the factory floor, see Smarter Edge, Faster Insights: How CrateDB Powers Real-Time IoT and AI at the Edge.
Analytics that respects the constraint
DACH manufacturers are not choosing between analytics and compliance. They are choosing between analytics databases designed for their architecture and those that are not.
Production sensor data that stays inside the facility can still drive real-time OEE dashboards, predictive maintenance models, and cross-plant performance comparisons. The database just has to be running inside the facility alongside that data.
To discuss on-premises and edge deployment options for CrateDB Enterprise, talk to a Solutions Engineer.