With the release of CrateDB v4.8, we’re adding logical replication to CrateDB, as well as several improvements and enhancements to the COPY FROM/TO command.
At a glance
- Logical Replication 🎉
- Improvements and enhancements to the COPY FROM/TO command
  - Optional type validation when importing data
  - Support for S3-compatible storage endpoints
  - CSV file handling improvements
    - Import CSV files without headers
    - Import a subset of data from CSV files with headers
  - Run operations asynchronously in the background
- Continue below for more details, or find the full release notes here
Logical Replication
CrateDB is designed for operational analytics use cases, and with this release we add functionality that lets users replicate all of their data, or a selected subset of it, to another CrateDB cluster.
Logical replication allows users to publish individual tables or all tables in a database; all operations performed on the original table are then replicated to the subscribing table. Multiple CrateDB clusters can subscribe to multiple publications, so one-to-many relations are possible.
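As a minimal sketch, a publication is created on the source cluster and a subscription on the target cluster. The table name, host, port, and credentials below are placeholders; the exact connection parameters are described in the CrateDB documentation.

```sql
-- On the publishing (source) cluster: expose one table, or use FOR ALL TABLES.
CREATE PUBLICATION sensor_pub FOR TABLE doc.sensor_readings;

-- On the subscribing (target) cluster: connect to the publisher and start replicating.
CREATE SUBSCRIPTION sensor_sub
  CONNECTION 'crate://source-cluster.example.com:4300?user=repl_user&password=secret'
  PUBLICATION sensor_pub;
```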
Logical replication is particularly valuable for two use cases:
- Centralized reporting: Consolidating data from any number of locations in one central place lets users run analytics across multiple sources with ease. For example, data from on-premises installations can be replicated into a central cloud database while remaining collected and available where it is generated.
- Central storage and local replicas with low-latency access: Replicate data from a large central cluster to many smaller, localized clusters. Data is collected where it is created, and only the relevant subset is replicated to each local cluster, so it can be accessed close to that location with lower latency. There is no need to duplicate everything, as only selected data is replicated to selected locations.
Enhancements to COPY FROM/TO
COPY FROM/TO is used to import or export data and is especially important for the initial data import, making it a key part of onboarding.
The enhancements provide a more robust experience when importing data and extend the range of platforms that can be used as data sources and destinations.
This functionality helps users handle CSV files more robustly and makes initial or continuous data loading easier and less error-prone. In particular, the added type validation catches mismatches between the source data and the target table schema during the import, creating a better initial user experience.
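As an illustration, the RETURN SUMMARY clause of COPY FROM reports per-node counts of imported and rejected rows, which is one place where validation failures such as type mismatches become visible. The table name and URI below are placeholders.

```sql
-- Import a JSON-lines file and get a per-node summary of imported rows
-- and any rows rejected during validation (table and URI are placeholders).
COPY doc.sensor_readings
  FROM 'file:///data/sensor_readings.json'
  RETURN SUMMARY;
```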
Not requiring a header, and allowing specific columns to be skipped when importing data, gives users additional flexibility with respect to their data sources.
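A minimal sketch of a headerless CSV import, assuming the header option and the optional target column list of COPY FROM; the file path, table, and column names are placeholders to be adapted to your schema.

```sql
-- The CSV file has no header row, so values are mapped in order
-- to the listed target columns (ts, sensor_id, value are placeholders).
COPY doc.sensor_readings (ts, sensor_id, value)
  FROM 'file:///data/readings_no_header.csv'
  WITH (format = 'csv', header = false);
```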
Allowing the use of other S3-compatible storage increases flexibility and provides options beyond AWS S3 itself.
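For example, an S3-compatible endpoint such as a self-hosted MinIO instance can be addressed by including the host and port in the s3 URI. The credentials, endpoint, bucket, and the protocol option used here are placeholders and assumptions to be checked against the COPY documentation.

```sql
-- Export a table to an S3-compatible store reachable at minio.internal:9000
-- over plain HTTP (all credentials and names are placeholders).
COPY doc.sensor_readings
  TO DIRECTORY 's3://access_key:secret_key@minio.internal:9000/backups/sensor_readings'
  WITH (protocol = 'http');
```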
Read the full release notes here.