Load and Export¶
CrateDB offers several ways to connect and integrate with third-party ETL applications, most of them through its PostgreSQL interface.
This section lists ETL applications and frameworks that can be used together with CrateDB, and outlines how to use them effectively. See also the support for Change Data Capture (CDC) solutions.
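Because most of the tools listed here talk to CrateDB through its PostgreSQL interface, they typically only need a PostgreSQL-style connection URI. The following is a minimal sketch of how such a URI could be assembled. The helper name `cratedb_postgres_uri` is hypothetical, and the defaults (port 5432, user `crate`, schema `doc`) assume a stock local installation; adjust them for your cluster.

```python
from urllib.parse import quote


def cratedb_postgres_uri(host="localhost", port=5432, user="crate",
                         password=None, schema="doc"):
    """Build a PostgreSQL-style connection URI for a CrateDB node.

    Hypothetical helper for illustration only. The defaults are assumptions
    about a stock local installation. The path segment is assumed to be
    interpreted by CrateDB as the default schema.
    """
    # Percent-encode credentials so characters like "@" or ":" stay unambiguous.
    credentials = quote(user, safe="")
    if password:
        credentials += ":" + quote(password, safe="")
    return f"postgresql://{credentials}@{host}:{port}/{schema}"


print(cratedb_postgres_uri())
```

The resulting string can be handed to any ETL tool or driver that accepts a PostgreSQL connection URI.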
Apache Airflow / Astronomer¶
A set of starter tutorials.
Updating stock market data automatically with CrateDB and Apache Airflow
Automating stock data collection and storage with CrateDB and Apache Airflow
A set of in-depth tutorials, including blueprint implementations.
Automating export of CrateDB data to S3 using Apache Airflow
Implementing a data retention policy in CrateDB using Apache Airflow
CrateDB and Apache Airflow: Building a data ingestion pipeline
Building a hot and cold storage data retention policy in CrateDB with Apache Airflow
Tutorials and resources about configuring the managed variants, Astro and CrateDB Cloud.
Apache Flink¶
Apache Hop¶
Apache Kafka¶
Apache NiFi¶
AWS DMS¶
AWS Database Migration Service (AWS DMS) is a managed migration and replication service that helps move your database and analytics workloads between different kinds of databases quickly, securely, and with minimal downtime and zero data loss.
AWS DMS supports migration between 20-plus database and analytics engines, running either on-premises or on EC2 instances. Supported data migration sources are: Amazon Aurora, Amazon DocumentDB, Amazon S3, IBM DB2, MariaDB, Azure SQL Database, Microsoft SQL Server, MongoDB, MySQL, Oracle, PostgreSQL, SAP ASE.
The AWS DMS Integration with CrateDB uses Amazon Kinesis Data Streams as a DMS target, combined with a CrateDB-specific downstream processor element.
CrateDB provides two ways to conduct data migrations using AWS DMS: run it standalone on your own premises, or run it in a fully managed environment using AWS services and CrateDB Cloud.
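To make the role of the downstream processor more concrete, here is a rough sketch of what such an element might do with a single change record read from the Kinesis stream. It is not the CrateDB-specific processor itself: the function `dms_record_to_sql`, the `key_columns` parameter, and the target schema are hypothetical, and the record shape (a `data` object plus a `metadata` object with `record-type`, `operation`, and `table-name`) is modelled on the JSON messages DMS emits to Kinesis targets.

```python
import json


def dms_record_to_sql(raw, key_columns, target_schema="doc"):
    """Translate one DMS-style JSON change record into (sql, parameters).

    Illustrative sketch only, not the actual CrateDB processor.
    `key_columns` (an assumption) names the primary key columns of the table.
    Returns None for control records, which carry no row data.
    """
    record = json.loads(raw)
    meta = record["metadata"]
    if meta.get("record-type") != "data":
        return None

    table = f'"{target_schema}"."{meta["table-name"]}"'
    data = record["data"]
    operation = meta["operation"]
    where = " AND ".join(f'"{k}" = ?' for k in key_columns)
    key_values = [data[k] for k in key_columns]

    if operation in ("load", "insert"):
        columns = ", ".join(f'"{c}"' for c in data)
        marks = ", ".join("?" for _ in data)
        return f"INSERT INTO {table} ({columns}) VALUES ({marks})", list(data.values())
    if operation == "update":
        changed = [c for c in data if c not in key_columns]
        assignments = ", ".join(f'"{c}" = ?' for c in changed)
        return (f"UPDATE {table} SET {assignments} WHERE {where}",
                [data[c] for c in changed] + key_values)
    if operation == "delete":
        return f"DELETE FROM {table} WHERE {where}", key_values
    raise ValueError(f"Unsupported operation: {operation}")


# Made-up sample record for demonstration purposes.
sample = json.dumps({
    "data": {"id": 1, "name": "Hotzenplotz"},
    "metadata": {"record-type": "data", "operation": "insert",
                 "schema-name": "public", "table-name": "person"},
})
print(dms_record_to_sql(sample, key_columns=["id"]))
```

The sketch emits `?` placeholders and keeps values separate from the statement text, so that the statement can be executed with parameter binding rather than string interpolation.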
AWS Kinesis¶
Amazon Kinesis Data Streams is a serverless streaming data service that simplifies the capture, processing, and storage of data streams at any scale, such as application logs, website clickstreams, and IoT telemetry data, for machine learning (ML), analytics, and other applications.
The DynamoDB CDC Relay pipeline uses Amazon Kinesis to relay a table change stream from a DynamoDB table into a CrateDB table. See also DynamoDB CDC.
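Change events that DynamoDB writes to Kinesis carry item values in DynamoDB's typed attribute notation (for example `{"S": "foo"}` or `{"N": "42.5"}`), so a relay has to decode them before the data can be stored elsewhere. The following is a minimal sketch of that decoding step, not the relay's actual implementation. The function names are hypothetical, the sample event is made up, and binary types are left out for brevity.

```python
from decimal import Decimal


def decode_attribute(value):
    """Convert one DynamoDB typed attribute value into a plain Python value."""
    (type_tag, payload), = value.items()
    if type_tag == "S":
        return payload
    if type_tag == "N":
        return Decimal(payload)  # numbers arrive as strings
    if type_tag == "BOOL":
        return payload
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [decode_attribute(item) for item in payload]
    if type_tag == "M":
        return {key: decode_attribute(item) for key, item in payload.items()}
    if type_tag == "SS":
        return list(payload)
    if type_tag == "NS":
        return [Decimal(item) for item in payload]
    raise ValueError(f"Unsupported attribute type: {type_tag}")


def decode_change_event(event):
    """Return (event name, decoded keys, decoded new image or None)."""
    change = event["dynamodb"]
    keys = {k: decode_attribute(v) for k, v in change["Keys"].items()}
    image = change.get("NewImage")  # absent on REMOVE events
    new_image = None if image is None else {
        k: decode_attribute(v) for k, v in image.items()
    }
    return event["eventName"], keys, new_image


# Made-up sample event for demonstration purposes.
event = {
    "eventName": "INSERT",
    "tableName": "readings",
    "dynamodb": {
        "Keys": {"device": {"S": "foo"}},
        "NewImage": {
            "device": {"S": "foo"},
            "temperature": {"N": "42.5"},
            "tags": {"L": [{"S": "a"}]},
        },
    },
}
print(decode_change_event(event))
```

Using `Decimal` for numeric values avoids silently losing precision; a relay could convert to `float` or `int` instead if the target column type calls for it.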
Azure Functions¶
dbt¶
DynamoDB¶
InfluxDB¶
Kestra¶
MongoDB¶
Tutorial: Import data from MongoDB
Documentation: MongoDB Table Loader
Documentation: MongoDB CDC Relay
MySQL¶
Node-RED¶
Singer / Meltano¶
🚧 Please note these adapters are a work in progress. 🚧
SQL Server Integration Services¶
A demo project that uses SSIS and ODBC to read data from and write data to CrateDB: