CrateDB + Apache Airflow
Simplify your orchestration workflows with seamless automation.

CrateDB is a hyper-fast, open-source distributed database that combines the best of the SQL and NoSQL worlds. It supports the PostgreSQL wire protocol for easy integration with many data engineering tools.

Apache Airflow is a popular tool for creating, scheduling, and monitoring data pipelines. With over 9 million downloads every month, it is the de facto standard for expressing data flows as Python code.

The combination of CrateDB and Apache Airflow is a powerful solution for businesses and organizations that need to manage and analyze large amounts of data. By leveraging both tools, you can easily automate orchestration workflows and build efficient pipelines that can process data quickly and accurately, leading to valuable insights.
Learn how to get started using Apache Airflow and CrateDB
More resources on the combined value of Apache Airflow and CrateDB
Blog
Implementing a data retention policy in CrateDB using Apache Airflow
March 18, 2023
A data retention policy describes the practice of storing and managing data for a designated period of time. Once a data set completes its retention period, it should be deleted or archived, depending on requirements.
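As a rough illustration of the idea, the sketch below computes the cutoff date for a retention period and builds a parameterized DELETE statement for rows older than that cutoff. It is not the implementation from the blog post: the table name `sensor_readings`, the column name `day`, the helper names, and the `?` placeholder style are all assumptions made for this example, and the statement would still need to be executed by whatever client or Airflow operator you use.

```python
from datetime import date, timedelta


def retention_cutoff(today: date, retention_days: int) -> date:
    """Return the oldest date that is still inside the retention period."""
    if retention_days < 0:
        raise ValueError("retention_days must be non-negative")
    return today - timedelta(days=retention_days)


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_delete_statement(table: str, day_column: str) -> str:
    """Build a DELETE for rows whose day_column is older than the cutoff.

    The cutoff is left as a bind parameter ("?"); the placeholder style
    is an assumption and depends on the database driver in use.
    """
    return (
        f"DELETE FROM {quote_ident(table)} "
        f"WHERE {quote_ident(day_column)} < ?"
    )


# Hypothetical policy: keep 30 days of data in "sensor_readings".
cutoff = retention_cutoff(date(2023, 3, 18), retention_days=30)
statement = build_delete_statement("sensor_readings", "day")
```

Passing the cutoff as a bind parameter rather than formatting it into the string keeps the statement reusable across scheduled runs. An archiving policy would follow the same pattern, with the expired rows copied elsewhere before they are deleted.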
Tutorial
Automating the import of Parquet files with Apache Airflow
March 19, 2023
The workflow presented in this tutorial is a simple way to import Parquet files into CrateDB by first converting them to a .csv file. There are other approaches as well, and we encourage you to try them out.