ingestr

ingestr is a command-line application for copying data from any source to any destination database. It supports CrateDB on both the source and destination sides. ingestr builds on dlt.

  • Single command: ingestr allows copying & ingesting data from any source to any destination with a single command.

  • Many sources & destinations: ingestr supports all common source and destination databases.

  • Incremental Loading: ingestr supports both full-refresh and incremental loading modes.

Synopsis

Export data from CrateDB into a DuckDB database file.

ingestr ingest \
    --source-uri 'crate://crate@localhost:4200/' \
    --source-table 'sys.summits' \
    --dest-uri 'duckdb:///cratedb.duckdb' \
    --dest-table 'dest.summits'

Load data from a CSV file into CrateDB.

ingestr ingest \
    --source-uri 'csv://input.csv' \
    --source-table 'sample' \
    --dest-uri 'cratedb://crate:@localhost:5432/?sslmode=disable' \
    --dest-table 'doc.sample'
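
When the transfer needs to run from a script rather than an interactive shell, the same invocation can be assembled programmatically. The following is a minimal Python sketch, not part of ingestr itself: the helper names `build_ingest_command` and `run_ingest` are hypothetical, only the four options shown in the synopsis above are used, and the `ingestr` executable is assumed to be available on the PATH.

```python
import shlex
import subprocess


def build_ingest_command(source_uri, source_table, dest_uri, dest_table):
    """Return the argv list for `ingestr ingest`, using only the four
    options shown in the synopsis (hypothetical helper, not an ingestr API)."""
    return [
        "ingestr", "ingest",
        "--source-uri", source_uri,
        "--source-table", source_table,
        "--dest-uri", dest_uri,
        "--dest-table", dest_table,
    ]


def run_ingest(command):
    """Run the command; assumes the `ingestr` executable is on the PATH.
    Raises subprocess.CalledProcessError on a non-zero exit status."""
    subprocess.run(command, check=True)


# Same values as the CSV-to-CrateDB example in the synopsis.
command = build_ingest_command(
    source_uri="csv://input.csv",
    source_table="sample",
    dest_uri="cratedb://crate:@localhost:5432/?sslmode=disable",
    dest_table="doc.sample",
)

# Print a copy-pasteable, shell-quoted form instead of executing it.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems with characters such as `?` in the destination URL. Call `run_ingest(command)` to actually execute the transfer.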

Note

The CrateDB source and destination URLs differ subtly. --source-uri=crate://... addresses CrateDB’s SQLAlchemy dialect, so the source adapter uses CrateDB’s HTTP protocol (port 4200 in the example above). --dest-uri=cratedb://... is effectively a PostgreSQL connection URL whose URL scheme designates CrateDB, so the destination adapter uses CrateDB’s PostgreSQL interface (port 5432 in the example above).
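
The difference between the two URL styles becomes visible when the example URLs are taken apart. The sketch below uses only Python's standard `urllib.parse` module. The `describe` helper and the `INTERFACES` mapping are illustrative names, and the mapping merely restates the note above rather than querying ingestr.

```python
from urllib.parse import parse_qs, urlsplit

# Scheme-to-interface mapping as described in the note (illustrative only).
INTERFACES = {
    "crate": "HTTP (SQLAlchemy dialect)",
    "cratedb": "PostgreSQL interface",
}


def describe(uri):
    """Split a connection URL into the parts that differ between both styles."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "interface": INTERFACES.get(parts.scheme, "unknown"),
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "options": parse_qs(parts.query),
    }


# The two CrateDB URLs from the synopsis.
source = describe("crate://crate@localhost:4200/")
dest = describe("cratedb://crate:@localhost:5432/?sslmode=disable")

for label, info in (("source", source), ("destination", dest)):
    print(label, info)
```

Ports 4200 and 5432 are the ones used in the examples above. They correspond to CrateDB's HTTP and PostgreSQL endpoints respectively, so swapping the scheme without also adjusting the port would point the adapter at the wrong interface.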

Coverage

ingestr supports migration from 20-plus databases, data platforms, and analytics engines, including all databases supported by SQLAlchemy.

Traditional Databases

CockroachDB, CrateDB, Firebird, HyperSQL (hsqldb), IBM DB2 and Informix, Microsoft Access, Microsoft SQL Server, MonetDB, MySQL and MariaDB, OpenGauss, Oracle, PostgreSQL, SAP ASE, SAP HANA, SAP Sybase SQL Anywhere, SQLite, TiDB, YDB, YugabyteDB

Cloud Data Warehouses & Analytics

Amazon Athena, Amazon Redshift, Databend, Databricks, Denodo, DuckDB, EXASOL DB, Firebolt, Google BigQuery, Greenplum, IBM Netezza Performance Server, Impala, Kinetica, Rockset, Snowflake, Teradata Vantage

Specialized Data Stores

Apache Drill, Apache Druid, Apache Hive and Presto, ClickHouse, Elasticsearch, InfluxDB, MongoDB, OpenSearch

Message Brokers

Amazon Kinesis, Apache Kafka (Amazon MSK, Confluent Kafka, Redpanda, RobustMQ)

File Formats

CSV, JSONL/NDJSON, Parquet

Object Stores

Amazon S3, Google Cloud Storage

SaaS Platforms & Services

Airtable, Asana, GitHub, Google Ads, Google Analytics, Google Sheets, HubSpot, Notion, Personio, Salesforce, Slack, Stripe, Zendesk, etc.

Learn

Documentation: ingestr CrateDB source

Documentation about the CrateDB source adapter for ingestr.

https://bruin-data.github.io/ingestr/supported-sources/cratedb.html#source

Documentation: ingestr CrateDB destination

Documentation about the CrateDB destination adapter for ingestr.

https://bruin-data.github.io/ingestr/supported-sources/cratedb.html#destination

Examples: Use ingestr with CrateDB

Executable code examples that demonstrate how to use ingestr to load data from Kafka into CrateDB.

https://github.com/crate/cratedb-examples/tree/main/application/ingestr

Video tutorials

A few video tutorials about using ingestr with Google Analytics, Shopify, and Kafka.

Ingest from Google Analytics
Ingest from Shopify
Ingest from Kafka