import-jobs and export-jobs

The clusters import-jobs and clusters export-jobs commands let you manage, respectively, the import and export jobs in your CrateDB Cloud cluster.

More information about importing files and ingesting data can be found in the import data documentation.

More information about exporting data can be found in the export data documentation.

Note

Most JSON, CSV and Parquet files are supported.

Tip

Import jobs are the easiest way to get data into CrateDB Cloud. Use them to import from a local file, an arbitrary URL, or from an AWS S3-compatible service.

clusters import-jobs

Usage: croud clusters import-jobs [-h] {delete,list,create} ...

clusters import-jobs create

Import data from a file.

Usage: croud clusters import-jobs create [-h]
                                         {from-url,from-file,from-s3,from-dynamodb,from-azure-blob-storage}
                                         ...

clusters import-jobs create from-url

Note

This command will wait for the operation to finish or fail.

Create a data import job on the specified cluster from a URL.

Usage: croud clusters import-jobs create from-url [-h] --url URL --cluster-id
                                                  CLUSTER_ID --table TABLE
                                                  [--create-table {True,False}]
                                                  --file-format
                                                  {csv,json,parquet}
                                                  [--compression {gzip,none}]
                                                  [--transformations TRANSFORMATIONS]
                                                  [--region REGION]
                                                  [--output-fmt {table,wide,json,yaml}]
                                                  [--sudo]
Required Arguments
--url

The URL the import file will be read from.

--cluster-id

The cluster the data will be imported into.

--table

The table the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

Optional Arguments
--create-table

Possible choices: True, False

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--compression

Possible choices: gzip, none

The compression method the file uses.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the loaded data before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data, e.g. UNNEST(), SUM(), and similar.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example
sh$ croud clusters import-jobs create from-url --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --file-format csv --table my_table_name --url https://s3.amazonaws.com/my.import.data.gz --compression gzip
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| dca4986d-f7c8-4121-af81-863cca1dab0f | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Done importing 3 records and 36 Bytes.
==> Success: Operation completed.
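
The --compression value usually follows from the file name, so a wrapper script can derive it instead of hard-coding it. The following is a minimal sketch, not part of croud: compression_for_url is a hypothetical helper name, CLUSTER_ID is a placeholder variable, and the sketch assumes that a .gz suffix reliably indicates a gzip-compressed file.

```shell
# Hypothetical helper: guess the --compression value from a URL's suffix.
# Assumption: files ending in .gz are gzip-compressed; everything else is not.
compression_for_url() {
    # Strip any query string or fragment before looking at the suffix.
    path=${1%%[?#]*}
    case "$path" in
        *.gz) echo gzip ;;
        *)    echo none ;;
    esac
}

# Usage (only flags documented above are used):
#   url=https://s3.amazonaws.com/my.import.data.gz
#   croud clusters import-jobs create from-url --cluster-id "$CLUSTER_ID" \
#       --file-format csv --table my_table_name --url "$url" \
#       --compression "$(compression_for_url "$url")"
```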

clusters import-jobs create from-file

Note

This command will wait for the operation to finish or fail.

Create a data import job on the specified cluster from a file.

The file can be uploaded beforehand using the croud organizations files command, or you can specify a local file path.

Usage: croud clusters import-jobs create from-file [-h] [--file-id FILE_ID]
                                                   [--file-path FILE_PATH]
                                                   --cluster-id CLUSTER_ID
                                                   --table TABLE
                                                   [--create-table {True,False}]
                                                   --file-format
                                                   {csv,json,parquet}
                                                   [--compression {gzip,none}]
                                                   [--transformations TRANSFORMATIONS]
                                                   [--region REGION]
                                                   [--output-fmt {table,wide,json,yaml}]
                                                   [--sudo]
Required Arguments
--cluster-id

The cluster the data will be imported into.

--table

The table the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

Optional Arguments
--file-id

The file ID that will be used for the import. If not specified, --file-path must be specified. Refer to croud organizations files for more information.

--file-path

The file on your local filesystem that will be used. If not specified, --file-id must be specified. Note that the file will become visible under croud organizations files list.

--create-table

Possible choices: True, False

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--compression

Possible choices: gzip, none

The compression method the file uses.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the loaded data before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data, e.g. UNNEST(), SUM(), and similar.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example
sh$ croud clusters import-jobs create from-file --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --file-format csv --table my_table_name --file-id 2e71e5a6-a21a-4e99-ae58-705a1f15635c
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| 9164f886-ae37-4a1b-b3fe-53f9e1897e7d | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Done importing 3 records and 36 Bytes.
==> Success: Operation completed.
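
Since --file-id and --file-path stand in for each other, a script that accepts either can check its inputs before calling croud. The sketch below is one possible way to do that. The helper name file_source_args is hypothetical, and treating "both given" as an error is an assumption of this sketch: the reference above only states that at least one of the two must be specified.

```shell
# Hypothetical helper: print the source flag and its value, one per line.
# Arguments: FILE_ID FILE_PATH (pass an empty string for the unused one).
file_source_args() {
    file_id=$1
    file_path=$2
    if [ -n "$file_id" ] && [ -n "$file_path" ]; then
        # Assumption: this sketch rejects the ambiguous case outright.
        echo "error: give either a file ID or a file path, not both" >&2
        return 1
    elif [ -n "$file_id" ]; then
        printf '%s\n' --file-id "$file_id"
    elif [ -n "$file_path" ]; then
        printf '%s\n' --file-path "$file_path"
    else
        echo "error: a file ID or a file path is required" >&2
        return 1
    fi
}

# Dry-run usage: show which flag would be passed for an uploaded file.
file_source_args 2e71e5a6-a21a-4e99-ae58-705a1f15635c ''
```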

clusters import-jobs create from-s3

Note

This command will wait for the operation to finish or fail.

Create a data import job on the specified cluster from an Amazon S3-compatible location.

Usage: croud clusters import-jobs create from-s3 [-h] --bucket BUCKET
                                                 --file-path FILE_PATH
                                                 --secret-id SECRET_ID
                                                 [--endpoint ENDPOINT]
                                                 --cluster-id CLUSTER_ID
                                                 --table TABLE
                                                 [--create-table {True,False}]
                                                 --file-format
                                                 {csv,json,parquet}
                                                 [--compression {gzip,none}]
                                                 [--transformations TRANSFORMATIONS]
                                                 [--region REGION]
                                                 [--output-fmt {table,wide,json,yaml}]
                                                 [--sudo]
Required Arguments
--bucket

The name of the S3 bucket that contains the file to be imported.

--file-path

The absolute path in the S3 bucket that points to the file to be imported. Globbing (use of *) is allowed.

--secret-id

The secret that contains the access key and secret key needed to access the file to be imported.

--cluster-id

The cluster the data will be imported into.

--table

The table the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

Optional Arguments
--endpoint

An Amazon S3-compatible endpoint.

--create-table

Possible choices: True, False

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--compression

Possible choices: gzip, none

The compression method the file uses.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the loaded data before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data, e.g. UNNEST(), SUM(), and similar.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example
sh$ croud clusters import-jobs create from-s3 --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --secret-id 71e7c5da-51fa-44f2-b178-d95052cbe620 --bucket cratedbtestbucket \
    --file-path myfiles/cratedbimporttest.csv --file-format csv --table my_table_name
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| f29fdc02-edd0-4ad9-8839-9616fccf752b | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Done importing 3 records and 36 Bytes.
==> Success: Operation completed.
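
Because --file-path accepts globbing, keep in mind that an unquoted * may be expanded by your local shell against local files before croud ever sees it. The sketch below shows the quoting in a dry-run form that prints the command line instead of running it. The helper name s3_import_args is hypothetical, and the file format is fixed to csv purely for illustration.

```shell
# Hypothetical dry-run helper: print the from-s3 command line, one argument
# per line, instead of running it.
# Arguments: CLUSTER_ID SECRET_ID BUCKET PATH_PATTERN TABLE
s3_import_args() {
    cluster_id=$1 secret_id=$2 bucket=$3 pattern=$4 table=$5
    printf '%s\n' croud clusters import-jobs create from-s3 \
        --cluster-id "$cluster_id" --secret-id "$secret_id" \
        --bucket "$bucket" --file-path "$pattern" \
        --file-format csv --table "$table"
}

# Single quotes keep the local shell from expanding the * itself, so the
# pattern reaches croud unchanged.
s3_import_args e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    71e7c5da-51fa-44f2-b178-d95052cbe620 cratedbtestbucket \
    'myfiles/*.csv' my_table_name
```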

clusters import-jobs create from-azure-blob-storage

Note

This command will wait for the operation to finish or fail.

Create a data import job on the specified cluster from an Azure blob storage location.

Usage: croud clusters import-jobs create from-azure-blob-storage
       [-h] --container-name CONTAINER_NAME --blob-name BLOB_NAME --secret-id
       SECRET_ID --cluster-id CLUSTER_ID --table TABLE
       [--create-table {True,False}] --file-format {csv,json,parquet}
       [--compression {gzip,none}] [--transformations TRANSFORMATIONS]
       [--region REGION] [--output-fmt {table,wide,json,yaml}] [--sudo]
Required Arguments
--container-name

The name of the storage container where the file to be imported is located.

--blob-name

The absolute path in the storage container that points to the file to be imported. Globbing (use of *) is allowed.

--secret-id

The secret that contains the access key and secret key needed to access the file to be imported.

--cluster-id

The cluster the data will be imported into.

--table

The table the data will be imported into.

--file-format

Possible choices: csv, json, parquet

The format of the structured data in the file.

Optional Arguments
--create-table

Possible choices: True, False

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--compression

Possible choices: gzip, none

The compression method the file uses.

--transformations

The transformations to apply when fetching data. This is the SELECT statement of an SQL query that is executed on the loaded data before it is inserted into CrateDB. It can be used to apply arbitrary SQL functions to your data, e.g. UNNEST(), SUM(), and similar.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example
sh$ croud clusters import-jobs create from-azure-blob-storage --secret-id c1ff327d-ee9f-4903-b7f2-8ea9c5be898c \
   --container-name my-container --blob-name "my/blob.csv" --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
   --file-format csv --table my_table
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| f1f9906b-d038-466b-ac48-e6a041d1076a | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
==> Info: Status: SENT (Your creation request was sent to the region.)
==> Info: Done importing 14.73K records
==> Success: Operation completed.

clusters import-jobs create from-dynamodb

Note

With IMPORT_ONLY, this command waits for the operation to finish or fail. When --ingestion-type is set to CDC_ONLY or IMPORT_AND_CDC, the command does not finish: even after the last CDC event has been processed, it keeps waiting for new CDC events.

Create a data import job on the specified cluster from an Amazon DynamoDB-compatible location.

Usage: croud clusters import-jobs create from-dynamodb [-h] --ingestion-type
                                                       {IMPORT_ONLY,IMPORT_AND_CDC,CDC_ONLY}
                                                       --aws-region AWS_REGION
                                                       --dynamodb-table
                                                       DYNAMODB_TABLE
                                                       [--kinesis-stream-name KINESIS_STREAM_NAME]
                                                       --secret-id SECRET_ID
                                                       [--endpoint ENDPOINT]
                                                       --cluster-id CLUSTER_ID
                                                       --table TABLE
                                                       [--create-table {True,False}]
                                                       [--region REGION]
                                                       [--output-fmt {table,wide,json,yaml}]
                                                       [--sudo]
Required Arguments
--ingestion-type

Possible choices: IMPORT_ONLY, IMPORT_AND_CDC, CDC_ONLY

Determines how to ingest the data. IMPORT_ONLY will just ingest the data and finish. CDC_ONLY will continuously read CDC (Change Data Capture) events. IMPORT_AND_CDC will first import the data and then start listening for CDC events.

--aws-region

The name of the AWS region where the DynamoDB table is located.

--dynamodb-table

The name of the DynamoDB table.

--secret-id

The secret that contains the access key and secret key needed to access the table to be imported.

--cluster-id

The cluster the data will be imported into.

--table

The table the data will be imported into.

Optional Arguments
--kinesis-stream-name

The name of the Kinesis Stream that will be used to read CDC events from. Only for CDC mode.

--endpoint

An AWS DynamoDB-compatible endpoint.

--create-table

Possible choices: True, False

Whether the table should be created automatically if it does not exist. If true, new columns will also be added when the data requires them.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example
sh$ croud clusters import-jobs create from-dynamodb --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
    --secret-id 71e7c5da-51fa-44f2-b178-d95052cbe620 --aws-region eu-west-1 \
    --dynamodb-table my_dynamodb_table_name --kinesis-stream-name my_kinesis_stream_name \
    --table my_table_name --ingestion-type IMPORT_AND_CDC
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| f29fdc02-edd0-4ad9-8839-9616fccf752b | e1e38d92-a650-48f1-8a70-8133f2d5c400 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: REGISTERED (Your import job was received and is pending processing.)
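
Whether the command returns depends on --ingestion-type, which matters when it is called from a script. The sketch below encodes the behavior described in the note at the top of this section. The helper name dynamodb_job_returns is hypothetical, and the advice printed at the end is only a suggestion.

```shell
# Hypothetical helper: report whether `create from-dynamodb` is expected to
# return on its own for a given --ingestion-type.
# Exit status: 0 = returns, 1 = keeps running, 2 = unknown type.
dynamodb_job_returns() {
    case "$1" in
        IMPORT_ONLY)             return 0 ;;  # waits, then finishes or fails
        IMPORT_AND_CDC|CDC_ONLY) return 1 ;;  # keeps waiting for CDC events
        *) echo "unknown ingestion type: $1" >&2; return 2 ;;
    esac
}

# Usage: decide up front whether a script can wait on the command.
if dynamodb_job_returns IMPORT_AND_CDC; then
    echo "safe to run inline"
else
    echo "run in a separate terminal or background job"
fi
```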

clusters import-jobs list

List all import jobs for a cluster.

Usage: croud clusters import-jobs list [-h] --cluster-id CLUSTER_ID
                                       [--region REGION]
                                       [--output-fmt {table,wide,json,yaml}]
                                       [--sudo]

Required Arguments

--cluster-id

The cluster the import jobs belong to.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters import-jobs list --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400
+--------------------------------------+--------------------------------------+-----------+--------+-------------------+
| id                                   | cluster_id                           | status    | type   | destination       |
|--------------------------------------+--------------------------------------+-----------+--------+-------------------|
| dca4986d-f7c8-4121-af81-863cca1dab0f | e1e38d92-a650-48f1-8a70-8133f2d5c400 | SUCCEEDED | url    | my_table_name     |
| 00de6048-3af6-41da-bfaa-661199d1c106 | e1e38d92-a650-48f1-8a70-8133f2d5c400 | SUCCEEDED | s3     | my_table_name     |
| 035f5ec1-ba9e-4a5c-9ce1-44e9a9cab6c1 | e1e38d92-a650-48f1-8a70-8133f2d5c400 | SUCCEEDED | file   | my_table_name     |
+--------------------------------------+--------------------------------------+-----------+--------+-------------------+
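
To act on the listed jobs from a script, the job IDs can be filtered out of the table. The sketch below is one way to do it, assuming the table layout matches the example above (id, cluster_id and status as the first three columns). The helper name job_ids_with_status is hypothetical. The json output format is likely a sturdier basis for scripting, but its structure is not shown here, so the sketch sticks to the table.

```shell
# Hypothetical helper: read `import-jobs list` table output on stdin and
# print the IDs of jobs whose status column equals the first argument.
# Assumption: the column order matches the example (id, cluster_id, status).
job_ids_with_status() {
    awk -F'|' -v want="$1" '
        NF >= 5 {
            id = $2; status = $4
            gsub(/^ +| +$/, "", id)        # trim padding around the cell
            gsub(/^ +| +$/, "", status)
            if (status == want) print id
        }'
}

# Usage (CLUSTER_ID is a placeholder variable):
#   croud clusters import-jobs list --cluster-id "$CLUSTER_ID" \
#       | job_ids_with_status SUCCEEDED
```

Border and separator rows contain fewer than five |-separated fields, so they are skipped, and the header row is ignored because its status cell reads "status".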

clusters import-jobs delete

Delete a data import job from the job history if it has already finished. Otherwise, cancel the running import job.

Usage: croud clusters import-jobs delete [-h] --cluster-id CLUSTER_ID
                                         --import-job-id IMPORT_JOB_ID
                                         [--region REGION]
                                         [--output-fmt {table,wide,json,yaml}]
                                         [--sudo]

Required Arguments

--cluster-id

The cluster the import job belongs to.

--import-job-id

The ID of the import job.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters import-jobs delete \
      --cluster-id e1e38d92-a650-48f1-8a70-8133f2d5c400 \
      --import-job-id 00de6048-3af6-41da-bfaa-661199d1c106
==> Success: Success.

clusters export-jobs

Manage data export from your CrateDB cluster into a file.

Usage: croud clusters export-jobs [-h] {delete,list,create} ...

clusters export-jobs create

Note

This command will wait for the operation to finish or fail.

It is only available to organization admins.

Export data from a CrateDB cluster to a file. Once the export job has completed, the exported data can be downloaded from a URL or saved to your local filesystem.

Usage: croud clusters export-jobs create [-h] --cluster-id CLUSTER_ID --table
                                         TABLE --file-format
                                         {csv,json,parquet}
                                         [--compression {gzip,none}]
                                         [--save-as SAVE_AS] [--region REGION]
                                         [--output-fmt {table,wide,json,yaml}]
                                         [--sudo]

Required Arguments

--cluster-id

The cluster the data will be exported from.

--table

The table the data will be exported from.

--file-format

Possible choices: csv, json, parquet

The format of the data in the file.

Optional Arguments

--compression

Possible choices: gzip, none

The compression method of the exported file.

--save-as

The file on your local filesystem the data will be exported to. If not specified, you will receive the URL to download the file.

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters export-jobs create --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209 \
      --table nyc_taxi --file-format csv
+--------------------------------------+--------------------------------------+------------+
| id                                   | cluster_id                           | status     |
|--------------------------------------+--------------------------------------+------------|
| 85dc0024-b049-4b9d-b100-4bf850881692 | f6c39580-5719-431d-a508-0cee4f9e8209 | REGISTERED |
+--------------------------------------+--------------------------------------+------------+
==> Info: Status: SENT (Your creation request was sent to the region.)
==> Info: Status: IN_PROGRESS (Export in progress)
==> Info: Exporting... 2.00 K records and 19.53 KiB exported so far.
==> Info: Exporting... 4.00 K records and 39.06 KiB exported so far.
==> Info: Done exporting 6.00 K records and 58.59 KiB.
==> Success: Download URL: https://cratedb-file-uploads.s3.amazonaws.com/some/download
==> Success: Operation completed.
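
When --save-as is not given, the download URL has to be picked out of the command output. The sketch below shows one way to do so, assuming the URL is reported on a "Download URL:" line exactly as in the example above. The helper name export_download_url is hypothetical.

```shell
# Hypothetical helper: pull the download URL out of captured
# `export-jobs create` output read from stdin.
# Assumption: the URL is reported on a "Download URL:" line as in the example.
export_download_url() {
    sed -n 's/^.*Download URL: *//p' | head -n 1
}

# Usage (CLUSTER_ID is a placeholder variable; 2>&1 is included because it
# is not stated here which stream carries the status lines):
#   url=$(croud clusters export-jobs create --cluster-id "$CLUSTER_ID" \
#       --table nyc_taxi --file-format csv 2>&1 | export_download_url)
```

If the file is wanted locally anyway, passing --save-as is the simpler route, since it avoids parsing output altogether.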

clusters export-jobs list

List all export jobs for a cluster.

Usage: croud clusters export-jobs list [-h] --cluster-id CLUSTER_ID
                                       [--region REGION]
                                       [--output-fmt {table,wide,json,yaml}]
                                       [--sudo]

Required Arguments

--cluster-id

The cluster the export jobs belong to.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters export-jobs list \
      --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209
+--------------------------------------+--------------------------------------+-----------+---------------------+-----------------------------------------------+
| id                                   | cluster_id                           | status    | source              | destination                                   |
|--------------------------------------+--------------------------------------+-----------+---------------------+-----------------------------------------------|
| b311ba9d-9cb4-404a-b58d-c442ae251dbf | f6c39580-5719-431d-a508-0cee4f9e8209 | SUCCEEDED | nyc_taxi            | Format: csv                                   |
|                                      |                                      |           |                     | File ID: 327ad0e6-607f-4f99-a4cc-c1e98bf28e4d |
+--------------------------------------+--------------------------------------+-----------+---------------------+-----------------------------------------------+

clusters export-jobs delete

Delete a data export job from the job history if it has already finished. Otherwise, cancel the running export job.

Usage: croud clusters export-jobs delete [-h] --cluster-id CLUSTER_ID
                                         --export-job-id EXPORT_JOB_ID
                                         [--region REGION]
                                         [--output-fmt {table,wide,json,yaml}]
                                         [--sudo]

Required Arguments

--cluster-id

The cluster the job belongs to.

--export-job-id

The ID of the export job.

Optional Arguments

--region, -r

Temporarily use the specified region that the command will be run in.

--output-fmt, --format, -o

Possible choices: table, wide, json, yaml

Change the formatting of the output.

--sudo

Run the given command as superuser.

Default: False

Example

sh$ croud clusters export-jobs delete \
      --cluster-id f6c39580-5719-431d-a508-0cee4f9e8209 \
      --export-job-id b311ba9d-9cb4-404a-b58d-c442ae251dbf
==> Success: Success.
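
A mistyped job ID is easy to miss in a long command line, so a script may want to check its shape before calling delete. The sketch below assumes that job IDs always look like the UUIDs used in the examples here (8-4-4-4-12 hexadecimal digits), which this reference does not state explicitly. The helper name looks_like_job_id is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument is shaped like a UUID
# (8-4-4-4-12 hexadecimal digits).
# Assumption: job IDs always have this shape, as in the examples here.
looks_like_job_id() {
    printf '%s\n' "$1" \
        | grep -Eqx '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
}

# Usage (CLUSTER_ID and JOB_ID are placeholder variables):
#   if looks_like_job_id "$JOB_ID"; then
#       croud clusters export-jobs delete --cluster-id "$CLUSTER_ID" \
#           --export-job-id "$JOB_ID"
#   fi
```

The check only catches malformed IDs; it cannot tell whether a well-formed ID belongs to an existing job.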