CrateDB Cloud has significantly enhanced its capabilities this year, particularly with the introduction of its versatile import functionality. This feature allows users to import data from various sources directly into their CrateDB Cloud clusters with a few clicks. Here's a summary of the advancements we have delivered so far:
The import system now supports a wide range of data sources and formats, which streamlines data integration for CrateDB Cloud users.
CrateDB Cloud simplifies data integration with its automatic schema inference, reducing setup time by intelligently mapping imported data to corresponding CrateDB types. It defaults to OBJECT(IGNORED) for object types to handle heterogeneous schemas, but also allows manual table creation with specific type mappings for homogeneous data. This feature balances automation with customization, streamlining data preparation for users.
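To make the idea of schema inference more concrete, here is a minimal Python sketch of how sample values could be mapped to CrateDB column types. This is an illustration only and not CrateDB Cloud's actual inference logic: the function names (infer_cratedb_type, infer_schema, create_table_sql) and the sample record are hypothetical, and only the type names (TEXT, BIGINT, DOUBLE PRECISION, BOOLEAN, OBJECT(IGNORED)) come from CrateDB itself.

```python
from typing import Any, Dict


def infer_cratedb_type(value: Any) -> str:
    """Map one sample Python value to a CrateDB type name (illustrative only)."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE PRECISION"
    if isinstance(value, str):
        return "TEXT"
    if isinstance(value, dict):
        # Nested objects default to OBJECT(IGNORED), as described above,
        # so that records with differing sub-fields can share one column.
        return "OBJECT(IGNORED)"
    raise TypeError(f"no mapping assumed for {type(value).__name__}")


def infer_schema(record: Dict[str, Any]) -> Dict[str, str]:
    """Infer a column-name -> type mapping from a single sample record."""
    return {name: infer_cratedb_type(value) for name, value in record.items()}


def create_table_sql(table: str, schema: Dict[str, str]) -> str:
    """Render a CREATE TABLE statement for manual table creation."""
    columns = ", ".join(f'"{name}" {ctype}' for name, ctype in schema.items())
    return f'CREATE TABLE "{table}" ({columns})'


# Hypothetical sample record, standing in for one imported row.
sample = {
    "event_name": "ConsoleLogin",
    "retries": 3,
    "latency": 0.25,
    "success": True,
    "details": {"source_ip": "203.0.113.7"},
}

schema = infer_schema(sample)
print(create_table_sql("events", schema))
```

For homogeneous data, the generated statement could be edited by hand before the import, for example replacing OBJECT(IGNORED) on the details column with a more specific object definition.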
Security is paramount, especially in data import processes. Sensitive information such as access keys, connection strings, or SAS tokens is handled with the highest security standards.
CrateDB Cloud also offers robust export features, supporting the same formats as imports (JSON, JSON-Lines, CSV, Parquet). Although currently limited to 1 GiB in size, these exports can be downloaded and used for various purposes.
CrateDB Cloud now supports importing multiple files at once, a feature that greatly enhances the efficiency of handling large data sets.
Glob Patterns for Efficiency: Users can now use glob patterns to import multiple files simultaneously from S3-compatible storage and Azure Storage containers. This feature is particularly useful for services that generate numerous files with common naming conventions, like AWS CloudTrail or Segment.
For example, when handling CloudTrail logs, a glob pattern such as CloudTrail/us-east-1/2023/11/12/*.json.gz imports all log files for a given day: the /2023/11/12/ portion of the pattern restricts the import to files from that specific day. Similarly, a pattern ending in /2023/11/* imports files from the entire month. In this way, a single glob pattern can import the logs of a specific month or even an entire year.
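The effect of the day and month patterns above can be previewed locally with a short Python sketch. It uses fnmatchcase from the standard library, in which * also matches across / separators; the matching rules of the import service itself may differ, so treat this as an approximation. The object keys and the select_keys helper are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def select_keys(keys: List[str], pattern: str) -> List[str]:
    """Return the object keys that match a glob pattern (case-sensitive)."""
    return [key for key in keys if fnmatchcase(key, pattern)]


# Hypothetical object keys, named like CloudTrail log files in a bucket.
keys = [
    "CloudTrail/us-east-1/2023/11/12/log-0001.json.gz",
    "CloudTrail/us-east-1/2023/11/12/log-0002.json.gz",
    "CloudTrail/us-east-1/2023/11/13/log-0003.json.gz",
    "CloudTrail/us-east-1/2023/12/01/log-0004.json.gz",
]

# All log files for a single day.
one_day = select_keys(keys, "CloudTrail/us-east-1/2023/11/12/*.json.gz")

# All log files for the whole month (here * also spans the day directories).
one_month = select_keys(keys, "CloudTrail/us-east-1/2023/11/*")

print(len(one_day), "files for the day;", len(one_month), "files for the month")
```

Checking a pattern against a listing of the bucket in this way is a cheap way to confirm which files it would select before starting an import.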
Import CloudTrail Logs from S3 bucket
Explore CrateDB Cloud and its import functionality by visiting https://console.cratedb.cloud to start your free CrateDB cluster today. For full details of the import/export system, visit our documentation.