To help you get up to speed with CrateDB and sharpen your data engineering skills, we compiled a list of the best CrateDB tutorials for beginners.
CrateDB is a distributed SQL database. It lets you store many types of data in a single system, including relational, time-series, blob, object, and geospatial data. With CrateDB, you can run fast queries with full-text search, aggregations, and JOINs, while its distributed architecture provides high availability and horizontal scalability.
As a first-time user, you might wonder where to start to make the best use of these features and more. All of the tutorials are free to access and feature guidance from our best engineers.
For better focus and understanding, we divided the tutorials into three categories:
- Setting up CrateDB: tutorials covering CrateDB installation, data import, and basic sharding and partitioning strategies
- CrateDB queries: tutorials covering common queries in CrateDB, including object manipulations and user-defined functions
- Monitoring CrateDB cluster: tutorials on monitoring CrateDB clusters with well-known OSS solutions
Setting up CrateDB
Fundamentals: Getting started with CrateDB
With this tutorial, you will learn how to start CrateDB for the first time on your local machine. It introduces the Admin UI and shows how to upload your first dataset to CrateDB and run some simple queries. After following all the steps in this video tutorial, you will be prepared for more complex operations in CrateDB.
Watch it now: Fundamentals: Getting Started with CrateDB
Fundamentals: Importing and Exporting Data in CrateDB
This tutorial presents the basics of the COPY FROM and COPY TO statements in CrateDB. It demonstrates how to import JSON and CSV data from the local file system into a CrateDB instance running in a local Docker container.
Watch it now: Fundamentals: Importing and Exporting Data in CrateDB
COPY FROM statement: things you need to know
This tutorial illustrates how to use the COPY FROM statement effectively in CrateDB and shows several options you should know to avoid common mistakes. We cover topics such as the RETURN SUMMARY clause for error reporting, the import of TSV and compressed files, and the import of CSV files with different delimiters.
Read it now: COPY FROM statement: things you need to know
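To make the options mentioned above more concrete, here is a minimal Python sketch that only assembles the text of a COPY FROM statement; it does not connect to CrateDB or execute anything. The helper `build_copy_from`, the table `doc.sensor_readings`, and the file path are hypothetical names chosen for illustration, and the sketch quotes every option value as a string without escaping, so treat it as a starting point rather than a production helper.

```python
def build_copy_from(table, uri, options=None, return_summary=False):
    """Assemble a COPY FROM statement as text (nothing is executed)."""
    sql = f"COPY {table} FROM '{uri}'"
    if options:
        # Render each option as key = 'value'; values are quoted as plain strings.
        rendered = ", ".join(f"{key} = '{value}'" for key, value in options.items())
        sql += f" WITH ({rendered})"
    if return_summary:
        # RETURN SUMMARY asks for a per-file report instead of only a row count.
        sql += " RETURN SUMMARY"
    return sql


# Hypothetical table and file: a gzip-compressed, semicolon-delimited CSV file.
statement = build_copy_from(
    "doc.sensor_readings",
    "file:///tmp/readings.csv.gz",
    options={"format": "csv", "delimiter": ";", "compression": "gzip"},
    return_summary=True,
)
print(statement)
```

You could paste the printed statement into the Admin UI console or pass it to a client of your choice, after adjusting the table name and file location to your setup.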
Sharding and Partitioning Guide for Time Series Data
This guide helps you build a sharding and partitioning strategy for your data. You will learn how to partition the data, how many shards to configure, and how many replicas to use depending on your cluster size, retention periods, and the amount of data.
Read it now: Sharding and Partitioning Guide for Time Series Data
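As a rough illustration of how these settings interact, the following Python sketch counts the shards a partitioned table ends up with. It relies only on the fact that each partition has its own set of primary shards and that each replica adds one more copy of every shard. The function names and all numbers are assumptions made up for this example, not recommendations from the guide.

```python
def total_shards(partitions, shards_per_partition, replicas):
    """Each primary shard exists once, plus once per configured replica."""
    return partitions * shards_per_partition * (1 + replicas)


def shards_per_node(total, nodes):
    """Average number of shards each node holds if they are spread evenly."""
    return total / nodes


# Illustrative setup: monthly partitions retained for 12 months,
# 6 shards per partition, 1 replica, on a 3-node cluster.
total = total_shards(partitions=12, shards_per_partition=6, replicas=1)
print(total)                            # 144
print(shards_per_node(total, nodes=3))  # 48.0
```

Running the numbers like this before creating a table makes it easier to see how a longer retention period or a finer partition interval multiplies the shard count.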
CrateDB queries
Querying time series data with CrateDB Cloud: SQL examples
In this tutorial, you will discover interesting SQL queries that you can run on a public dataset in CrateDB. We run our examples on a CrateDB Cloud free trial instance, but you can run them on your local CrateDB instance too. Still, we invite you to try out CrateDB Cloud and spin up your first cluster in less than 2 minutes.
Read it now: Querying time series data with CrateDB Cloud: SQL examples
Fundamentals: Getting started with CrateDB objects
Objects are one of the most important concepts in CrateDB. Follow this tutorial to discover how objects in CrateDB can add clarity to your data model. Furthermore, you will get an overview of the different column policies for objects in CrateDB and how these policies affect inserting new records into your table.
Watch it now: Fundamentals: Getting Started with CrateDB Objects
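The effect of column policies on inserts can be pictured with a small toy model in Python. It is a simplified sketch, not CrateDB code: it assumes the three object policies STRICT, DYNAMIC, and IGNORED, and models only whether a record with an unknown sub-column is accepted and whether that sub-column becomes part of the schema. The function `insert_object` and the sample keys are hypothetical.

```python
def insert_object(known_keys, policy, record):
    """Return the set of known sub-columns after inserting `record`.

    Toy model of object column policies; it does not talk to CrateDB.
    """
    unknown = set(record) - set(known_keys)
    if policy == "strict":
        # Records with sub-columns outside the schema are rejected.
        if unknown:
            raise ValueError(f"unknown sub-columns rejected: {sorted(unknown)}")
        return set(known_keys)
    if policy == "dynamic":
        # New sub-columns are accepted and added to the schema.
        return set(known_keys) | unknown
    if policy == "ignored":
        # New sub-columns are accepted, but the schema stays unchanged.
        return set(known_keys)
    raise ValueError(f"unsupported policy: {policy}")


schema = {"name", "model"}
record = {"name": "sensor-1", "firmware": "1.2"}
print(sorted(insert_object(schema, "dynamic", record)))  # ['firmware', 'model', 'name']
print(sorted(insert_object(schema, "ignored", record)))  # ['model', 'name']
```

With the strict policy, the same call raises an error, which mirrors how a strict object column refuses records that do not match its definition.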
A collection of useful User-defined functions (UDFs)
User-defined functions in CrateDB allow you to extend the built-in functionality with your own custom logic. In this short tutorial, we show several examples of UDFs that you can use as they are or as a starting point for writing your own.
Read it now: A collection of useful User-Defined Functions (UDFs)
Monitoring CrateDB cluster
Monitoring an on-premises CrateDB cluster with Prometheus and Grafana
To track live and historical performance and capacity metrics in CrateDB, see how we implemented a Grafana dashboard backed by Prometheus. We use Prometheus, a well-known OSS solution, to collect and store both CrateDB and OS-specific metrics such as the number of queries per second, the average duration of queries, query error rates, and memory usage.
Read it now: Monitoring an on-premises CrateDB cluster with Prometheus and Grafana
Wrap up
This post summarizes some of the most important tutorials that can help you start and master CrateDB in a relatively short time.
This is already a long list, but if you want to learn more, check our documentation and community site, where we regularly publish tutorials, inform you about upcoming events, and answer users' questions.