Hey folks! CrateDB 3.1 (stable) has been released and is now faster and easier to use than ever.
DOWNLOAD IT NOW, while it's hot.
The complete list of changes can be found in the release notes. In this post, I'll give you a quick tour of the highlights.
Faster Query Performance
Performance enhancements will mainly benefit applications that use GROUP BY clauses and those that access arrays.
GROUP BY in combination with aggregations is a commonly seen pattern in CrateDB use cases. Arrays have been neglected a little bit in the past and were in need of some engineering love.
Various changes to memory utilization and Lucene query execution in CrateDB contributed to performance improvements for some types of query. I have listed them below, along with the results of some basic benchmarks to give you an idea of the performance improvement in 3.1. As with any sort of benchmark, your mileage may vary...
Accessing array elements:
SELECT a FROM array_access WHERE a = 101
This ran 200x faster in our tests.
SELECT "cCode", count(*) FROM uservisits GROUP BY "cCode"
This ran 6.7x faster in our tests.
SELECT avg("adRevenue") FROM uservisits GROUP BY "cCode"
This ran 2.3x faster in our tests.
SELECT count(*) FROM (SELECT DISTINCT x FROM t) AS t -- x is of type long
This ran 3x faster in our tests.
WHERE NOT x = ANY queries
We introduced a new scalar function, ignore3vl(), which eliminates the 3-valued logic overhead of null handling. If your WHERE NOT x = ANY query logic does not require null handling, this can yield faster query results.
SELECT count(*) FROM t WHERE NOT 20 = any(a)
SELECT count(*) FROM t WHERE NOT ignore3vl(20 = any(a))
The query with ignore3vl() ran 3.8x faster than the one without it in our tests.
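To make the null-handling trade-off concrete, here is a small Python model of the two predicates above. Treat it as a sketch rather than a description of CrateDB internals: None stands in for SQL NULL, the helper names (eq_any, not3, count_where) are made up for illustration, and it assumes that ignore3vl() maps NULL to false, which is worth checking against the CrateDB documentation.

```python
from typing import Callable, List, Optional, Sequence

Array = Sequence[Optional[int]]


def eq_any(value: int, array: Array) -> Optional[bool]:
    """Model `value = ANY(array)` under SQL three-valued logic."""
    if any(item == value for item in array if item is not None):
        return True
    if any(item is None for item in array):
        return None  # a NULL element makes the comparison "unknown"
    return False


def not3(b: Optional[bool]) -> Optional[bool]:
    """Three-valued NOT: NOT NULL is still NULL."""
    return None if b is None else not b


def ignore3vl(b: Optional[bool]) -> bool:
    """Assumed behaviour: collapse NULL to FALSE, leaving plain booleans."""
    return bool(b)


def count_where(rows: List[Array], predicate: Callable[[Array], Optional[bool]]) -> int:
    """WHERE keeps a row only when the predicate is exactly TRUE."""
    return sum(1 for a in rows if predicate(a) is True)


rows = [[20, 30], [1, 2], [1, None]]

# SELECT count(*) FROM t WHERE NOT 20 = any(a)
plain = count_where(rows, lambda a: not3(eq_any(20, a)))

# SELECT count(*) FROM t WHERE NOT ignore3vl(20 = any(a))
fast = count_where(rows, lambda a: not3(ignore3vl(eq_any(20, a))))

print(plain, fast)  # prints: 1 2
```

In this model the two counts differ only because of the row whose array contains a null. If your arrays never contain nulls, both queries return the same result, and that is the case where ignore3vl() is safe to use.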
Broader PostgreSQL Wire Protocol Compatibility
CrateDB has supported the PostgreSQL wire protocol since the CrateDB 1.0 release in 2016. In version 3.1, we made a few enhancements that increase CrateDB compatibility with PostgreSQL drivers (especially the Go driver):
CrateDB does not support SQL transactions, but it now returns the expected responses.
Timestamp columns are now encoded using int64, which increases compatibility with different Postgres clients processing time series and other timestamp data.
Multi-query support in Simple Query Mode.
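To give a feel for what an int64 timestamp looks like on the wire, the sketch below packs a timestamp the way PostgreSQL's binary format does: a big-endian signed 64-bit count of microseconds since 2000-01-01 UTC. That CrateDB 3.1 uses exactly this layout is an assumption on my part, and the function names are purely illustrative.

```python
import struct
from datetime import datetime, timedelta, timezone

# PostgreSQL counts binary timestamps from this epoch, not from 1970.
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def encode_timestamp(ts: datetime) -> bytes:
    """Pack a timezone-aware datetime as a big-endian int64 of microseconds."""
    micros = (ts - PG_EPOCH) // timedelta(microseconds=1)
    return struct.pack("!q", micros)


def decode_timestamp(payload: bytes) -> datetime:
    """Reverse of encode_timestamp: 8 bytes back to a UTC datetime."""
    (micros,) = struct.unpack("!q", payload)
    return PG_EPOCH + timedelta(microseconds=micros)


sample = datetime(2018, 10, 1, 12, 0, 0, tzinfo=timezone.utc)
wire = encode_timestamp(sample)
roundtrip = decode_timestamp(wire)
```

An integer encoding like this round-trips exactly, so a client that reads the 8 bytes as an int64 gets back the same instant with no floating-point rounding.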
New System Metrics Monitoring Capabilities
Thread pool queues
Cluster state version (rapid change may indicate issues)
Circuit breaker statistics
Ease of Use Improvements
New administrative features that make CrateDB easier to use:
Multi-line comments in SQL are now supported.
EXPLAIN ANALYZE is now supported and reports the timing of the different phases of a query’s execution plan, including both CrateDB and Lucene phases. This can be used to optimize the performance of queries and gain a better understanding of the underlying structure of CrateDB.
Deprecating the Elasticsearch API
Because our codebase is diverging from Elasticsearch, and more and more features are being moved to the CrateDB execution layer and then exposed via SQL, we have decided to deprecate the Elasticsearch API. This means that support for the Elasticsearch API may be dropped entirely in a future release.
If you are using the Elasticsearch API at the moment, please let us know on GitHub.
Want a CrateDB T-Shirt?
Do you have CrateDB feature requests or product feedback? We'd love to hear it! And if you fill out this online survey, we'll send you a free CrateDB t-shirt.