When we started building CrateDB, our main motivation was to enable SQL at scale. Now, seven years later, CrateDB is the core technology behind many IIoT use cases, demonstrating the advantages of SQL for machine-generated data.
Having SQL as the interface to CrateDB lets users store not only machine-generated data, which typically comes in the form of time series, but also metadata (which could even be full-text and geospatial data) within the same database or even the same table. We are pushing the limits of traditional time-series databases with the power of SQL.
We will continue to make CrateDB as SQL-complete as possible. Of course, we will keep evaluating the tradeoffs of every step we take in our development, making sure that we never compromise the efficiency and scalability of our database.
What’s in the release?
Speed increase for analytical use cases
Although CrateDB already covers more use cases than many dedicated time-series databases, we are constantly working to improve the experience of our users in this area. With the 4.3 release, CrateDB performs aggregations up to 70% faster in some scenarios. This applies to both global and grouped aggregates. The lower the cardinality of the stored data, the greater the performance gain.
SQL Standard and PostgreSQL compatibility to make CrateDB work with more tools in the IIoT field
As we said before, SQL is the interface to CrateDB: every functionality we expose is exposed through SQL. A couple of years ago, we decided to also support PostgreSQL, meaning that when we add something to our SQL dialect, we implement it in a way that makes it compatible with the PostgreSQL dialect as well.
More and more users expect CrateDB to be 100% PostgreSQL-compatible. This is not the case yet, and CrateDB does not work with every PostgreSQL-compatible tool out of the box. It does, however, work with many of them through their native PostgreSQL connectivity. In the 4.3 release, we added quoted subscript expressions, the translate function, and more functions useful for working with tools like PowerBI and Qlik. In the future, as we continue to increase compatibility, we will be more specific about which tools CrateDB supports.
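To give a feel for what a function like translate does, here is a minimal pure-Python model of the PostgreSQL semantics: each character of the input that appears in the second argument is replaced by the character at the same position in the third, and characters without a counterpart are removed. The helper name pg_translate is hypothetical, and the sketch assumes that CrateDB's translate follows the PostgreSQL behavior; the 4.3 release notes have the authoritative definition.

```python
def pg_translate(text: str, from_chars: str, to_chars: str) -> str:
    """Model of PostgreSQL-style translate(text, from, to).

    Hypothetical helper for illustration only; it is not part of CrateDB
    or of any client library.
    """
    table = {}
    for index, char in enumerate(from_chars):
        code = ord(char)
        if code in table:
            # If a character is listed twice, the first occurrence wins.
            continue
        # Characters beyond the end of to_chars map to None, i.e. are deleted.
        table[code] = to_chars[index] if index < len(to_chars) else None
    return text.translate(table)


# '1' becomes 'a', '4' becomes 'x', and '3' has no counterpart, so it is removed.
print(pg_translate("12345", "143", "ax"))
```

In SQL, the equivalent call would be translate('12345', '143', 'ax'), which is the example used in the PostgreSQL documentation.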
Usability for engineers
When developing a product, sometimes the focus is on moving fast and other times it is on improving things based on user feedback. To do the latter, we recently introduced our Customer Advisory Board and conducted a first set of in-depth, use-case-driven technical interviews. We gathered a lot of relevant feedback. One thing that came up often was that CrateDB acted like a black box in some cases. To address this, we improved error handling in areas such as GROUP BY, repository creation, and PostgreSQL compatibility. Of course, we are not done here yet.
Usability in administration
CrateDB offers a lot of freedom to administrators and engineers, but this can lead users to choose settings that don't improve performance. That’s why we introduced a “maximum shards per node” setting that can be changed at runtime. Because the limit has to be changed consciously, users will hopefully be more aware of the number of shards they’re operating with, which can help avoid trouble.
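As a rough sketch of how such a runtime change could be scripted, the following Python helper builds the SQL statement for adjusting the limit. The setting name cluster.max_shards_per_node and the helper name max_shards_statement are assumptions that do not come from this post, so please verify the setting name against the 4.3 release notes before using it. The helper only builds a string and does not connect to a cluster.

```python
def max_shards_statement(limit: int, persistent: bool = True) -> str:
    """Build a SET GLOBAL statement for the shards-per-node limit.

    Assumption: the setting is called "cluster.max_shards_per_node".
    The function name is hypothetical and used for illustration only.
    """
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # PERSISTENT survives a full cluster restart, TRANSIENT does not.
    scope = "PERSISTENT" if persistent else "TRANSIENT"
    return f'SET GLOBAL {scope} "cluster.max_shards_per_node" = {limit}'


print(max_shards_statement(1500))
```

The resulting statement can be run through any SQL client connected to the cluster.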
Moreover, users can now kill their own queries, even when they are not the “crate” superuser.
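A typical workflow might look like the sketch below: look up the running job in the sys.jobs table, then issue a KILL statement with its id. The column selection and the helper name kill_statement are illustrative assumptions, and the helper only builds the statement text; validating the id as a UUID first avoids pasting arbitrary text into the statement.

```python
import uuid

# Query to list running jobs; the selected columns are an assumption
# about what is most useful here.
LIST_JOBS = "SELECT id, stmt, started FROM sys.jobs"


def kill_statement(job_id: str) -> str:
    """Build a KILL statement for a single job id.

    Hypothetical helper for illustration. Raises ValueError if job_id
    is not a well-formed UUID.
    """
    parsed = uuid.UUID(job_id)
    return f"KILL '{parsed}'"


print(LIST_JOBS)
print(kill_statement("175011ce-9bbc-45f2-a86a-5b7f993a93a6"))
```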
Farewell to the Twitter tutorial
Since March 2014, the tweet importer had been an essential part of the CrateDB Admin UI. With many CrateDB clusters in operation on different versions, and given the frequent changes in the Twitter API, the Twitter gateway became more and more difficult to maintain. That’s why we decided to remove it in the 4.3 release. We will implement an alternative way of generating data for demo purposes soon. If you have any ideas on what you’d like to see there, tell us! (Ironically, the best way to reach us is through Twitter.)
For a complete overview of the changes, including breaking ones, check out the 4.3 release notes.