CrateDB is a SQL standard-compatible distributed database. This means each query executed on CrateDB goes through a query optimizer, which tries to find the best logical plan possible for a given query. In some situations, the query optimizer cannot generate the optimal execution plan for a given query. In cases when the performance of a query suffers from an inefficient plan, it is helpful to have more control over the query execution. In this post, we will show how the query optimizer in CrateDB can be fine-tuned for specific queries.
Let's explore this together with a real-world example. Imagine the following tables:
Now assume the following use-case: We would like to send a special offer to our cloud customers, especially those running the big clusters. We need to know the total number of nodes each customer has in its clusters and all the necessary customer details to contact them. So we need all the customer's details from the customers
table and the summary of all nodes each customer owns. Therefore we need to join customers
with clusters
and aggregate the number_of_nodes
:
This query takes over 300 ms on my M1 Macbook with a single node setup with 150k rows for customers
and 100k rows for clusters
.
Let's investigate and see if we can improve the performance. First, we will have a look into the execution plan using the EXPLAIN
command:
Collect[doc.customers]
and Collect[doc.clusters]
collect data from the tables customers
and clusters
.HashJoin[(customer_id = id)]
performs a hash-join on the rows from the underlying collect operators with the join condition customer_id = id
.Eval
operator will pick the values [id, name, address, email, country, number_of_nodes, customer_id]
from the underlying source and return them in the described order.GroupHashAggregate[id, name | sum(number_of_nodes)]
aggregates the values of number_of_nodes
for each row.OrderBy[sum(number_of_nodes) DESC]
will apply to descend ordering on the aggregation value.Eval
operator will rearrange the order and naming of the result tuple.So how can this query plan be improved? A common case where the optimizer may be wrong is the order in which tables are joined together. The optimizer usually places the smaller table on the right side of a hash-join because, in most cases, this improves performance, and exactly this happened here. However, one exception is when the data volume of the smaller table exceeds the larger table. Let's investigate further into this. We can gather more information about the two tables customers
and clusters
from sys.shards
:
This indicates that the table customers
with fewer rows has a larger data volume than clusters
, and this means the chosen join-order by the optimizer is not ideal for a hash-join. There are statistics available in pg_catalog.pg_stats
for more details:
Now we can see exactly how the data volume is distributed over the columns of the tables. So for these problematic cases, there is the option to disable join-reordering and rely exactly on the order of the tables as described in the query. The join-reordering for the different join operators CrateDB supports can be disabled using the following session settings:
Please be aware that these settings are experimental and may change in the future. We have in our query only a hash-join operator. Therefore, it is sufficient to disable the join-reordering only for those. Let’s give this a try and check the new execution plan:
As expected, the order of the Collect operators remains now exactly as described in the original query. To evaluate the performance of these two variants properly, we should run a benchmark. We can use the cr8 CrateDB command line tools for that purpose. The run-spec functionality of cr8 runs crate-benchmarks and helps to evaluate the results:
This looks good! Execution time for this query improved on average from 324ms to 180ms. So, preventing the optimizer from re-arranging the order of the joined tables improved the performance by 80%.
In this post, we covered how the performance of a query containing a join and an aggregation can be improved by fixating the join order. We usually recommend relying on the optimizer, but in certain cases, it’s worth investigating further. We also currently improving our query optimizer, moving to a cost-based approach to make the query optimization better.
Are you interested in CrateDB? Have a look at the Documentation. If you have questions, check out our CrateDB Community. We're happy to help!