Live Stream on Jan 23rd: Unlocking Real Time Insights in the Renewable Energy Sector with CrateDB

Register now
Skip to content
Blog

Bulk Inserts with unnest

This article is more than 4 years old

Table functions are functions that produce rows in the same way as a table, and with Crate's latest testing release (0.55.0) we have added support for unnest. Unnest uses arrays to generate a transient table with each array representing a column:

cr> select * from unnest([1, 2, 3], ['Arthur', 'Trillian', 'Marvin']);
+------+----------+
| col1 | col2     |
+------+----------+
|    1 | Arthur   |
|    2 | Trillian |
|    3 | Marvin   |
+------+----------+
SELECT 3 rows in set (... sec)

Bulk Inserts

To insert large amounts of data efficiently, you are recommended to use bulk inserts. With a client that does not support these or if you have no direct access to the rest endpoint you can be out of luck!

With unnest there is a way around this by using insert-by-query you can select from an 'unnested' table which results in a bulk insert internally:

cr> insert into authors (id, name) (select * from unnest([1, 2, 3], ['Arthur', 'Trillian', 'Marvin']));
INSERT OK, 3 rows affected (... sec)

As the response suggested, the three rows have now been inserted:

cr> select * from authors order by id;
+----+----------+
| id | name     |
+----+----------+
|  1 | Arthur   |
|  2 | Trillian |
|  3 | Marvin   |
+----+----------+
SELECT 3 rows in set (... sec)

The resulting performance is also comparable to bulk requests

unnest-vs-bulk-1168x746
Comparison of a single bulk request and insert via 'unnest' of 1000 rows, repeated 10 times on a single machine

We have extended the crate-version-comparison Github repository to include a way to insert with unnest and compare single-threaded performance.

Try it Out!

As an Open Source company we rely on your feedback and would be happy if you try it out and tell us how you liked it. Or even better, submit a pull request ?!