Live Stream on Jan 23rd: Unlocking Real Time Insights in the Renewable Energy Sector with CrateDB

Register now
Skip to content
Blog

Crate Hackathon, Day 2

This article is more than 4 years old

Admin User Interface

cratedb-admin-user-interface-1024x705

Now you can see the memory distribution and disk space of your cluster. The Coxcomb diagram provides a fast and easy overview. You might notice in the right corner of Chrome: The Chrome Extension of Conny, Christian and Manfred has been accepted by Google. Download it from the Chrome Webstore.

User Defined Functions (UDF)

Today Christoph helped Mathias to add support for Ruby UDFs as well. Below you can find an easy sample for an Ruby UDF.

 class RubyMathMin include UserDefinedScalarFunction def name 'ruby_math_min' end def evaluate(args) args.map(&:value).min end def ident FunctionIdent.new(name, Arrays.asList(DataType::LONG,vDataType::LONG)) end def info FunctionInfo.new(ident, DataType::LONG) end def dynamicFunctionResolver end def normalizeSymbol(symbol) symbol end end

cratex

cratex-1024x668

News from the OS X client: Display of results in a table, fullscreen support, health status, multiple windows and server connections. Our App Store Account is still waiting for approval.

Benchmark Suite

Benchmarks can now be named and also work on multinode features. The COPY benchmarks against the two NUC clusters showed that the refresh_interval can have a huge impact on the cluster indexing performance. The generated reports can be exposed by an http server so the user can display and comparing graphs.

NUC Performance

We built up two small NUC Clusters (2 Nodes each). NUCs are Intels latest Mini PCs. All four NUCs were configured with Intel 530 SSD drives (120GB mSATA). Two NUCs were powered by an i3 processor (and 8GB RAM), the remaining two NUCs had an i5 processor (16GB RAM). To test the performance we used a very simple data set. 80M records like below:

{"ident":1111111111111168827,"name":"VTokmgLBIACBHj1E1f6Fx","uuid":"6741eed2c09011e39733c03fd563408b"} {"ident":1111111111111123827,"name":"ckpxVvSCHaDvT5g7Rj9wL","uuid":"6741f1acc09011e39733c03fd563408b"} {"ident":1111111111111125493,"name":"gYVQQTOBk7aNzDjgNylCy","uuid":"6741f47cc09011e39733c03fd563408b"} 

Using the COPY FROM command the two i3 NUCs were able to index 31K records per second, the i5 NUCs about 38K records per second. CPU was the limiting factor. To do a quick test on query performance we did a distinct count.

SELECT< /span> count(*), 
        ident  
FROM  < /span> generated GROUP BY ident ORDER BY count(*) DESC limit 10;span> 

The results were similar: The i3 NUCs took 8,4 seconds, the i5 NUCs 6,6 seconds. The ganglia graphs showed that generally the CPU is the limiting factor and the SSDs still had enough IOPS available. We'll purchase a few more i5 NUCs and build larger cluster to do some more tests. Stay tuned!