This post describes how we use Elasticsearch as part of Crate. Crate is an open source, shared-nothing, fully searchable, document-oriented cluster data store for any kind of data. As far as we know, we're the first to use Elasticsearch as a framework.
We have always been impressed by the amazing simplicity of Elasticsearch. I began using it in 2010 to support large-scale systems we built at Lovely Systems (our systems integration company prior to founding Crate.io). Elasticsearch isn't just simple; it is also ultra fast and scales extremely easily. Jodok has given a talk about how we threw 24 billion records into Elasticsearch.
At the time, we needed special aggregations and other functionality for some of our projects, so in 2011 we wrote our own plugins (1) for Elasticsearch. This made me like Java again, since Elasticsearch is written in a very direct and clean way.
When we began thinking about the ingredients of a good, modern data store (a no-brainer data store for big data applications), we wanted to take Elasticsearch as a model for simplicity and scalability.
We decided to use Elasticsearch as a framework. Elasticsearch is currently included from source as a git submodule. Unlike Elasticsearch, Crate uses Gradle as its build tool, so we maintain our own project file for the Elasticsearch source tree.
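To make the submodule setup more concrete, here is a minimal sketch of what a Gradle project file wrapped around a foreign source tree could look like. It is not Crate's actual build file. The directory names (`es/` for the wrapper project, `es/upstream/` for the submodule checkout) and the file name are assumptions made for illustration. It uses Gradle's Kotlin DSL, and it assumes the checkout uses the standard `src/main/java` layout.

```kotlin
// es/build.gradle.kts (hypothetical wrapper project, not Crate's real build file)
// Assumes the Elasticsearch git submodule is checked out at es/upstream/.

plugins {
    `java-library`
}

sourceSets {
    main {
        // Point Gradle at the sources inside the submodule checkout
        // instead of this project's own src/main directory.
        java.setSrcDirs(listOf("upstream/src/main/java"))
        resources.setSrcDirs(listOf("upstream/src/main/resources"))
    }
}

// The third-party libraries that the included sources need would have to be
// declared in a dependencies { } block here; they are omitted from this sketch.
```

The root build would also need to include this project in its settings file, and the other modules would then depend on it as a regular project dependency rather than on a published jar.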
There are two main reasons why we are including the source tree instead of just adding a dependency on the jar:
Today, Crate depends on many Elasticsearch components, such as:
We still enjoy coding against the Elasticsearch codebase, and we like how contributions and communication are handled in this project. There are many other things we use from Elasticsearch. Feel free to take a look at the source on GitHub or contact us on https://community.crate.io if you are interested in the details.
Get Crate now, then watch the YouTube video showing a one-minute installation of Crate.
(1) While most of them were project specific, some have been released as open source, such as the inout plugin https://github.com/crate/elasticsearch-inout-plugin and the timefacets plugin https://github.com/crate/elasticsearch-timefacets-plugin. The functionality of these plugins is now included in Crate.
(2) The Elasticsearch API can be enabled via configuration. However, this might change in the future; it is currently only there for debugging purposes.