Crate - A Scalable SQL Database

When you have to choose a database for your project, the first thing you have to decide whether you pick a SQL database or a NoSql Database.

A SQL database lets you store your data in a very structured way, supports relationships and lets you query the data with a mature and very powerful query language.

A NoSql database on the other hand lets you store your data a lot more unstructured way. However, a NoSql database is easier to scale.

Most of the time you just want a SQL database that scales really well. Fortunately there is the scalable SQL database called Crate.

What is Crate?

Put simply, Crate is a database that is based on ElasticSearch and lets you execute SQL queries on it.

Crate is very considered to make it clear that they are not just a SQL wrapper around ElasticSearch. They provide a query engine that talks directly to the ElasticSearch indices. The don’t just convert the SQL query to a ElasticSearch query and send it via HTTP to the index.

Fast, Scalable and Easy to Use

These are the adjectives crate gave themselves. Since Crate is built on top of ElasticSearch the scalability should be given.

To test how fast Crate is I’ve imported 35 million log messages into Crate. Since Crate is based on ElasticSearch, you can use some ElasticSearch specific features in your SQL queries. In my example I’ve added a full text index to the value column of the table. This improved the runtime of my queries even more.

To make use of that index (called value_ft, for value full text) I’ve used the ElasticSearch specific MATCH function built in to the Crate SQL dialect (Here you can find more infos about full text search).

SELECT *
FROM actions
WHERE MATCH(value_ft, 'jobs') USING phrase
LIMIT 100;

This query took 0.018 seconds, which is pretty fast. Query runtimy

If you want to find out if Crate is really easy to use you can either believe me that it is, or you try it out yourself (what you should do!). If you want to install it on your local machine you can take a look at their documentation. You can also go to my Github repos to find a my sample implementation.