We use ElasticSearch pretty heavily at work, and while there are a few scenarios where it has proven good, it’s mostly become a swear word around the office.
Reasons ElasticSearch Is Frustrating
1. Writes are really, really slow
Writing data to ElasticSearch is a synchronous operation. So if you have 3 replicas, it waits for all 3 replicas to finish recording the write before returning control to the user. As the system gets busier, this means a lot more network traffic and a lot of I/O.
2. Hard to configure on the fly
After an index is created, you can’t edit most of it. The replica count can be changed on a live index, but if you want to change a field’s type or the number of primary shards, the only thing you can do is drop and recreate it, or create a new one, update the mapping and copy the data into it.
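The create-and-copy route above usually boils down to three REST calls, sketched here. This is a minimal sketch, not a drop-in tool: the index names, the `orders` alias and the `price` field are made up for illustration, the mapping uses the typeless format of newer versions (older versions nest it under a type name), and the alias swap in step 3 is an extra assumption that only helps if the application reads through an alias. The helper only builds the requests; it does not send them.

```python
import json

def reindex_plan(old_index, new_index, alias, new_mappings):
    """Build the requests for moving data into a re-mapped index.

    Returns a list of (method, path, body) tuples; nothing is sent.
    All names passed in are hypothetical examples.
    """
    return [
        # 1. Create the new index with the corrected mapping.
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # 2. Copy every document across with the _reindex API.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 3. Repoint the alias so readers switch over in one step.
        ("POST", "/_aliases", {
            "actions": [
                {"remove": {"index": old_index, "alias": alias}},
                {"add": {"index": new_index, "alias": alias}},
            ]
        }),
    ]

plan = reindex_plan(
    old_index="orders_v1",
    new_index="orders_v2",
    alias="orders",
    new_mappings={"properties": {"price": {"type": "double"}}},
)
for method, path, body in plan:
    print(method, path, json.dumps(body))
```

On a large index, step 2 is the slow part, since every document gets rewritten.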
3. Expensive
We have six 8-core machines with 32 gigabytes of RAM apiece running our cluster. For the size of our data, the cluster seems comically oversized, but if anything it’s too small.
4. Very few debugging/optimization tools
Unlike SQL, where you get a nice explain plan and various tools to see what a query is actually doing, in ElasticSearch you just get a blanket “done”. Some of this may have gotten better in newer versions, and they’ve added some SQL-esque features.
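For contrast, here is roughly what asking for an explain plan looks like on the SQL side. It uses SQLite from Python's standard library purely because it runs anywhere; the `orders` table and its index are invented for the example. The closest ElasticSearch counterpart is the `profile` flag that newer versions accept on a search body; that body is only built below, not sent, and its field name is likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# Ask the database how it would run the query instead of running it.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT total FROM orders WHERE customer = ?",
    ("acme",),
).fetchall()
for row in plan_rows:
    print(row[-1])  # last column holds the human-readable plan step

# ElasticSearch side: a search body with profiling switched on.
es_search_body = {
    "profile": True,
    "query": {"term": {"customer": "acme"}},
}
```

The exact wording of the plan text depends on the SQLite version, so treat the printed output as illustrative.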
5. Hard to reconfigure indexes
After you’ve created an index, if you want to change anything about it, your options are to delete it and start over or create a new one and bulk copy the data. That’s it.
6. Aggregations
Aggregations in ElasticSearch are great - possibly fantastic - if you stick within a single document type and don’t want to do anything weird. As soon as you step outside those bounds, the query syntax is horrible and they slow down dramatically. In particular, we have one set of reports that tries to use the parent/child feature, and it’s the slowest report we have. (You know it’s a bad sign when their documentation says “Don’t use this”.)
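To give a feel for the syntax gap, here is a simple per-customer average written both ways. This is the easy case, not the parent/child one. Field and table names are invented for illustration; the aggregation body is the usual `terms` plus `avg` shape and is only built here, while the SQL runs against an in-memory SQLite table.

```python
import sqlite3

# ElasticSearch: bucket by customer, then average the total in each bucket.
es_agg_body = {
    "size": 0,  # skip the hits, return only aggregation results
    "aggs": {
        "by_customer": {
            "terms": {"field": "customer"},
            "aggs": {"avg_total": {"avg": {"field": "total"}}},
        }
    },
}

# SQL: the same report as a GROUP BY.
sql = (
    "SELECT customer, AVG(total) FROM orders "
    "GROUP BY customer ORDER BY customer"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 10.0), ("acme", 30.0), ("globex", 5.0)],
)
report = conn.execute(sql).fetchall()
print(report)  # [('acme', 20.0), ('globex', 5.0)]
```

Each extra level of grouping adds another nested `aggs` block on the ElasticSearch side, where SQL just adds a column to the `GROUP BY`.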
Things it does well:
1. Kibana
It’s ok. Maybe even nice.
2. Faceted Search
If you have a single document type and want to quickly narrow down several disparate options, you’re in business.
So I’m actively looking for a replacement. Right now I’m just leaning towards regular PostgreSQL, probably using Aurora. I’ve considered BigQuery, because it’s fantastic, but that adds some extra complications as we’d have to move data between AWS and Google.
There are a few OLAP alternatives, like ClickHouse, Snowflake or MemSQL. I haven’t really played with them much, but they look pretty interesting. ClickHouse is the only open-source one of the three, and the only one I’ve been able to download without getting into their sales funnel, and it’s pretty nice. Its biggest weakness, which I think is probably shared among most OLAP-style databases, is that it’s essentially insert-only.
Our existing code makes pretty heavy use of constant in-place updates, which is why I’m considering standard SQL instead of most OLAP options. I also think we could do this as a two-phase option: once we’ve rewritten the reports to get away from ElasticSearch’s gross aggregation format and use SQL, we can evaluate speed and see if an OLAP solution would better serve our needs.