Hey all. I'm the developer of the ES-PHP client and an engineer at Elastic. This thread popped up in my inbox so I thought I'd swing by and make myself available for questions.
@noeldiaz Feel free to ping me here or on twitter (@ZacharyTong) if you're having issues with Elasticsearch, would be more than happy to help out.
Obviously, I work at Elastic so I'm a bit biased. So take my opinion with a grain of salt :)
> Now Jeff is covering Algolia, I'm struggling to see (apart from the pricing) why I would consider using ElasticSearch.
I don't have direct experience with Algolia, but it looks like a nice product. They have tackled a relatively narrow, but very important, problem: search + autocomplete, plus a suite of integrations that make implementing search relatively plug-and-play. Combined with the SaaS offering, this makes implementing basic search very easy, especially if you just want a search solution for a relatively small dataset.
Because Elasticsearch is a server, not a service, you have complete control over your implementation. It strives to give sane defaults so you can get "Good" results with minimal configuration. But critically, you can tweak your queries/data to get "Great" or "Outstanding" results. It does come with the cost of a learning curve, but there are fewer limitations than closed-box solutions that you don't have control over.
Many people aren't aware of the cool, custom functionality ES is capable of. Off the top of my head, a few interesting use cases:
- Several news agencies use Elasticsearch to rank news articles such that their relevance is exponentially weighted against the age of the article. This is super important, since articles that are more than just a few hours old start losing relevance very quickly in the news business.
- Related, the Guardian gave a very cool talk about their in-house tool, Ophan, which uses ES to track engagement. It's basically Google Analytics on steroids, allowing all their editors and journalists to monitor article activity in real time and adjust how they promote articles through various channels.
- Hotel rental sites are using ES to sort results such that lodgings are exponentially less important as they fall away from a geo-point, but are also boosted if they have XYZ features (wifi, etc), a better star rating, or an active promotion.
- One company wrote a custom tokenizer to turn images into tokens (edges, features, etc), which allows them to do image search.
- Another company is serving machine learning recommendation models out of Elasticsearch using the function_score query (by basically combining a few dozen linear functions as coefficients to their model). Results are found via regular full-text search, then the top-N results are re-sorted using the Rescore Query based on the ML model to provide personalized recommendations.
- Various companies only use ES for the percolator feature, which is basically like an alerting engine. One in particular replaced a large enterprise-y rules engine with ES percolator. On a plane ride once, I built a random-forest decision tree ML model using the percolator.
- Innumerable companies are using ES as a log storage and analytics engine in combination with Logstash and Kibana. My largest customer has nearly a petabyte of log data in ES, and there are other clusters in the wild with more than that.
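To make the decay-based ranking from the news and hotel examples a bit more concrete, here's a rough sketch in Python that just builds the request bodies as plain dicts. Treat it as an illustration, not what any of those companies actually run: the field names (`title`, `published_at`, `location`, `has_wifi`, `on_promotion`) and the scales and weights are all made up, and the exact `function_score` syntax varies a bit between Elasticsearch versions. The `exp_decay` helper mirrors the documented exponential decay curve so you can see how quickly a score falls off.

```python
import json
import math


def exp_decay(distance, scale, decay=0.5, offset=0.0):
    """Exponential decay curve: 1.0 at the origin, `decay` at offset + scale."""
    lam = math.log(decay) / scale
    return math.exp(lam * max(0.0, distance - offset))


def news_query(text):
    """Full-text match whose score is multiplied by an age-based decay.

    Field names and the 6h scale are hypothetical.
    """
    return {
        "query": {
            "function_score": {
                "query": {"match": {"title": text}},
                "functions": [
                    # Score halves for every 6 hours of article age.
                    {"exp": {"published_at": {"origin": "now",
                                              "scale": "6h",
                                              "decay": 0.5}}}
                ],
                "boost_mode": "multiply",
            }
        }
    }


def hotel_query(lat, lon):
    """Distance decay from a geo-point, boosted by feature filters.

    Field names, scale, and weights are hypothetical.
    """
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": [
                    # Score halves for every 2km away from the origin.
                    {"exp": {"location": {"origin": {"lat": lat, "lon": lon},
                                          "scale": "2km",
                                          "decay": 0.5}}},
                    # Extra weight only for documents matching each filter.
                    {"filter": {"term": {"has_wifi": True}}, "weight": 1.5},
                    {"filter": {"term": {"on_promotion": True}}, "weight": 2.0},
                ],
                "score_mode": "multiply",
                "boost_mode": "multiply",
            }
        }
    }


if __name__ == "__main__":
    # How much of its score an article keeps after 0, 6, 12 and 24 hours.
    for hours in (0, 6, 12, 24):
        print(hours, round(exp_decay(hours, scale=6.0), 4))
    print(json.dumps(news_query("election results"), indent=2))
```

You'd send either body to the `_search` endpoint with whatever client you like. The point is that the decay shape, the scale, and how the functions combine are all yours to tune.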
So yeah, I'd say there is still plenty about Elasticsearch that might interest you. It just depends how much control and performance you need over your data. There isn't really a right or wrong answer, just whichever tool makes the most sense for your problem :)
As an aside, Elastic recently acquired Found, which provides a SaaS version of Elasticsearch. And it probably goes without saying, but since Elasticsearch is OSS (Apache 2 license), the company could go belly-up tomorrow and the software would live on.