Elasticsearch 2.1: Result window is too large (index.max_result_window)

If you need deep pagination, one possible solution is to increase the value of index.max_result_window. You can use curl to do this from your shell command line:

curl -XPUT "http://localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d '{ "index" : { "max_result_window" : 500000 } }'

I did not notice increased memory usage for values of around 100k.
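The same settings update can be sketched in Python with only the standard library. This is a minimal sketch, assuming the host and index name from the curl command above; the helper name build_window_request is hypothetical, and the request is built but not sent so the snippet runs without a cluster.

```python
import json
import urllib.request


def build_window_request(index, max_result_window, host="http://localhost:9200"):
    """Build (but do not send) the PUT request that raises index.max_result_window.

    The default host is an assumption taken from the curl example.
    """
    body = {"index": {"max_result_window": max_result_window}}
    return urllib.request.Request(
        url=f"{host}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_window_request("my_index", 500000)
print(req.get_method(), req.full_url)

# To apply the setting against a running cluster, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before changing a live index.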

Remove duplicate documents from a search in Elasticsearch

You could use field collapsing: group the results on the name field and set the size of the top_hits aggregator to 1.

POST http://localhost:9200/test/dedup/_search?search_type=count&pretty=true
{ "aggs":{ "dedup" : { "terms":{ "field": "name" }, "aggs":{ "dedup_docs":{ "top_hits":{ "size":1 } } } } } }

This returns:

{ "took" : 192, "timed_out" : false, "_shards" : { …
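As a rough illustration of how the deduplicated documents could be read back out, here is a minimal Python sketch. The helper names build_dedup_query and first_hit_per_bucket are hypothetical, and the sample response is hand-written to match the usual shape of a terms aggregation with a top_hits sub-aggregation; it is not real output from the query above. The sketch uses "size": 0 in the body in place of search_type=count, which is an assumption made so the body also works on versions where that search type is no longer available.

```python
def build_dedup_query(field):
    """Terms aggregation on `field`, keeping one top hit per bucket."""
    return {
        "size": 0,  # assumption: suppress regular hits instead of search_type=count
        "aggs": {
            "dedup": {
                "terms": {"field": field},
                "aggs": {"dedup_docs": {"top_hits": {"size": 1}}},
            }
        },
    }


def first_hit_per_bucket(response):
    """Return the single top hit from every bucket of the 'dedup' aggregation."""
    buckets = response["aggregations"]["dedup"]["buckets"]
    hits = []
    for bucket in buckets:
        bucket_hits = bucket["dedup_docs"]["hits"]["hits"]
        if bucket_hits:  # skip empty buckets defensively
            hits.append(bucket_hits[0])
    return hits


# Hand-written sample response (hypothetical data, for illustration only).
sample_response = {
    "aggregations": {
        "dedup": {
            "buckets": [
                {"key": "alice", "doc_count": 2,
                 "dedup_docs": {"hits": {"hits": [{"_id": "1", "_source": {"name": "alice"}}]}}},
                {"key": "bob", "doc_count": 1,
                 "dedup_docs": {"hits": {"hits": [{"_id": "3", "_source": {"name": "bob"}}]}}},
            ]
        }
    }
}

query = build_dedup_query("name")
unique_docs = first_hit_per_bucket(sample_response)
print([doc["_id"] for doc in unique_docs])
```

Note that a terms aggregation only returns a limited number of buckets by default, so this approach deduplicates the most frequent values rather than the whole index unless the bucket count is raised.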

Make elasticsearch only return certain fields?

Yes, use the better option: source filtering. If you're searching with JSON it'll look something like this:

{ "_source": ["user", "message", …], "query": …, "size": … }

In ES 2.4 and earlier, you could also use the fields option to the search API:

{ "fields": ["user", "message", …], "query": …, "size": … }

This is …
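A source-filtered request body can be assembled programmatically. The following is a minimal Python sketch; the helper name with_source_filter, the match_all query, and the default size of 10 are assumptions chosen for illustration, since the original snippet elides the query and size.

```python
import json


def with_source_filter(query, fields, size=10):
    """Wrap a query so that only the listed _source fields are returned."""
    return {"_source": list(fields), "query": query, "size": size}


# match_all is a placeholder query; substitute your own.
body = with_source_filter({"match_all": {}}, ["user", "message"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a _search request.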