Hibernate Search: How to use wildcards correctly?

Updated answer for Hibernate Search 6. Short answer: don’t use wildcard queries; use a custom analyzer with an EdgeNGramFilterFactory instead. Also, don’t try to analyze the query yourself (that’s what you did by splitting the query into terms): Lucene will do it much better (with a WhitespaceTokenizerFactory, an ASCIIFoldingFilterFactory and a LowercaseFilterFactory in particular). Long answer: …
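To make the short answer concrete, here is a minimal plain-Java sketch of what an edge n-gram filter produces and why it removes the need for wildcards. It is not Lucene's or Hibernate Search's implementation: the class `EdgeNGramSketch`, its method names and the gram sizes are hypothetical, ASCII folding is left out, and it assumes the query side is only tokenized and lowercased (no n-grams).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; real analysis is done by Lucene's analyzer chain.
public class EdgeNGramSketch {

    // Returns the prefixes of a token with lengths minGram..maxGram,
    // capped at the token's own length.
    public static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int limit = Math.min(maxGram, token.length());
        for (int len = minGram; len <= limit; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Index-time analysis: split on whitespace, lowercase, expand into edge n-grams.
    public static List<String> analyzeForIndex(String text, int minGram, int maxGram) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields no terms
            }
            terms.addAll(edgeNGrams(token.toLowerCase(Locale.ROOT), minGram, maxGram));
        }
        return terms;
    }

    // Query-time analysis (assumed): split on whitespace and lowercase, no n-grams.
    public static List<String> analyzeForQuery(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> indexed = analyzeForIndex("Hibernate Search", 1, 5);
        List<String> query = analyzeForQuery("HIB sea");
        System.out.println("indexed terms: " + indexed);
        System.out.println("query terms:   " + query);
        // Every query term is already an indexed term, so a plain term match
        // behaves like a prefix search without any wildcard.
        System.out.println("all query terms indexed: " + indexed.containsAll(query));
    }
}
```

Because the prefixes are stored at indexing time, the partial input "hib sea" matches with ordinary term queries, which is the reason wildcard queries become unnecessary in this setup.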

JavaFX TextField Auto-suggestions

Here is my solution based on This. public class AutocompletionlTextField extends TextFieldWithLengthLimit { //Local variables //entries to autocomplete private final SortedSet<String> entries; //popup GUI private ContextMenu entriesPopup; public AutocompletionlTextField() { super(); this.entries = new TreeSet<>(); this.entriesPopup = new ContextMenu(); setListner(); } /** * wrapper for default constructor with setting of “TextFieldWithLengthLimit” LengthLimit * * @param …

How to get a Token from a Lucene TokenStream?

Yeah, it’s a little convoluted (compared to the good ol’ way), but this should do it: TokenStream tokenStream = analyzer.tokenStream(fieldName, reader); OffsetAttribute offsetAttribute = tokenStream.getAttribute(OffsetAttribute.class); TermAttribute termAttribute = tokenStream.getAttribute(TermAttribute.class); while (tokenStream.incrementToken()) { int startOffset = offsetAttribute.startOffset(); int endOffset = offsetAttribute.endOffset(); String term = termAttribute.term(); } Edit: The new way. According to Donotello, TermAttribute has been …

using OR and NOT in solr query

I don’t know why that doesn’t work, but this one is logically equivalent and it does work: -(myField:superneat AND -myOtherField:somethingElse) Maybe it has something to do with defining the same field twice in the query… Try asking in the solr-user group, then post back here the final answer!
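The rewritten query has the form -(a AND -b), which by De Morgan's law is the same as (-a OR b). As a quick sanity check of that equivalence, the sketch below enumerates all four truth assignments in plain Java. It only exercises Boolean logic, not Solr's query parser, and the class and method names are hypothetical.

```java
// Hypothetical check of the De Morgan rewrite; a stands for "myField:superneat matches",
// b stands for "myOtherField:somethingElse matches".
public class DeMorganCheck {

    // Form used in the answer: -(a AND -b)
    public static boolean negatedAndForm(boolean a, boolean b) {
        return !(a && !b);
    }

    // Equivalent OR/NOT form: (-a OR b)
    public static boolean orForm(boolean a, boolean b) {
        return !a || b;
    }

    // Compares both forms on every combination of inputs.
    public static boolean equivalentForAllInputs() {
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean b : values) {
                if (negatedAndForm(a, b) != orForm(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("forms agree on all inputs: " + equivalentForAllInputs());
    }
}
```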

Solr vs. ElasticSearch [closed]

Update: Now that the question scope has been corrected, I might add something in this regard as well. There are many comparisons between Apache Solr and ElasticSearch available, so I’ll reference those I found most useful myself, i.e. covering the most important aspects: Bob Yoplait already linked kimchy’s answer to ElasticSearch, Sphinx, Lucene, Solr, Xapian. …

Choosing a stand-alone full-text search server: Sphinx or SOLR? [closed]

I’ve been using Solr successfully for almost 2 years now, and have never used Sphinx, so I’m obviously biased. However, I’ll try to keep it objective by quoting the docs or other people. I’ll also take patches to my answer 🙂 Similarities: Both Solr and Sphinx satisfy all of your requirements. They’re fast and designed …

How to wisely combine shingles and edgeNgram to provide flexible full text search?

This is an interesting use case. Here’s my take: { "settings": { "analysis": { "analyzer": { "my_ngram_analyzer": { "tokenizer": "my_ngram_tokenizer", "filter": ["lowercase"] }, "my_edge_ngram_analyzer": { "tokenizer": "my_edge_ngram_tokenizer", "filter": ["lowercase"] }, "my_reverse_edge_ngram_analyzer": { "tokenizer": "keyword", "filter": ["lowercase", "reverse", "substring", "reverse"] }, "lowercase_keyword": { "type": "custom", "filter": ["lowercase"], "tokenizer": "keyword" } }, "tokenizer": { "my_ngram_tokenizer": { "type": "nGram", "min_gram": …