lucene - w3toppers.com

Hibernate Search: How to use wildcards correctly?

Updated answer for Hibernate Search 6. Short answer: don’t use wildcard queries; use a custom analyzer with an EdgeNGramFilterFactory. Also, don’t try to analyze the query yourself (that’s what you did by splitting the query into terms): Lucene will do it much better (with a WhitespaceTokenizerFactory, an ASCIIFoldingFilterFactory and a LowerCaseFilterFactory in particular). Long answer: … Read more
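To make the edge n-gram idea concrete, here is a minimal plain-Java sketch of what such an analysis chain emits at index time. It is only an illustration under stated assumptions: it does not call Lucene or Hibernate Search, the class name EdgeNGramSketch and the method edgeNGrams are hypothetical, and ASCII folding is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Lucene or Hibernate Search.
final class EdgeNGramSketch {

    // Splits the text on whitespace, lowercases each token, and emits the
    // leading substrings (edge n-grams) with lengths minGram..maxGram.
    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input yields one empty token; skip it
            }
            String lower = token.toLowerCase(Locale.ROOT);
            int limit = Math.min(maxGram, lower.length());
            for (int len = minGram; len <= limit; len++) {
                grams.add(lower.substring(0, len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [hi, hib, hibe, se, sea, sear]
        System.out.println(edgeNGrams("Hibernate Search", 2, 4));
    }
}
```

Because the prefixes are already in the index, a user typing "hib" can be matched with an ordinary term lookup, which is why no wildcard is needed at query time.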

JavaFX TextField Auto-suggestions

Here is my solution based on This.

public class AutocompletionlTextField extends TextFieldWithLengthLimit {
    // Local variables
    // entries to autocomplete
    private final SortedSet<String> entries;
    // popup GUI
    private ContextMenu entriesPopup;

    public AutocompletionlTextField() {
        super();
        this.entries = new TreeSet<>();
        this.entriesPopup = new ContextMenu();
        setListner();
    }

    /**
     * wrapper for default constructor with setting of “TextFieldWithLengthLimit” LengthLimit
     *
     * @param … Read more

How to get a Token from a Lucene TokenStream?

Yeah, it’s a little convoluted (compared to the good ol’ way), but this should do it:

TokenStream tokenStream = analyzer.tokenStream(fieldName, reader);
OffsetAttribute offsetAttribute = tokenStream.getAttribute(OffsetAttribute.class);
TermAttribute termAttribute = tokenStream.getAttribute(TermAttribute.class);

while (tokenStream.incrementToken()) {
    int startOffset = offsetAttribute.startOffset();
    int endOffset = offsetAttribute.endOffset();
    String term = termAttribute.term();
}

Edit: The new way

According to Donotello, TermAttribute has been … Read more

using OR and NOT in solr query

I don’t know why that doesn’t work, but this one is logically equivalent and it does work:

-(myField:superneat AND -myOtherField:somethingElse)

Maybe it has something to do with defining the same field twice in the query… Try asking in the solr-user group, then post the final answer back here!
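The "logically equivalent" claim can be checked with De Morgan's law: NOT (A AND NOT B) is the same as (NOT A) OR B. Below is a small plain-Java sketch that verifies the identity over all truth values. It is pure boolean logic, with a and b standing in for the two field clauses; it does not model how Solr parses or executes negative clauses, and the class name DeMorganCheck is hypothetical.

```java
// Hypothetical helper for illustration; pure boolean logic, no Solr involved.
final class DeMorganCheck {

    // Form used in the answer: NOT (a AND NOT b)
    static boolean answerForm(boolean a, boolean b) {
        return !(a && !b);
    }

    // De Morgan expansion: (NOT a) OR b
    static boolean expandedForm(boolean a, boolean b) {
        return !a || b;
    }

    // Compares both forms for every combination of inputs.
    static boolean equivalentForAllInputs() {
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean b : values) {
                if (answerForm(a, b) != expandedForm(a, b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalentForAllInputs());
    }
}
```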

Solr vs. ElasticSearch [closed]

Update Now that the question scope has been corrected, I might add something in this regard as well: There are many comparisons between Apache Solr and ElasticSearch available, so I’ll reference those I found most useful myself, i.e. covering the most important aspects: Bob Yoplait already linked kimchy’s answer to ElasticSearch, Sphinx, Lucene, Solr, Xapian. … Read more

Choosing a stand-alone full-text search server: Sphinx or SOLR? [closed]

I’ve been using Solr successfully for almost 2 years now, and have never used Sphinx, so I’m obviously biased. However, I’ll try to keep it objective by quoting the docs or other people. I’ll also take patches to my answer 🙂 Similarities: Both Solr and Sphinx satisfy all of your requirements. They’re fast and designed … Read more

ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage? [closed]

As the creator of ElasticSearch, maybe I can give you some reasoning on why I went ahead and created it in the first place :). Using pure Lucene is challenging. There are many things that you need to take care of if you want it to really perform well, and also, it’s a library, so … Read more

How to wisely combine shingles and edgeNgram to provide flexible full text search?

This is an interesting use case. Here’s my take:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "my_ngram_tokenizer",
          "filter": ["lowercase"]
        },
        "my_edge_ngram_analyzer": {
          "tokenizer": "my_edge_ngram_tokenizer",
          "filter": ["lowercase"]
        },
        "my_reverse_edge_ngram_analyzer": {
          "tokenizer": "keyword",
          "filter": ["lowercase", "reverse", "substring", "reverse"]
        },
        "lowercase_keyword": {
          "type": "custom",
          "filter": ["lowercase"],
          "tokenizer": "keyword"
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": … Read more

Comparison of full text search engine – Lucene, Sphinx, Postgresql, MySQL? [closed]

Good to see someone’s chimed in about Lucene – because I’ve no idea about that. Sphinx, on the other hand, I know quite well, so let’s see if I can be of some help. Result relevance ranking is the default. You can set up your own sorting should you wish, and give specific fields higher … Read more

Is a 2-character search possible in Lucene?

Yes, this is possible using wildcards. Feed your QueryParser with te*, and it will generate a query that matches every term starting with the prefix te, followed by any suffix.
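Conceptually, a prefix query such as te* selects a contiguous run of terms from a sorted term dictionary. Here is a minimal plain-Java sketch of that idea. It is an assumption-laden illustration rather than Lucene's actual implementation: it uses a TreeSet in place of the index, and the names PrefixMatchSketch and termsWithPrefix are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper for illustration; a TreeSet stands in for the index.
final class PrefixMatchSketch {

    // Returns every term that starts with the prefix. The set must use
    // natural String ordering, so that matching terms sit next to each
    // other: we jump to the first candidate and stop at the first miss.
    static List<String> termsWithPrefix(SortedSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
                new TreeSet<>(List.of("apple", "tea", "team", "ten", "the", "to"));
        // Prints [tea, team, ten]
        System.out.println(termsWithPrefix(terms, "te"));
    }
}
```

The shorter the prefix, the more terms fall into that run, so very short prefixes on a large index can expand into many terms.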