Lucene: Multi-word phrases as search terms

You don’t get your documents back because you index with StandardAnalyzer, which converts tokens to lowercase and removes stop words, so the only term that gets indexed for your example is ‘crescent’. Wildcard queries, however, are not analyzed, so ‘the’ is kept as a mandatory part of the query and cannot match any indexed term. The same …
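To make the mismatch concrete, here is a minimal sketch in plain Java that imitates the behaviour described above. It does not use the Lucene API: `analyze`, `matchesPrefix`, the stop-word list and the sample text "The Crescent" are all hypothetical stand-ins, and the real StandardAnalyzer tokenization and stop set depend on the Lucene version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalysisMismatchSketch {
    // Hypothetical stop-word list; not the actual StandardAnalyzer default set.
    static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "of", "and");

    // Rough imitation of index-time analysis: split, lowercase, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // A prefix-style wildcard term compared against indexed terms as typed,
    // i.e. without passing the query text through analyze().
    static boolean matchesPrefix(List<String> indexedTerms, String rawPrefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(rawPrefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("The Crescent");
        System.out.println(indexed);                        // [crescent]
        System.out.println(matchesPrefix(indexed, "the"));  // false
        System.out.println(matchesPrefix(indexed, "cres")); // true
    }
}
```

The stop word never reaches the index, so a query that requires it has nothing to match, while a term that survives analysis still does.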

Solr/Lucene Scorer

Scorers are obtained from Lucene queries through the query’s ‘weight’ method. In short, the framework calls Query.weight(..).scorer(..). Have a look at:
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Query.html
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Weight.html
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Scorer.html
To use your own Query class in Solr, you’ll need to implement your own Solr QParserPlugin that uses your own QParser, which in turn generates your previously implemented Lucene Query. You then …

Remove results below a certain score threshold in Solr/Lucene?

You could write your own Collector that skips the documents the scorer places below your threshold. Below is a simple example of this using Lucene.Net 2.9.1.2 and C#. You’ll need to modify the example if you want to keep the calculated score. using System; using System.Collections.Generic; using Lucene.Net.Index; using Lucene.Net.Search; public class …
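The idea of a threshold-aware collector can be sketched independently of any Lucene version. The class below is a hypothetical plain-Java stand-in, not the Lucene or Lucene.Net Collector API: it assumes some caller invokes `collect` once per matching document with the score already computed, and it keeps only document ids, not scores.

```java
import java.util.ArrayList;
import java.util.List;

public class ThresholdCollectorSketch {
    private final float minScore;
    private final List<Integer> docIds = new ArrayList<>();

    public ThresholdCollectorSketch(float minScore) {
        this.minScore = minScore;
    }

    // Called once per matching document; documents scoring below the
    // threshold are simply not recorded.
    public void collect(int docId, float score) {
        if (score >= minScore) {
            docIds.add(docId);
        }
    }

    // Ids of the documents that met the threshold, in collection order.
    public List<Integer> getDocIds() {
        return docIds;
    }
}
```

To keep the scores as well, the list could hold (id, score) pairs instead of bare ids. Note that raw scores are generally not comparable across different queries, so a fixed threshold may need tuning per query type.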

Update specific field on SOLR index

Solr does not support updating individual fields yet, but there is a JIRA issue about this (almost 3 years old as of this writing). Until this is implemented, you have to update the whole document. UPDATE: as of Solr 4 this is implemented; here’s the documentation.
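For illustration, Solr 4 and later accept "atomic update" requests in which a field value is wrapped in a modifier such as `set`. The sketch below only builds such a JSON body as a string; the helper `setField`, the document id `doc1` and the field name `title` are hypothetical, and sending the body to the core's update handler is left out. As far as I know, atomic updates also depend on the schema and update log being configured for them, so check the documentation for your Solr version.

```java
public class AtomicUpdateSketch {
    // Builds a JSON body that sets a single field on a single document.
    // Assumes id, field and value contain no characters that need JSON escaping.
    static String setField(String id, String field, String value) {
        return "[{\"id\":\"" + id + "\",\"" + field + "\":{\"set\":\"" + value + "\"}}]";
    }

    public static void main(String[] args) {
        // Prints: [{"id":"doc1","title":{"set":"New title"}}]
        System.out.println(setField("doc1", "title", "New title"));
    }
}
```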

Lucene with PHP

I would recommend using Apache Solr as your Lucene backend and connecting to it via web service calls from your PHP code. I’d also note that it’s easy to pick and choose components of Zend Framework for use in your application without loading the entire framework. You could use Zend_Search_Lucene in your site and forgo Zend’s MVC, …

“Did you mean?” feature in Lucene.net

You should look into the SpellChecker module in the contrib directory. It’s a port of Java Lucene’s SpellChecker module, so that module’s documentation should be helpful. Example usage, from the javadocs:

import org.apache.lucene.search.spell.SpellChecker;
SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
// To index a field of a user index:
spellchecker.indexDictionary(new LuceneDictionary(my_lucene_reader, a_field));
// To index a file containing …