Can a raw Lucene index be loaded by Solr?

Success! With Pascal’s suggestion of changes to schema.xml I got it working in no time. Thanks! Here are my complete steps for anyone interested:
1. Downloaded Solr and copied dist/apache-solr-1.4.0.war to tomcat/webapps
2. Copied example/solr/conf to /usr/local/solr/
3. Copied pre-existing Lucene index files to /usr/local/solr/data/index
4. Set solr.home to /usr/local/solr
5. In solrconfig.xml, changed dataDir to /usr/local/solr/data (Solr looks for …

Hibernate Search | ngram analyzer with minGramSize 1

Updated answer for Hibernate Search 6
With Hibernate Search 6, you can define a second analyzer, identical to your "ngram" analyzer except that it does not have an ngram filter, and assign it as the searchAnalyzer for your field:
    public class Hospital {
        // …
        @FullTextField(analyzer = "ngram", searchAnalyzer = "my_analyzer_without_ngrams")
        private String name = …
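To illustrate why a separate search analyzer helps when minGramSize is 1, here is a simplified plain-Java model. It is not Hibernate Search or Lucene code: the class NgramSearchDemo, its methods, and the "any token matches" rule are assumptions made for the sketch. It shows that when the query is also split into 1-character grams, an unrelated query can match through a single shared letter, whereas a query kept as one token only matches grams that were actually indexed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of ngram indexing; not the Hibernate Search API.
class NgramSearchDemo {

    // Collects every substring of length minGram..maxGram from the lowercased text.
    static Set<String> ngrams(String text, int minGram, int maxGram) {
        Set<String> grams = new HashSet<>();
        String s = text.toLowerCase();
        for (int len = minGram; len <= maxGram; len++) {
            for (int i = 0; i + len <= s.length(); i++) {
                grams.add(s.substring(i, i + len));
            }
        }
        return grams;
    }

    // Assumption: a document matches if ANY query token was indexed (OR semantics).
    static boolean matches(Set<String> indexedTokens, Set<String> queryTokens) {
        for (String token : queryTokens) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = ngrams("Hospital", 1, 3);

        // Query analyzed with the same ngram filter: the gram "h" alone causes a match.
        System.out.println(matches(indexed, ngrams("hxz", 1, 3)));

        // Query analyzed without the ngram filter: the whole term must be an indexed gram.
        System.out.println(matches(indexed, Set.of("hxz")));
        System.out.println(matches(indexed, Set.of("hos")));
    }
}
```

In this model the first query matches even though "hxz" has little in common with "Hospital", which is the kind of noise a search analyzer without the ngram filter is meant to avoid.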

How to implement auto suggest using Lucene’s new AnalyzingInfixSuggester API?

I’ll give you a pretty complete example that shows you how to use AnalyzingInfixSuggester. In this example we’ll pretend that we’re Amazon, and we want to autocomplete a product search field. We’ll take advantage of features of the Lucene suggestion system to implement the following: Ranked results: We will suggest the most popular matching products … Read more

How to evaluate hosted full text search solutions?

Websolr provides a cloud-based Solr with a control panel. It’s in private beta as of this writing, but you can get the service through Heroku. Another hosted Solr service is PowCloud, also in private beta, which seems to offer strong WordPress integration. SolrHQ: another beta service providing a hosted Solr solution, with Joomla and WordPress … Read more

How to control Indexing a field in lucene 4.0

Constructors taking Field.Index arguments are available, but are deprecated in 4.0, and should not be used. Instead, you should look to subclasses of Field to control how a field is indexed. StringField is the standard un-analyzed indexed field. The field is indexed as a single token. It is appropriate for things like identifiers, for which you …

N-gram generation from a sentence

I believe this would do what you want:
    import java.util.*;
    public class Test {
        public static List<String> ngrams(int n, String str) {
            List<String> ngrams = new ArrayList<String>();
            String[] words = str.split(" ");
            for (int i = 0; i < words.length - n + 1; i++)
                ngrams.add(concat(words, i, i+n));
            return ngrams;
        }
        public static String concat(String[] …
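Since the excerpt is cut off, here is a minimal, self-contained sketch of the same idea: sliding a window of n words over a sentence. The class name WordNgrams and the use of String.join in place of a hand-written concat helper are choices made for this sketch, not the original answer's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch; names are hypothetical and not taken from the original answer.
class WordNgrams {

    // Returns every run of n consecutive whitespace-separated words in str.
    static List<String> ngrams(int n, String str) {
        List<String> result = new ArrayList<>();
        String trimmed = str.trim();
        if (n <= 0 || trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        // The window [i, i + n) must fit inside the word array.
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            System.out.println(n + "-grams: " + ngrams(n, "This is my car."));
        }
    }
}
```

Splitting on "\\s+" rather than a single space avoids empty tokens when the input contains repeated spaces. Punctuation stays attached to its word, as it would with a plain split.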

Filename search with ElasticSearch

You have various problems with what you pasted:
1) Incorrect mapping
When creating the index, you specify:
    "mappings": { "files": {
But your type is actually file, not files. If you checked the mapping, you would see that immediately:
    curl -XGET 'http://127.0.0.1:9200/files/_mapping?pretty=1'
    # {
    #   "files" : {
    #     "files" : {
    #       "properties" : …