Updating Your Umbraco Lucene Indexes For Better Searchability

Comma-separated numerical values turn up throughout Umbraco’s Lucene indexes when you use the multi node tree picker for content and media, as well as a fair few other property editors out there.

A common issue is that you can’t search for individual values in those lists, because some Lucene analyzers don’t treat the comma as a separator between numerical values the way they do between alphanumerical ones. If the values were separated by a different character, like a space, we could easily search among them. Lucky for us, this is easy to achieve.

We need to hook into the GatheringNodeData event of the Examine index provider and add those numerical values to the index again, this time separated by spaces.

Here’s what my class looks like:

    using Examine;
    using Umbraco.Core;

    public class MyApplicationStartupEvents : ApplicationEventHandler
    {
        protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            // Subscribe to the event raised while a node's data is gathered for indexing.
            ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"].GatheringNodeData += ExternalIndexer_OnGatheringNodeData;
        }

        private void ExternalIndexer_OnGatheringNodeData(object sender, IndexingNodeDataEventArgs e)
        {
            if (e.Fields.ContainsKey("relatedBlogArticles"))
            {
                // Add a space-separated copy under a new key; the original field is left untouched.
                e.Fields["_relatedBlogArticles"] = e.Fields["relatedBlogArticles"].Replace(',', ' ');
            }
        }
    }

The meat of the code is the line that assigns e.Fields["_relatedBlogArticles"]. The relatedBlogArticles field is a comma-separated list of content node IDs. We’re replacing those commas with spaces and adding the result to the index under a different key name; in this example, it’s the same key prefixed with an underscore. I could have just overwritten the original value in the index, but I usually don’t like to tinker with originals.
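If the stored value might contain stray whitespace or empty entries, the replacement could be pulled into a small helper. This is only a sketch: IndexValueHelper and CommasToSpaces are hypothetical names, not part of Umbraco or Examine, and the node IDs in the comment are made up.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Umbraco or Examine.
public static class IndexValueHelper
{
    // Converts "1070,1085, 1102" into "1070 1085 1102".
    // Empty entries and surrounding whitespace are dropped.
    public static string CommasToSpaces(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return string.Empty;
        }

        var parts = value
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0);

        return string.Join(" ", parts);
    }
}
```

Inside the event handler, the assignment would then read e.Fields["_relatedBlogArticles"] = IndexValueHelper.CommasToSpaces(e.Fields["relatedBlogArticles"]);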

Now, whenever the node is indexed, both the original value and the space-separated version are indexed. That means you can search for individual numerical values, in this case node IDs, in the new field’s value.
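To give an idea of what such a search might look like, here is a rough sketch using Examine’s fluent search API. It assumes the searcher paired with "ExternalIndexer" is named "ExternalSearcher" (check your ExamineSettings.config), and RelatedArticleSearch and FindNodesReferencing are hypothetical names made up for this example. It needs a running Umbraco site with an index that has been rebuilt since the handler was added.

```csharp
using System.Collections.Generic;
using System.Linq;
using Examine;

// Hypothetical helper class for illustration only.
public static class RelatedArticleSearch
{
    // Returns the IDs of indexed nodes whose relatedBlogArticles picker
    // contains the given article node ID.
    public static IEnumerable<int> FindNodesReferencing(int articleId)
    {
        // Assumption: "ExternalSearcher" is the searcher configured for the external index.
        var searcher = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];

        // Query the space-separated copy of the field, not the original.
        var criteria = searcher.CreateSearchCriteria();
        var query = criteria.Field("_relatedBlogArticles", articleId.ToString()).Compile();

        return searcher.Search(query).Select(result => result.Id).ToList();
    }
}
```

Searching the original relatedBlogArticles field the same way would likely miss matches, since the comma-joined IDs may be indexed as a single term.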

If you haven’t already, you should take some time to read up on Umbraco startup events.