A brief introduction to adding full-text searching to ASP.NET projects and why I'm using Lucene.NET

27. April 2013 by Alex

This post is part of a multi-part series on adding flexible search to an ASP.NET MVC 4 project. You can find all of the articles in this series here.

Build Keeper.NET is a Maven-like tool for .NET. It's a component and dependency management solution, targeted at post-build analysis and documentation. Some of my customers have a very large number of builds with an even greater number of change logs and migrations documented. It's very easy to navigate to a build if you already know what you're looking for. Want to find Build 220.127.116.11 of Product X? Click, click, done. But the data becomes a little unwieldy when you don't know the version of the particular build you're looking for, especially when your build count is in the thousands.

Enter searching. Users need to quickly find builds by the contents of documentation such as release notes and migration steps: "I need to know all of the builds that fix SQL Error Code 1205" or "What release did we tell the customer to delete the cache folder?" And so, Build Keeper needs full-text searching.

It would be awesome to let Google solve this problem for me, but since Build Keeper is usually installed behind a corporate firewall, that solution is out of the question. I need to solve this search problem in-app.

Like... a bad way to do search, man

It would be very simple to offload the search algorithm to the database by using a SELECT statement with a LIKE clause:

    SELECT * FROM Products WHERE Description LIKE '%<search term here>%'

I've used LIKE clauses in some simple projects in the past and they worked well enough for small data sets, but the approach has a number of problems. First, it executes slowly: the leading wildcard means DB indexes can't be utilized for this query, so the entire text value has to be scanned for a match. Every record will have to be checked, and this query will probably introduce a table lock.
With many nvarchar(max) values to scan, this will be a very slow, blocking query. Secondly, LIKE clauses use literal matching, which isn't very flexible. Specifically, literal matching doesn't support word stemming (a process that finds "secretaries" as a possible match for "secretary"), stop words (ignoring "a", "is", "the" and similar words), or hit-ranking (ordering the results by how likely they are to meet the searcher's needs). These aren't necessarily features that I need today, but I can see the possibility of their addition in the future. (Citations: http://stackoverflow.com/questions/14026286/like-query-is-locking-the-table and http://www.grok.in/notes/full-text-search/)

A cooler way to search - Lucene.NET

Enter Lucene.NET! Lucene is a very powerful Java search library and Lucene.NET is a .NET port. From their website:

Lucene.Net has three primary goals:
1. Maintain the existing line-by-line port from Java to C#, fully automating and commoditizing the process such that the project can easily synchronize with the Java Lucene release schedule;
2. Maintaining the high-performance requirements expected of a first class C# search engine library;
3. Maximize usability and power when used within the .NET runtime. To that end, it will present a highly idiomatic, carefully tailored API that takes advantage of many of the special features of the .NET runtime.

The first goal is a "line-by-line port from Java to C#." This means that the API generally follows Java coding conventions, which (for a .NET developer) can make the library feel a little wonky in some places. The complexity of the problem domain also compounds the confusion. However, I like Lucene because it is *fast* and very flexible. Lucene supports hit-ranking out of the box and it can be extended to support word stemming. And it supports filtering out stop words such as "a", "but", and "was".
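
To make the difference between literal matching and a search library's text analysis concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the sample release notes are invented, SQLite stands in for the real database, and the `naive_stem`, `analyze`, and `ranked_search` helpers are hypothetical toys. This is not how Lucene implements stemming or scoring (its analyzers and ranking are far more sophisticated); it only shows the three ideas named above: stop words, stemming, and hit-ranking.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical sample data; the release-note text is invented for illustration.
BUILDS = {
    "1.0.1": "Fixes SQL Error Code 1205 when the importer deadlocks.",
    "1.0.2": "The secretaries report now loads faster from cache.",
    "1.1.0": "Delete the cache folder before upgrading. The cache is rebuilt on start.",
}

# A tiny stop-word list; real analyzers ship much longer, per-language lists.
STOP_WORDS = {"a", "an", "is", "the", "but", "was"}


def like_search(term):
    """Literal substring match, the way a LIKE '%term%' clause behaves."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for the real database
    conn.execute("CREATE TABLE Builds (Version TEXT, Notes TEXT)")
    conn.executemany("INSERT INTO Builds VALUES (?, ?)", BUILDS.items())
    rows = conn.execute(
        "SELECT Version FROM Builds WHERE Notes LIKE ? ORDER BY Version",
        ("%" + term + "%",),
    ).fetchall()
    conn.close()
    return [version for (version,) in rows]


def naive_stem(word):
    """Crude suffix stripping; real stemmers (e.g. Porter) are far more careful."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def analyze(text):
    """Lowercase, split into tokens, drop stop words, stem what is left."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(token) for token in tokens if token not in STOP_WORDS]


def ranked_search(query):
    """Score each build by how often the analyzed query terms occur in its notes."""
    query_terms = analyze(query)
    scores = {}
    for version, notes in BUILDS.items():
        counts = Counter(analyze(notes))
        score = sum(counts[term] for term in query_terms)
        if score:
            scores[version] = score
    # Highest score first; ties are broken by version string.
    return sorted(scores, key=lambda version: (-scores[version], version))
```

With this sample data, `like_search("secretary")` returns an empty list because "secretary" is not a literal substring of "secretaries", while `ranked_search("secretary")` returns `["1.0.2"]`. Likewise, `ranked_search("the cache")` ignores "the" and puts build 1.1.0 ahead of 1.0.2 because its notes mention the cache twice.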

Searching for a conclusion

I'm not enabling stemming or stop words for Build Keeper because both of these optimizations introduce localization issues, and I don't know if these features are valuable enough to implement them in every language that I will eventually support. But I like that they're options in my pocket.

The biggest drawback with using Lucene is its complexity and learning curve. Luckily for you, I'm going to tell you how I added Lucene.NET to Build Keeper, which might make it a little easier for you to add it to your own projects.

End Side A. Please flip over the tape and play Side B.