Improved search

by michal frackowiak on 10 Nov 2011 13:30


Searching a wiki is critical, especially for large sites. Although we encourage you to structure your content with categories and parent-child page relations, and to use the ListPages module for navigation, nothing can replace the good old Search button.

Although searching seems really simple from the user's perspective, providing enough infrastructure to power it is a real challenge. Currently we need to handle over 100,000 searches per day and index 11,000,000 pages and forum threads spread over 400,000 sites. And we provide search not only on individual sites, but also Wikidot-wide.

When Wikidot was quite small (you know, those early days when we had no more than 10,000 users altogether), we were using TSearch2, which added search functionality to the SQL database itself. And it was working fine.

But data grew and pretty soon we had to separate the search engine from the database. Gabrys programmed a layer between our Wikidot application and Lucene (an awesome search library). This solution worked like a charm.

But Wikidot doubles the size of its content every few months, and search became a significant bottleneck. Moreover, on large sites, or when doing a Wikidot-wide search, it was pretty common to receive a "Search failed" error due to the search server being overloaded.

Enter elasticsearch. A few weeks ago we decided to completely redesign the search backend. It was clear that at this point we needed a robust, scalable and efficient solution. Elasticsearch convinced us with its minimal configuration, elegant design, built-in support for multiple servers and the fact that it is already used by large web projects. It took TeRq a while to integrate, but the results are awesome.

We switched Wikidot to elasticsearch last week. Since then there have been no more errors due to server overload. Most queries are returned within 200 milliseconds. There are still a few issues we are working on (e.g. improving language analyzers, redesigning the Wikidot-wide search), but so far elasticsearch really helps us power your wikis properly. There were some painful moments, but we are really glad we figured out how to deal with searching.

And while performance and reliability have been vastly improved, we still keep the same set of search features.
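
For the curious, here is a rough sketch of how a single-site search and a Wikidot-wide search could share one query shape in Elasticsearch's JSON query DSL. This is not our actual code: the function name and the `content` and `site_id` field names are made up for illustration, and the `bool` query with a `filter` clause follows the current query DSL, which may differ from what a 2011 release offered.

```python
from typing import Any, Dict, Optional


def build_search_body(text: str,
                      site_id: Optional[int] = None,
                      size: int = 10) -> Dict[str, Any]:
    """Build a request body for Elasticsearch's _search endpoint.

    `content` and `site_id` are hypothetical field names, not the real
    Wikidot index mapping.
    """
    # Full-text match on the page or thread text.
    bool_query: Dict[str, Any] = {"must": [{"match": {"content": text}}]}

    # Restrict to one site when a site id is given; leaving the filter
    # out turns the same query into a search across all sites.
    if site_id is not None:
        bool_query["filter"] = [{"term": {"site_id": site_id}}]

    return {"query": {"bool": bool_query}, "size": size}


site_body = build_search_body("listpages module", site_id=42)
global_body = build_search_body("listpages module")
```

The only difference between the two bodies is the optional `filter` clause, so the same index could in principle serve both kinds of search.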

If you have any comments related to search, please share them with us.
