Improved search

by michal frackowiak
on 10 Nov 2011 13:30

Searching a wiki is critical, especially for large sites. Although we encourage structuring your content with categories, parent-child page relations and the ListPages module for navigation, nothing can replace the good old Search button.

Although searching seems really simple from the user's perspective, it is a real challenge to provide sufficient infrastructure to power it. Currently we need to handle over 100,000 searches per day and index 11,000,000 pages and forum threads spread over 400,000 sites. And we provide search functionality not only on individual sites, but also Wikidot-wide.

While Wikidot was quite small (you know, those early days when we had no more than 10,000 users altogether), we were using TSearch2 - which added search functionality to the SQL database itself. And it was working fine.

But the data grew, and pretty soon we had to separate the search engine from the database. Gabrys programmed a layer between our Wikidot application and Lucene (an awesome search library). This solution worked like a charm.

But Wikidot doubles the size of its content every few months, and search became a significant bottleneck. Moreover, on large sites, or when doing a Wikidot-wide search, it was pretty common to receive a "Search failed" error due to the search server being overloaded.

Enter elasticsearch. A few weeks ago we decided to completely redesign the search backend. It was clear that at this point we needed a robust, scalable and efficient solution. Elasticsearch convinced us with its minimal configuration, elegant design, built-in support for multiple servers and the fact that it is already used by large web projects. It took TeRq a while to integrate, but the results are awesome.

We switched Wikidot to elasticsearch last week. Since then there have been no more errors due to server overload. Most queries are returned within 200 milliseconds. There are still a few issues we are working on (e.g. improving language analyzers, redesigning the Wikidot-wide search), but so far elasticsearch really helps us power your wikis properly. There were some painful moments, but we are really glad we figured out how to deal with searching.
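
For readers curious what the difference between a single-site search and a Wikidot-wide search might look like at the query level, here is a minimal sketch. It is not our actual integration code: the field names `content` and `site_id` and the helper `build_search_body` are made up for illustration, and the request body uses the bool query with a filter clause as found in current Elasticsearch versions. It only builds the JSON body and does not talk to any server.

```python
import json
from typing import Any, Dict, Optional


def build_search_body(text: str,
                      site_id: Optional[int] = None,
                      limit: int = 10) -> Dict[str, Any]:
    """Build an Elasticsearch-style search request body.

    The field names "content" and "site_id" are hypothetical;
    a real index would use whatever names its mapping defines.
    """
    # Full-text match on the (assumed) page content field.
    bool_query: Dict[str, Any] = {"must": [{"match": {"content": text}}]}

    # Restrict to one site when a site id is given; leaving the
    # filter out corresponds to a search across all sites.
    if site_id is not None:
        bool_query["filter"] = [{"term": {"site_id": site_id}}]

    return {"query": {"bool": bool_query}, "size": limit}


if __name__ == "__main__":
    # Search within a single (hypothetical) site.
    print(json.dumps(build_search_body("lucene", site_id=42), indent=2))
    # Search across all sites.
    print(json.dumps(build_search_body("lucene"), indent=2))
```

The only difference between the two cases in this sketch is the presence of the filter clause, which narrows the candidate documents without affecting how matches are scored.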

And while the performance and reliability have been vastly improved, we still keep the same set of search features.

If you have any comments related to search, please share them with us.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License