
Searching a wiki is critical, especially for large sites. Although we encourage structuring your content using categories, parent-child page relations and using the ListPages module for navigation, nothing can replace the good old Search button.
Although searching seems really simple from the user's perspective, it is a real challenge to provide sufficient infrastructure to power it. Currently we need to handle over 100,000 searches per day and index 11,000,000 pages and forum threads spread over 400,000 sites. And not only do we provide search functionality on individual sites, but also Wikidot-wide.
While Wikidot was quite small (you know, these early days when we had no more than 10,000 users altogether), we were using TSearch2 - which added search functionality to the SQL database itself. And it was working fine.
But data grew and pretty soon we had to separate the search engine from the database. Gabrys programmed a layer between our Wikidot application and Lucene (an awesome search library). This solution worked like a charm.
But Wikidot doubles the size of its content every few months, and search became a significant bottleneck. Moreover, on large sites, or when doing a Wikidot-wide search, it was pretty common to receive a "Search failed" error due to the search server being overloaded.
Enter elasticsearch. A few weeks ago we decided to completely redesign the search backend. It was clear that at this point we needed a robust, scalable and efficient solution. Elasticsearch convinced us with its minimal configuration, elegant design, built-in support for multiple servers and the fact that it is already used by large web projects. It took TeRq a while to integrate, but the results are awesome.
We switched Wikidot to elasticsearch last week. Since then there have been no more errors due to server overload. Most queries are returned within 200 milliseconds. There are still a few issues we are working on (e.g. improving language analyzers, redesigning the Wikidot-wide search), but so far elasticsearch really helps us power your wikis properly. There were some painful moments, but we are really glad we figured out how to deal with searching.
And while performance and reliability have been vastly improved, we still keep the same set of search features.
If you have any comments related to search, please share them with us.
Searching has certainly been a lot quicker since the change to elasticsearch - and no failures either!
Rob Elliott - Strathpeffer, Scotland - Wikidot first line support & community admin team.
Yes, I can now search for my threads and posts very quickly (author:xyz). It no longer takes as long as before!
Service is my success. My webtips:www.blender.org (Open source), Wikidot-Handbook.
You can ask questions and contribute in the German-speaking » User-Gemeinschaft für WikidotNutzer or
in the German » Wikidot Handbuch.
Hi there. Wikidot (blog.wikidot.com, wikidot.com, my own sites) was down for a couple minutes just now. What gives?
"Searching a wiki is critical, especially for large sites. Although we encourage structuring your content using categories, parent-child page relations and using the ListPages module for navigation, nothing can replace the good old Search button."
I actually agree 100% with the first and last sentences. As my sites grow, I have come to depend more on the "Search" button for quickly locating a particular page or set of pages. Content structure provides for sensible and coherent navigation once you are on a page, but good "search" capabilities are becoming increasingly important for efficient access to contents overall. This improved search is a welcome step in this regard.
Is there any prospect of a search module now, so that admins can shape the way the search results are presented to their users? And a search on form field content?
A - S I M P L E - P L A N by ARTiZEN a startingpoint for simple wikidot solutions.
Now we can safely think about extending search, yes. Would adding a search parameter to a ListPages module solve your first problem? Custom search on form fields is still tricky, but we are thinking about it.
Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me
Yes, this is really a very good idea!
If I can search for
But as a parameter of a search argument such a
Service is my success. My webtips:www.blender.org (Open source), Wikidot-Handbook.
You can ask questions and contribute in the German-speaking » User-Gemeinschaft für WikidotNutzer or
in the German » Wikidot Handbuch.
@Michal: Can you please elaborate? How would ListPages x Search work?
Kenneth Tsang (@jxeeno)
You could shape the "results" page with ListPages, which you could configure to accept the search term from the URL, which in turn is set by the search form module. Nothing fancy, but a good starting point.
Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me
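The flow suggested above (the search form puts the term in the URL, and the results page reads it back and lists matching pages) can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Wikidot implements it: the `/page/key/value` path layout, the parameter name `q`, and the helpers `search_term_from_path` and `list_pages` are all hypothetical.

```python
from urllib.parse import unquote


def search_term_from_path(path, key="q"):
    """Return the value that follows `key` in a /page/key/value style path, or None.

    The path layout and the default key name are assumptions for illustration.
    """
    parts = [unquote(p) for p in path.strip("/").split("/")]
    # parts[0] is the page name; the remainder is read as key/value pairs.
    params = dict(zip(parts[1::2], parts[2::2]))
    return params.get(key)


def list_pages(pages, term):
    """Return titles of pages whose title or body contains `term` (case-insensitive)."""
    if not term:
        return []
    needle = term.lower()
    return [
        page["title"]
        for page in pages
        if needle in page["title"].lower() or needle in page["body"].lower()
    ]


# Made-up sample pages standing in for a site's content.
PAGES = [
    {"title": "Search backend", "body": "We moved to elasticsearch."},
    {"title": "Themes", "body": "CSS tips."},
]

term = search_term_from_path("/search-results/q/elasticsearch")
print(list_pages(PAGES, term))
```

In this sketch the results page never needs to know about the search form; it only depends on the URL, which is what would let an admin restyle the listing freely.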
That would be cool!
Kenneth Tsang (@jxeeno)
We have added more advanced lexical analyzers, which makes searching English sites much more effective. It takes singular/plural forms into account, abbreviations, verb forms etc.
Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me
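To give a rough idea of what taking singular/plural and verb forms into account means, here is a deliberately naive sketch of suffix stripping. It is not the analyzer Wikidot or elasticsearch actually uses; the rules and the helper names `naive_stem` and `matches` are assumptions made up for illustration, and real stemmers handle far more cases.

```python
# A toy suffix-stripping stemmer. The rule list is an illustrative assumption,
# not the set of rules used by any real analyzer.
SUFFIX_RULES = (("ies", "y"), ("ing", ""), ("ed", ""), ("s", ""))


def naive_stem(word):
    """Lowercase `word` and strip one common English suffix, if any applies."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        # Keep at least three characters of the stem to avoid mangling short words.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


def matches(query, text):
    """Return True if the stemmed query equals the stem of any word in `text`."""
    stems = {naive_stem(w) for w in text.split()}
    return naive_stem(query) in stems


print(matches("wiki", "two wikis were searched"))
print(matches("searching", "two wikis were searched"))
```

Both the query and the indexed text go through the same normalization, which is why a search for "searching" can find a page that only contains "searched".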
Is there a possibility to add a filter that would return results in Cyrillic even if the word typed in the search box is in Latin script? This would make my search much easier, since people usually search using Latin script, so I had to put up this blinking notice that they HAVE TO search in Cyrillic, which is, for most of them, pretty boring, because all Serbian sites have this possibility and they are not used to changing the keyboard layout just to search a site…
The other very important thing is, as the others already mentioned, to be able to decide which categories will appear in search results. Private categories are private because we don't want everybody to know about them, and if they appear in search results, they are not so private anymore, even though access to them is restricted… :) Now, this problem is solved with Tsangk's csi module… (Really, excellent job, Tsangk!) But it would be nice if we had this possibility directly through the Wikidot search module…
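The kind of filter requested here could, in principle, transliterate a Latin-script query into Serbian Cyrillic before it is sent to the search backend. The sketch below is only an illustration of that idea, not an existing Wikidot feature; the function name `latin_to_cyrillic` is hypothetical. It lowercases the query and treats every `lj`, `nj` and `dž` as a digraph, which is wrong for the few words where those letters are pronounced separately.

```python
# Serbian Latin digraphs must be checked before single letters.
DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ"}

LETTERS = {
    "a": "а", "b": "б", "c": "ц", "č": "ч", "ć": "ћ", "d": "д", "đ": "ђ",
    "e": "е", "f": "ф", "g": "г", "h": "х", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р", "s": "с",
    "š": "ш", "t": "т", "u": "у", "v": "в", "z": "з", "ž": "ж",
}


def latin_to_cyrillic(text):
    """Transliterate a Serbian Latin query to lowercase Cyrillic.

    Characters without a mapping (digits, spaces, punctuation) pass through unchanged.
    """
    lowered = text.lower()
    out = []
    i = 0
    while i < len(lowered):
        pair = lowered[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            ch = lowered[i]
            out.append(LETTERS.get(ch, ch))
            i += 1
    return "".join(out)


print(latin_to_cyrillic("Beograd"))
```

A search box could run the original query and its transliteration side by side, so that visitors typing in either script get the same results.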
Another important thing is to be able to determine the categories that shouldn't be indexed. Right now, I think I can do that per page or per site, and both are not acceptable. I would like to be able to decide which category can be indexed by search engines, and which ones should be skipped.
If slaughterhouses had glass walls, everyone would be vegan. - Paul McCartney
Kudos for this change guys. It's clear you guys can live without me ;-).
From an infrastructural point of view, it's a nice step forward, since you host one service fewer. It adds a dependency on one external service though, which is quite acceptable these days (in fact, for many people Wikidot is such a dependency, and it seems to work well for them :-) ).
Piotr Gabryjeluk
visit my blog
Please excuse me if this is not the best place for these questions but I still find the multitude of mini sites within wikidot very confusing. I've only just started using search seriously in one of my sites but I am seeing a couple of serious problems. I've no idea if they are connected to the change of underlying engine or not but I do not recall noticing them previously.
1) As there is no way of specifying which pages or categories should be indexed or searched, my searches are returning a lot of pages that my users should not even be aware of. This includes: my css theme page, my navigation menus, some "include" snippets that are used on several pages and others. These are all kept in separate categories but I can't find any way of hiding them from search.
2) The search results even include pages that the current user should have no access to whatsoever. Even worse, they include a brief extract from such pages. This is a serious security flaw.
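Both points above come down to filtering hits before they are shown. A minimal sketch of that idea follows; it is purely illustrative and says nothing about how Wikidot search works internally. The result layout, the `hidden_categories` set and the `user_can_view` callback are hypothetical names invented for this example.

```python
def visible_results(results, hidden_categories, user_can_view):
    """Drop hits from hidden categories and hits the current user may not view.

    `results` is a list of dicts with "title", "category" and "excerpt" keys
    (an assumed layout); `user_can_view` is a callable taking a category name.
    """
    return [
        hit
        for hit in results
        if hit["category"] not in hidden_categories
        and user_can_view(hit["category"])
    ]


# Made-up sample hits standing in for raw search results.
RESULTS = [
    {"title": "Welcome", "category": "_default", "excerpt": "Hello..."},
    {"title": "Theme", "category": "css", "excerpt": "body { ... }"},
    {"title": "Board minutes", "category": "private", "excerpt": "Agenda..."},
]

# Categories the admin never wants in results, and a toy permission check.
HIDDEN = {"css", "nav", "include"}
anonymous_can_view = lambda category: category != "private"

print([hit["title"] for hit in visible_results(RESULTS, HIDDEN, anonymous_can_view)])
```

Because filtered hits are removed entirely, their excerpts never reach the page, which addresses the extract leak described in point 2.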
This is the first site I've built that may eventually become a serious commercial site. As it is, the search facility makes it look very amateur. Am I missing something or does everyone have similar problems?
Sorry to whinge - I'm feeling frustrated.
I have 2 points on this wish:
What we need perhaps is a "setup" page for the "standard" search per site, where such pre-defined categories can be, er, "predefined"!
Edit: I mean a new section in the site manager for such site-wide predefined and hidden filters.
Service is my success. My webtips:www.blender.org (Open source), Wikidot-Handbook.
You can ask questions and contribute in the German-speaking » User-Gemeinschaft für WikidotNutzer or
in the German » Wikidot Handbuch.
Many thanks for your helpful comments. And, many thanks to Kenneth Tsang as well — I hadn't spotted the Advanced Search package before. I've taken a good look at how it is implemented and it's given me a good insight into how the search mechanism works. I'm now confident that, with a few small changes, this will solve all my problems.
However, it still feels like an uncomfortable compromise and I would love to see a better mechanism such as the ability to specify default search categories in site manager and/or a good merger between List Pages and Search that includes the ability to specify category filters.
I'm a lot happier now :-)
Thanks again, Bob
You are now able to define default search categories and enforce them (so that all search results will automatically be filtered to only show your defined category). This can be achieved by using my Advanced Search include:
[[include :csi:include:adv-search
|defaultCategory=category1,category2,category3
|enforceDefault=true
|showCategory=none]]
Kenneth Tsang (@jxeeno)
Wow, great work Kenneth!
Service is my success. My webtips:www.blender.org (Open source), Wikidot-Handbook.
You can ask questions and contribute in the German-speaking » User-Gemeinschaft für WikidotNutzer or
in the German » Wikidot Handbuch.
Yes, great stuff. I'll definitely use this now instead of my hacked about version. I've learned a lot about how wikidot searching works from all this so it's been a really good exercise even if it was a little frustrating at first. Sorry to all if I sounded angry in my first post - I was suffering from late night frustration. However, it looks like we've all got a really useful result so thanks guys for all your help.