About a year ago, in one of our December blog posts, we set a goal of keeping downtime under 5 minutes per month, and we have finally achieved it. To keep reliability this high, we continue to improve our code, infrastructure and monitoring services.
Over the last year we used many techniques and introduced a few software solutions that together made our service more reliable, available and stable for both anonymous and logged-in users.
As we described just after introducing the solution, we let Varnish cache pages for anonymous users. This means the application servers do less work and the service is less loaded overall.
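As a rough illustration of the idea (not our actual Varnish configuration), the decision of whether a request may be answered from the page cache can be sketched like this. The cookie name and the function name are hypothetical and chosen only for the example.

```python
from typing import Mapping

# Hypothetical name of the cookie that marks a logged-in user.
SESSION_COOKIE = "session_id"


def can_serve_from_cache(method: str, cookies: Mapping[str, str]) -> bool:
    """Return True when a request may be answered from the page cache."""
    # Only read-only requests are safe to answer from a cache.
    if method.upper() not in ("GET", "HEAD"):
        return False
    # A session cookie means a logged-in user, so the request
    # has to go through to the application servers.
    return SESSION_COOKIE not in cookies
```

An anonymous GET request passes this check and never reaches the application servers, while a request carrying the session cookie always does.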
The application server is the heart of our infrastructure. It processes all user requests and responds to them, so it is very important to keep its performance high. This is why we regularly extend our server inventory with new dedicated multi-threaded, multiprocessor servers with lots of memory and CPU power, so that PHP scripts can process more requests and run faster. We can sleep well because we have far more processing power than we need.
Fetching data from the database takes some time (even with the files kept on super-fast solid-state disks), so we do a lot of caching in our code. Recently we reviewed the parts of the code that generate the most database queries and applied robust Memcached-based caching to them. This keeps server response time low.
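The general pattern behind this kind of caching can be sketched as follows. This is a minimal sketch and not our production code: TTLCache is a small in-memory stand-in for a Memcached client, and cached_query, the key names and the time-to-live values are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a Memcached client (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries behave like a cache miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._store[key] = (self._clock() + ttl, value)


def cached_query(cache: TTLCache, key: str, ttl: float,
                 run_query: Callable[[], Any]) -> Any:
    """Return the cached value for key, running the query only on a miss.

    A result of None is treated as a miss and is never cached.
    """
    value = cache.get(key)
    if value is None:
        value = run_query()       # the expensive database call
        cache.set(key, value, ttl)
    return value
```

With this in place, repeated calls with the same key inside the time-to-live window reach the database only once; every later call is answered from the cache.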
Usually, before a problem appears, there are signs that it is coming, such as increased resource usage, a spike in incoming traffic, or server CPUs running at full power to generate pages for users. There is usually time to react before the problem affects users' sites. This is why we improved and extended our monitoring services.
We monitor not only server resources but also Wikidot actions, such as the number of accounts created, files uploaded or pages edited. If any of those values drops to 0, there is a serious problem. If they are way too high, it usually means spammers are trying to abuse Wikidot. In either case, monitoring helps keep the service available for legitimate (and paying) users.
We are also planning to move the database server around February to a faster machine with new database software: PostgreSQL 9, which greatly simplifies and improves database replication.
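The rule described above (zero means trouble, far too many means possible abuse) can be sketched in a few lines. The metric names, the limits and the status labels are made up for the example and do not reflect our real thresholds.

```python
from typing import Dict


def classify_activity(counts: Dict[str, int],
                      limits: Dict[str, int]) -> Dict[str, str]:
    """Label each activity counter for one monitoring interval."""
    status: Dict[str, str] = {}
    for name, count in counts.items():
        if count == 0:
            # Nobody could perform this action at all: likely an outage.
            status[name] = "possible outage"
        elif count > limits.get(name, float("inf")):
            # Far more activity than expected: likely spam or abuse.
            status[name] = "possible spam"
        else:
            status[name] = "ok"
    return status


# Example interval with hypothetical numbers.
report = classify_activity(
    {"accounts_created": 0, "files_uploaded": 500, "pages_edited": 120},
    {"accounts_created": 50, "files_uploaded": 200, "pages_edited": 1000},
)
```

Counters without a configured limit are never flagged as spam, only as a possible outage when they drop to zero.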
Good to know. With the large number of business and gaming sites hosted on your servers, I'm sure plenty of people appreciate the time and effort you all put into keeping things running.
Thanks :)
~ Leiger - Wikidot Community Admin - Volunteer
So recently, I developed a way to nest modules (such as the ListPages module) within each other. I used a new technique with a "hybrid module" (as named by Wikidot).
A few days ago, I fell into despair when Wikidot decided to "fix" a bug which allowed me to nest modules. My hours of work went down the drain. I was planning on releasing a Social Networking platform using Wikidot coding ONLY this weekend. I had also started building an easy blogging platform for Wikidot, but all my work got destroyed when the "bug fix" occurred.
Now several wikis have been broken. My hours of work (and many others' work) have gone down the drain. Without any additional programming by Wikidot, we could create nestable modules. This was a remarkable discovery that removes all limitations of modules!
The Nestable ListPages itself has enabled us to do many things never done before!
Those are just some of the possibilities for the NLP. I can imagine many complex apps achievable using pure Wikidot syntax!
I really wish Wikidot would reconsider the removal of the ability for users to nest ListPages. This would open doors to a whole range of apps and tools! PLEASE PLEASE!!!!
Kenneth Tsang (@jxeeno)
Yes, this new change in the Wikidot engine has also limited some exciting app developments I had on the way. Just to name a few of them:
Please undo the recent change made! It will only enhance Wikidot in ways that were previously unimagined.
The hybrid module is an internal way of smoothing the user experience (and cutting down response time). Its use by users was not documented, was never meant to work, and worked only by accident.
Piotr Gabryjeluk
So how about documenting it and allowing it to work on purpose? It seems that tsangk and James have some pretty good ideas brewing here.
Community Admin
It could be a security issue and therefore a pain to maintain.
Piotr Gabryjeluk
Well, can I propose a new wiki syntax for the hybrid module? Something like this:
You can restrict the modules and the parameters for the module. I don't want to write a whole wish and have it immediately rejected!
And I think the possibilities of a hybrid module outweigh all of the other issues you may put forward.
Kenneth Tsang (@jxeeno)
I second this notion.