Success often comes the hard way - Mac OS X Lion, RoaringApps and Wikidot problems
A while ago I blogged about RoaringApps, the large(st?) online collection of Mac software, which tells you which apps work with the new Mac OS X Lion and which don't. The site is awesome and we knew it would be a huge success, which also meant it would get a lot of traffic.
A few days ago, when it became clear that Lion's release was imminent, we tuned our servers, double-checked configurations and launched extra application backends to handle the expected extra traffic.
So yesterday was the day Lion was released.
And, as we expected, thousands of Mac users rushed to RoaringApps.com to check if their apps would work after the upgrade.
Although we estimated the excess traffic quite well, we did not take into account that RoaringApps is a very complex wiki, with very cool and useful features like Dock, a private application grid. These features, when exposed to thousands of users at once, were a bit too much.
And this is the reason some Wikidot sites were not responsive, some files were not available, or editing was not updating pages. In a hurry we tried to re-organize our servers to keep Wikidot performing well but, honestly, it was hard for us and some of the "improvements" did not really work since our whole cluster was overloaded.
The funny thing is that when the previous Mac OS X version, Snow Leopard, was released, we also survived a small cataclysm caused by enormous traffic to a similar website, http://snowleopard.wikidot.com/. But back then we were totally unprepared, which resulted in the longest outage in Wikidot history.
All Wikidot sites share some key infrastructure elements and often extra load on a few sites can make others perform worse too. Some issues are difficult to overcome, but we are doing our best to keep your sites safe and working!
Today we launched more backend servers and configured them properly to avoid any outages or performance losses. If there are any problems, we will try to respond immediately.