Success often comes the hard way - Mac OS X Lion, RoaringApps and Wikidot problems
A while ago I blogged about RoaringApps, the large(st?) online collection of Mac software, which tells you which apps work with the new Mac OS X Lion and which don't. The site is awesome and we knew it would be a huge success, which also meant it would get a lot of traffic.
A few days ago, when it was clear that Lion would be released within the next few days, we tuned our servers, double-checked configurations and launched extra application backends to handle the expected extra traffic.
So yesterday was the day Lion was released.
And, as we expected, thousands of Mac users rushed to RoaringApps.com to check if their apps would work after the upgrade.
Although we estimated the excess traffic quite well, we did not take into account that RoaringApps is a very complex wiki, with very cool and useful features like the Dock, a private application grid. These features, when exposed to thousands of users, were a bit too much.
And this is the reason some Wikidot sites were not responsive, some files were not available, or editing was not updating pages. In a hurry we tried to re-organize our servers to keep Wikidot performing well but, honestly, it was hard for us and some of the "improvements" did not really work since our whole cluster was overloaded.
The funny thing is that when the previous Mac OS X version, Snow Leopard, was released, we also survived a small cataclysm because of enormous traffic to a similar website, http://snowleopard.wikidot.com/. But back then we were totally unprepared, which resulted in the longest outage in Wikidot's history.
All Wikidot sites share some key infrastructure elements and often extra load on a few sites can make others perform worse too. Some issues are difficult to overcome, but we are doing our best to keep your sites safe and working!
Today we launched more backend servers and configured them properly to avoid any outages or performance losses. If there are any problems, we will try to respond immediately.
I definitely noticed a few delays! Well done on keeping Wikidot afloat.
Wikidot is becoming more well known as time goes on (I'm attempting to help promote it myself, as well). Eventually, people will understand that a "wiki" isn't just Wikipedia. It's a collaborative website, and Wikidot is the #1 best choice in the world when you need to build a wiki ;-)
~ Leiger - Wikidot Community Admin - Volunteer
Wikidot: Official Documentation | Wikidot Discord server | NEW: Wikiroo, backup tool (in development)
Once the USA was awake the performance became awful for several hours. But things are zipping along nicely today when I am sure there is still high traffic to roaringapps.
Rob Elliott - Strathpeffer, Scotland - Wikidot first line support & community admin team.
I was in a business meeting, trying to show a company my artwork and my site, The Icon Deposit, which they were interested in, when this was going on. When I tried to show them the site, it wasn't loading properly (which was making me look like an a** in front of this company). They contacted me because of my website in the first place, and when it came time to present it, it was really messing up on me!
CEO of Icon Deposit
Take a look at me via Twitter, Dribbble, and Google +
@MDesign: it is difficult to put into words how sorry I am about the whole thing. We realize that every minute someone depends on Wikidot for something important. And I know that every minute Wikidot is up and running we make someone's job possible, but when it is down we fail someone.
This indeed puts pressure on us, and in literally everything we do we put stability and reliability first, far ahead of implementing new features, bells and whistles.
I learned this back in the early days of Wikidot, when we had a longer outage. One of our clients (a working group in a bank) called and said they had an important meeting and needed access to their wiki really badly.
Now I also know what the problem with RoaringApps was that day, and I wish we had realized it in time. When users were adding apps to their docks (by adding their unique tag to the app page), as a side effect all cached ListPages that included pages from the app category were invalidated. This means that all tables and grids listing apps had to be regenerated very often, which, given the traffic, was creating the overload. Once the load passes a certain threshold, other things start failing and increase the load on the application backends even more, which makes the whole situation harder to diagnose.
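To make the effect described above concrete, here is a minimal toy sketch in Python. It is not Wikidot's actual code: the `CategoryListCache` class, the `simulate` function, and the "app" category key are all hypothetical names invented for illustration. It only assumes what the paragraph says, namely that tagging any page in a category throws away every cached list for that category.

```python
class CategoryListCache:
    """Toy cache of rendered lists, keyed by (category, list name). Hypothetical."""

    def __init__(self, render):
        self._render = render
        self._cache = {}
        self.regenerations = 0  # how many times a list had to be rebuilt

    def get(self, category, list_name):
        key = (category, list_name)
        if key not in self._cache:
            # Cache miss: the expensive rendering step runs again.
            self._cache[key] = self._render(category, list_name)
            self.regenerations += 1
        return self._cache[key]

    def invalidate_category(self, category):
        # Any change to a page in the category drops every cached list for it.
        for key in [k for k in self._cache if k[0] == category]:
            del self._cache[key]


def simulate(views, dock_add_every):
    """Serve `views` page views; every `dock_add_every`-th visitor also adds
    an app to their dock (0 means nobody does). Returns the rebuild count."""
    tags = {"some-app": set()}  # page name -> set of user tags (assumed model)

    def render(category, list_name):
        return sorted(tags)  # stand-in for building the real app grid

    cache = CategoryListCache(render)
    for i in range(views):
        cache.get("app", "all-apps-grid")
        if dock_add_every and i % dock_add_every == 0:
            tags["some-app"].add(f"user-{i}")   # the dock is just a tag on the page
            cache.invalidate_category("app")    # side effect: shared lists are dropped
    return cache.regenerations
```

In this toy model the number of rebuilds grows with the number of dock additions rather than staying constant, so a cache that normally absorbs most of the traffic stops helping exactly when the site is busiest.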
Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me
So does this mean that this problem has been fixed?
It's okay, Michal, I perfectly understand where you are coming from. I just hope I get a call back from this company. I'd love it if you could set up a profile on icondeposit.com; it's an up-and-coming designer/developer community. Maybe you can add some of your work? I'll give Wikidot an ad spot in one of the main categories if you set your profile up.
Thank you for the response,
~ Matt Gentile
CEO of Icon Deposit