Server problems on Sunday morning (night) - resolved

by michal-frackowiak on 11 Apr 2010 05:31


Earlier today, between 2 and 3 AM UTC, one of our main web servers started to behave strangely. Most of its backend (PHP) processes were hanging, and the remaining ones could not cope with the incoming traffic. Technical details aside, this took Wikidot (and the wikis it hosts) down, and the service was unavailable for about an hour.

Over the last half year, Wikidot.com was up more than 99.90% of the time, with an exceptional 99.994% uptime in February and 99.90% in March. Our monitoring and early-alert system prevents most incidents before they happen, so a situation like this one is really rare.

I am personally sorry about the incident, especially since performance and availability are our priorities.

The only excuse we have is that it happened very early on Sunday morning (still night), which is when we usually do not work and need to rely on our automated monitoring and notification system. We will certainly analyze this case and draw conclusions to prevent similar incidents in the future.

Michał Frąckowiak, CEO


Image by BY-YOUR-⌘
