Last night Wikidot.com went down for a little over two hours, between 2:10 AM and 4:30 AM CEST. We're still analyzing the logs to see what happened. As far as we can tell it was not an external attack, but rather a fault in our own system. Our apologies to anyone who was trying to use Wikidot.com at the time. I'll have an explanation later.
The reason for the 2h downtime was that our monitoring service took two hours to notice something was wrong. This is also a first, and we'll be adding a second independent monitoring service to avoid that in the future.
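To illustrate the idea of a second, independent monitor, here is a minimal sketch of an HTTP probe plus a rule that alerts as soon as any one monitor sees the site down. It is an assumption for illustration only, not Wikidot's actual monitoring setup; the function names and the monitor labels are hypothetical.

```python
import urllib.error
import urllib.request


def site_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts and HTTP error statuses all count as "down".
        return False


def should_alert(results: dict) -> bool:
    """Alert if ANY monitor reports the site down.

    `results` maps a (hypothetical) monitor name to its latest up/down reading.
    Requiring only one "down" means a blind or broken monitor cannot mask an
    outage that the other one sees.
    """
    return any(not ok for ok in results.values())
```

Each monitor would run `site_is_up` from a different network, and the readings would be combined as `should_alert({"primary": True, "secondary": False})`.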
Wikidot.com is seeing unusual loads and we're still investigating what happened last night. There is a risk of the service getting too slow, and/or crashing over the next hour or so, but we're working to prevent that. I'll post news as we get it.
Portfolio
To be more precise, our primary monitoring service was working fine, but a couple of SMSes are not always enough to wake us up at night. Situations like this are really rare, but indeed, we must do something about it.
Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me
False alarm. It's the database - Postgres - doing something called "autovacuum", which cleans up deleted rows and makes sure things are nice and tidy. The trouble is, it also makes things nice and slow, for hours on end. This kind of database maintenance should not happen during working hours, and from now on, won't.
Since this is a global site, the concept of "working hours" is troublesome. Your late at night is our early evening (a terrible time to be down).
I'm generally happy with Wikidot and plan to renew my account when my first year runs out, but the level of communication still doesn't make me feel good. I'm particularly disappointed in the failure of the Pro site.
Most of our users come from the USA, so "working hours" in this case means until about midnight US east-coast time (I need to get the exact hours but there is a definite space of 3-4 hours where the service is not much used).
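As a rough sketch of how heavy database maintenance could be restricted to that quiet period, the check below tests whether a given moment falls inside a daily window. The window hours are placeholders, since the exact hours are not stated here, and the function name is hypothetical.

```python
from datetime import datetime, time, timezone

# Placeholder quiet window in UTC. The real hours are NOT known from the post;
# these values are assumptions for illustration only.
WINDOW_START = time(5, 0)
WINDOW_END = time(8, 30)


def in_quiet_window(now: datetime,
                    start: time = WINDOW_START,
                    end: time = WINDOW_END) -> bool:
    """Return True if the timezone-aware `now` falls inside [start, end) in UTC."""
    t = now.astimezone(timezone.utc).time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 22:00 to 02:00.
    return t >= start or t < end
```

A maintenance job would call `in_quiet_window(datetime.now(timezone.utc))` before starting and skip the run if it returns False.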
The pro site… was an interesting experiment but mostly ended up as a place for good ideas to die. That was a shame but we learned some important lessons. First, splitting the community into "free" and "pro" users is not constructive. Second, there is little point in asking for ideas and suggestions and bug reports if we can't handle the volume of work. Indeed, it's counterproductive.
However, I'd not call this a failure: the pro site is a valuable resource and we're using it as the basis for a lot of our ongoing design work.
Suggestions on where we can improve our communications are always welcome. For me, the main needs are a single reliable source of news and channel for feedback - which this blog is meant to become - and a single place to report issues and ask for help - which the new support.wikidot.com site will eventually become.
We'll always have lots of activity on lots of sites, and watching all of these will eventually become difficult, but we'll work to make this easier, and to make notifications work better.
Personally, I'm not going to use another web platform for communications: it's Wikidot.com, period.
I'm with mattdm on this. The pro site is a complete failure, as it was advertised as a communication tool between Pros and devs. I think even you will have to agree that it was a bit one-sided.
Pieter. I really do applaud what you are trying to do, and at the moment, appear to be doing. I really hope you keep it up.
Time will tell.
To be honest, I just realized that even though I'm a member of the pro site, I was not getting notifications. I've started watching that site and added a Watchers module to the nav:side, same as on the community site.
@Phil, we'll see how it goes. One step at a time. I've also put pages up onto the pro site, and seen them ignored. Obviously after a month or two, I think "stuff it" and just complain to myself when I have to click seven times to delete a file, or when I realize, again, that backups don't save tags, and so on.
But at the same time, if I had to go back to making web sites without Wikidot, warts and all, I'd probably just abandon the Internet and grow potatoes or something.
Why don't you have, like, 2 servers, so that when communications go down on one, all Wikidot traffic is sent to the other, and vice versa? That way, you could make a change, or do something on one server, without it affecting the other.
Wikidot already uses multiple servers but it's not as good an architecture as we'd like. We will be building in more redundancy over time but there are still potential bottlenecks, principally the database. Right now the solution to that is to scale up the database backend and continue to improve and optimise database accesses.
I could not agree more.
And I notice that you have created a space especially for rants. Thank you for that. I hope I'm not the only one that will post on it. :-)
So even though I am vocally critical of Wikidot, I have to admit it's still the best. (And believe me when I say I have been looking for a replacement!)
I look forward to it being so much better.