Honestly, the last few weeks have been really difficult for our team, mainly because of the performance of our servers and the Wikidot service itself. Performance-related tickets were opened on our Feedback site: some users could not see their own posts, others had their messages posted twice, and tags were taking ages to change on some sites. Moreover, activities were not updated in real time for some users, and searching sites threw errors from time to time.
Such problems are not "deadly", but they are time-consuming to fix and take a lot of effort to diagnose and solve.
When things are good, they are good. There are weeks or even months when there are literally no issues with our infrastructure. But when things start to break, they break all at once, or at least appear to. Some problems can create chain reactions, impacting other elements of the system.
Our first bottleneck was the database that stores user activities. When its performance dropped, it affected other components of Wikidot: queries were timing out, load on some servers increased, and available resources became limited.
One day we discovered locks in our primary database, most likely caused by concurrent attempts to update the same content. Some pages (or tags on certain sites) could not be edited until we manually released the blocking database queries. This was strange, and we believe it came from our web workers (PHP), which started dying without releasing resources and closing database transactions.
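To illustrate the kind of failure described above, here is a minimal sketch of a transaction wrapper that always rolls back when the work inside it fails, so locks are not left behind. It is written in Python with an in-memory SQLite database purely for illustration; it is not Wikidot's code, and the `transaction` helper, the `page_tags` table, and the simulated failure are all hypothetical.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit on success; roll back on any error so locks are released."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()  # release locks held by the failed unit of work
        raise


def demo():
    # Hypothetical table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_tags (page TEXT, tag TEXT)")
    conn.commit()
    try:
        with transaction(conn):
            conn.execute("INSERT INTO page_tags VALUES ('start', 'news')")
            raise RuntimeError("simulated worker failure")
    except RuntimeError:
        pass
    rows = conn.execute("SELECT COUNT(*) FROM page_tags").fetchone()[0]
    # No transaction is left open, and the partial write is gone.
    return conn.in_transaction, rows
```

A wrapper like this only helps when the worker fails with an exception. A process that is killed outright never reaches the rollback, so that case still needs a server-side timeout or manual cleanup.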
I believe we have resolved most of the performance issues now. Search still fails from time to time: availability is around 99.7% across all sites right now, which means there might be a few minutes per day when you get errors when searching sites.
We have also tuned the low-level server configuration (TCP stack) to improve performance under high load and traffic.
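As a rough check of what 99.7% means in practice, the arithmetic can be sketched as follows. This assumes failures are spread evenly over the day, which is a simplification, and the function name is just for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_minutes_per_day(availability_percent: float) -> float:
    """Expected minutes per day of failures at a given availability."""
    return (100.0 - availability_percent) / 100.0 * MINUTES_PER_DAY


# 0.3% of 1440 minutes is roughly 4.3 minutes per day.
print(round(downtime_minutes_per_day(99.7), 2))
```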
To make things even more bizarre, a medium-size botnet launched a DDoS attack on Wikidot yesterday, opening thousands of connections to our servers in the hope of bringing our websites down. Just when we thought we had all performance issues fixed :-/
We have no idea what the real target was; probably one of the sites we host. Fortunately, this cost us no more than 2 minutes of unavailability of one of our web servers. We were surprised to find out that the attack originated in countries like Kazakhstan, India, Qatar, Serbia, Indonesia, the Russian Federation, Belarus, Turkey, and Kyrgyzstan.
As you can see, during the last few weeks our whole team has been focused on improving performance and fixing issues so that nothing would affect the sites we host or the way you use Wikidot. Now that most of the issues are fixed, we will take a short breath and get back to work on other pending improvements.
BTW: Wikidot T-shirts and other gadgets should be available soon! Stay tuned!
Thank you for your efforts, guys. I am glad to hear you are back on top of things, and able to take a breath or two. Looking forward to seeing the gift shop up and running.
Wayne Eddy
Melbourne, Australia
LGAM Knowledge Base
Contact via Google+
Glad everything is okay. I knew some things were going wrong but I wasn't worried :)
As I write this, Wikidot seems to have crashed badly for us in the last hour, disappearing off the web altogether according to Google for about 15 minutes or so. It has come back just now, but the site looks wonky, and images and custom CSS are not loading yet. Hope you didn't suffer another attack, and I hope all my sites' functions, code and images are restored soon.
And yes, we have noticed problems recently; glad to hear you're onto it.
We have faith that it will be all sorted out and back to normal soon.
Regards
Ye Olde
Ye Olde - Creator and Chief Admin of www.music-industrapedia.com (Global Music Industry Directory & Encyclopedia) hosted on Wikidot.
That was quick! It all seems to be back online now. All restored! Thanks Wikidot team!
Ye Olde - Creator and Chief Admin of www.music-industrapedia.com (Global Music Industry Directory & Encyclopedia) hosted on Wikidot.
Yup! It looks like it's working fine again :)
Kenneth Tsang (@jxeeno)
I think there might have been one or two short mentions of performance issues in that blog post? I'm not entirely sure, because all of my attention is now focused on this:
Awesome! :D
P.S. I don't need to say this (it is already well-known), but I will anyway - we all appreciate the work you do at Wikidot ;-) Thanks!
~ Leiger - Wikidot Community Admin - Volunteer
Wikidot: Official Documentation | Wikidot Discord server | NEW: Wikiroo, backup tool (in development)
During the weekend we experienced at least two other DDoS attacks on Wikidot. We have added a few protection layers, but there were indeed periods when Wikidot might have been temporarily unavailable for some of you. We are still working on it.
I doubt the person(s) responsible for these attacks read this blog, but let me be clear: this is criminal activity and pointless at the same time.
Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me
It was me!!! And I'll do it again and again! Nothing can stop me! Muahahahahahaha!!!
No, not really, I ♥ Wikidot =P
Do you know if our sites were affected in traffic rank in any way?
CEO of Icon Deposit
Take a look at me via Twitter, Dribbble, and Google +
Now website speed is even more important, especially for mobile devices.
My article about healthy tea for immunity
My fotoblog Roman Best Regards