Faster, more reliable, and eco-friendly too!

From the desk of pieterh
Via the what's-fast-green-and-ends-in-a-dot department, on 28 October 2009, 10:57 UTC

This morning we finished an upgrade of Wikidot.com's database hardware to new servers that are faster than ever. This was so boring and uneventful that there was not a blip on the radar. All you will see is that Wikidot.com is now faster, that complex pages return faster, and that even under much heavier load, Wikidot.com will continue to run Fast & Smooth.

But under the covers, the story of this upgrade is fun. The database used to run on fast RAID systems of spinning magnetic hard drives (bunches of fast hard drives grouped together to be faster, and to handle failures).

Now, Wikidot.com's primary database sits on a RAID cluster of top-end Intel solid-state drives (SSDs). These are expensive but much faster than spinning rust, and much more reliable. We still have live replication slaves on spinning rust. SSDs seem to be everywhere: two out of three of my notebooks use them. Well, you can now boast that your web site is running on pure silicon! No more moving parts, except the cooling fan.

Thanks to Michal and his team of database experts, who planned this move carefully over the last week, the migration to the new primary database was invisible and we did not have to bring down Wikidot.com for even a second. Nice work, guys!

The SSDs are Intel X25-E Extreme drives, among the fastest SSDs out there. We're seeing a 7x improvement in database response time, and a 3x improvement in page rendering time. Pages that use ListPages a lot (like this blog, with its scrollbar) improve even more.

As well as giving us more reliability and speed, SSDs also use less power. One of the less obvious, but real advantages of putting your website at Wikidot.com is that it uses less power than a separately hosted server. This is good for the environment as well as being so much easier.

Hope you enjoy the new faster Wikidot.com!

Comments: 13

tsangk, 28 Oct 2009, 11:16 UTC

Excellent! My sites load faster now!
Are there any more PLANNED upgrades?


First Wikidot Wiki with a Chinese Domain! -> http://曾勁驊.info.tm/ or http://kenneth.wikidot.com

Out of your site limit? I've got 996 sites left. Ask me and I'll create one for you! Just PM me and let me know.^

^If you require an Iron Giant Template cloned site, please tell me that too!

pieterh, 28 Oct 2009, 11:38 UTC

More planned upgrades, yes. Moving to static HTML caching for anonymous users, and more front end servers to handle the page rendering load. We do these slowly and carefully so don't hold your breath :-)

rhombus p, 28 Oct 2009, 13:23 UTC

"Moving to static HTML caching for anonymous users"

What is it now, and what's the difference?

Great Job guys!!!


RPLOGO.png
last edited on 28 Oct 2009, 14:39 UTC by rhombus p

pieterh, 28 Oct 2009, 13:32 UTC

There is caching now, but it still requires some application overhead. That is, fetching a cached page means hitting the application server edge. With static HTML caching, the application servers do not even see the request, which gets served by a machine running just a fast web server and serving HTML files. It's 10-100x faster and eliminates 60% or more of the load on the app servers.

We tried this when we had the Snow Leopard incident in August and have been slowly making it work. It would be for anonymous users only; the HTML pages then get updated in the background as the real pages change.
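To make the difference concrete, here is a minimal Python sketch of the idea. It is not Wikidot's actual implementation: `render_page`, `StaticCache`, and the file layout are hypothetical names invented for illustration. Anonymous requests are answered from a plain HTML file when one exists, logged-in requests always go to the application, and a background step rewrites the file after the real page changes.

```python
from pathlib import Path


def render_page(name: str) -> str:
    """Stand-in for the application server's (expensive) page rendering."""
    return f"<html><body><h1>{name}</h1></body></html>"


class StaticCache:
    """Hypothetical static HTML cache kept in front of the app servers."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def path_for(self, name: str) -> Path:
        return self.root / f"{name}.html"

    def publish(self, name: str) -> None:
        """Background step: rewrite the HTML file after the real page changes."""
        self.path_for(name).write_text(render_page(name), encoding="utf-8")

    def serve(self, name: str, logged_in: bool) -> tuple:
        """Return (html, source), where source is 'static' or 'app'."""
        cached = self.path_for(name)
        if not logged_in and cached.exists():
            # Anonymous visitor and a file exists: the app never sees the request.
            return cached.read_text(encoding="utf-8"), "static"
        # Logged-in user, or no file yet: fall through to the application.
        return render_page(name), "app"


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        cache = StaticCache(Path(tmp))
        cache.publish("start")
        print(cache.serve("start", logged_in=False)[1])  # static
        print(cache.serve("start", logged_in=True)[1])   # app
```

In a real deployment the `serve` decision would be made by the web server itself rather than by Python code, which is where the speedup comes from.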

rhombus p, 28 Oct 2009, 13:48 UTC

oh okay, cool!


RPLOGO.png
last edited on 28 Oct 2009, 14:20 UTC by rhombus p

pieterh, 30 Oct 2009, 11:20 UTC

Wow… that signature banner is… loud. :-)

Wiki Wealth, 28 Oct 2009, 12:48 UTC

I've noticed a big improvement. Amazing job!

EricT, 28 Oct 2009, 17:24 UTC

I've noticed it as well. Thanks very much.

So, layman's terms with how this transfer works?
leiger, 28 Oct 2009, 22:27 UTC

Just wondering how it's possible to change servers without downtime.

I'm assuming:

  1. Backup current server to the new server, so that there are two identical servers sitting there
  2. Change the domain name to point to a different IP address
  3. Wait for the magic to happen — once the new domain changes take effect, the old server is removed

Re: So, layman's terms with how this transfer works?
suef, 29 Oct 2009, 01:33 UTC

I think you've probably been misled slightly by the term server. I'm sure Pieter will correct me if I've got this a bit wrong, but I don't think the processors were upgraded with this exercise, only the storage devices. In which case, the internet presence of wikidot.com would not have been affected, so no IP address issues resulting here.

As I recall, RAID disk arrays have built-in data replication so that any disk can be hot-swapped in the event of a failure without losing access to the data that was on the disk in question. Also, the DBMS is most likely mirroring the whole database continually to a separate array anyway. So if one mirror is unavailable for some reason then the DBMS can still operate on the other one. Having features such as these built into the design of your architecture then enables you to make such hardware changes seamlessly, on the fly. It's pretty clever stuff and it all happens, like you say, as if by magic.


Sue

Re: So, layman's terms with how this transfer works?
pieterh, 29 Oct 2009, 08:35 UTC

So as Michal explained: this is a back-end issue, it does not affect the front-end servers (which serve the Wikidot.com domain), and although there are IP address changes, they are internal to the cluster.

The real magic is to create multiple parallel databases and switch in real time from old to new, without losing replication, or current connections. Michal says it was "simple" but that is deceptive. This is the kind of operation that takes real skill and experience. Then, of course, it's simple.

One more reason why we're all happy that Wikidot.com exists: it takes over all the stress of managing our data. For me as a user, when I see this kind of smooth upgrade, I'm more confident than ever that everything else (such as replication) is also done properly and that in the event of a disaster (meteor hitting the data center, or whatever), I won't lose my sites.

michal frackowiak, 29 Oct 2009, 06:18 UTC

In fact we moved the whole database (i.e. database servers) to new boxes, and for the transition to be successful we had to replicate the databases live, and once they were synced, we simply pointed the application to the new servers. The front-end (PHP application) servers remained as they were.

The thing is, when you deal with tons of gigs of data, and the service (Wikidot.com) needs to be available without any outages, simple copying of data is not the way. In fact it would take us down for about 5 hours. Live syncing and instantaneous switchover removed this problem.
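The sync-then-switch pattern can be sketched in a few lines of Python. This is a toy model under loud assumptions, not the procedure actually used here: `Database`, `Replica`, and `App` are hypothetical in-memory stand-ins, and replication is modelled as replaying an ordered write log. It shows why the site stays up: the old database keeps accepting writes while the new one catches up, and the switch only changes which database the application points at.

```python
class Database:
    """Toy in-memory database that records every write in an ordered log."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.log = []  # ordered (key, value) writes, used for replication

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))


class Replica(Database):
    """New database that replays the old one's log until it has caught up."""

    def __init__(self, name, source):
        super().__init__(name)
        self.source = source
        self.applied = 0  # how many source log entries have been replayed

    def lag(self):
        return len(self.source.log) - self.applied

    def catch_up(self, batch=2):
        """Replay up to `batch` pending writes from the source."""
        pending = self.source.log[self.applied:self.applied + batch]
        for key, value in pending:
            self.write(key, value)
        self.applied += len(pending)


class App:
    """Stand-in for the front-end application, which only holds a handle."""

    def __init__(self, db):
        self.db = db

    def save(self, key, value):
        self.db.write(key, value)

    def switch_to(self, replica):
        """Repoint the application, but only once the replica is fully synced."""
        if replica.lag() != 0:
            raise RuntimeError("replica is still behind; keep syncing")
        self.db = replica


old = Database("old")
app = App(old)
for i in range(5):
    app.save(f"page-{i}", "v1")

new = Replica("new", old)
app.save("page-0", "v2")   # the site keeps taking writes during the sync
while new.lag() > 0:
    new.catch_up()

app.switch_to(new)         # only a handle changes, so this is near-instant
app.save("page-9", "v1")   # lands on the new database only
```

A real switchover also has to stop a write from slipping in between the final lag check and the repointing, for example by briefly holding writes. The toy model sidesteps that because it is single-threaded.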

We will now dispose of the old database servers - they were good a year ago, but not sufficient any more as Wikidot grows, and new (cool) hardware is available at good prices.


Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me

Brunhilda, 30 Oct 2009, 11:09 UTC

Maybe this info should be inserted in a Wikidot article at Wikipedia, with this thread as reference… I would do that, but guys, I don't understand a single word Pieter wrote here, except that Wikidot is now faster, so maybe it would be better that someone else writes it….


The trouble with the world is that the stupid are cocksure and the intelligent are full of doubt. Bertrand Russell

last edited on 30 Oct 2009, 11:09 UTC by Brunhilda

page_revision: 2, last_edited: 28 Oct 2009, 11:02 UTC
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License