LJ Maintenance - Friday, October 2, 2009 03:00-05:00 UTC/GMT

Sep 30, 2009 15:04

**EDIT**
@05:19 UTC
Maintenance work has been completed and looks good so far! Again, the site is going to be a bit slow for a little while, but after about an hour, LJ should be loading pages quickly for you (or at least as speedily as before).

If you're experiencing any problems, whether or not you think they're related to the work we've done tonight, please head over to our support board for assistance. I personally check the comments on these lj_maintenance posts, even if I can't reply to every one of you, but after a day or two we usually stop checking them. So PLEASE, if you're having problems a few days after we've posted an entry, go to our support board; it's a much better way to be heard and have your problem addressed.

---

It's that fun time that we like to call "maintenance"! It's going to be a 2-hour window, and there will be periods when we have to completely shut off ALL traffic to LJ. We're not expecting the site to be down for the whoooole 2 hours, though.

This maintenance window will be from October 2nd, 03:00 - 05:00 UTC. For the rest of us not in a zone that is even remotely UTC'ish, please check out this link and choose your city/timezone. Because of when we're doing our work, for some of us, especially in the USA, it will actually be the night of October 1st.
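If you'd rather work out the local time yourself, here's a rough sketch of the conversion. The two US offsets are assumptions for illustration (daylight saving time was still in effect in the USA in early October 2009, so Pacific is taken as UTC-7 and Eastern as UTC-4); swap in your own offset as needed.

```python
from datetime import datetime, timedelta, timezone

# Maintenance window as announced, in UTC.
WINDOW_START_UTC = datetime(2009, 10, 2, 3, 0, tzinfo=timezone.utc)
WINDOW_END_UTC = datetime(2009, 10, 2, 5, 0, tzinfo=timezone.utc)

# Assumed fixed offsets for illustration only (US daylight time, October 2009).
ZONES = {
    "UTC": timezone.utc,
    "US Eastern (EDT, UTC-4)": timezone(timedelta(hours=-4)),
    "US Pacific (PDT, UTC-7)": timezone(timedelta(hours=-7)),
}


def local_window(tz):
    """Return the (start, end) of the maintenance window in the given zone."""
    return WINDOW_START_UTC.astimezone(tz), WINDOW_END_UTC.astimezone(tz)


for name, tz in ZONES.items():
    start, end = local_window(tz)
    print(f"{name}: {start:%b %d %H:%M} to {end:%b %d %H:%M}")
```

With these offsets, the window starts on the evening of October 1st in both US zones, which is why the date differs from the UTC one.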

We will definitely be updating our status page before we start work and after everything has been put back in place, so keep an eye out on status.livejournal.org. We'll also be directing traffic to our status page during those times when we have to completely shut off traffic to LJ.


Here's what we'll be working on:

  1. Adding more memory to our "global master" database servers
  2. Upgrading the OS on our firewalls

For the database memory upgrade, we'll be putting in 3 times the RAM that we currently have, as well as doing a dry run of failing over to our backup global master. While I'd like to say that 3x the RAM will make it 3x as "fast" or "better", it won't. But it will for sure give us more breathing room and it *will* allow us to handle even more connections, which means that those timeouts we've experienced when the site is super-popular should not happen as often.

As for upgrading the OS on the firewalls, there were some "bugs" in the codebase we've been running and our vendor came out with a notice a few weeks ago that we should really really really upgrade. We haven't knowingly run into the problems that were mentioned, but we definitely need to try to prevent them from happening in the future. The upgrade on the firewalls should be super quick compared to the work on the databases, and will be done towards the end of our maintenance window.

After our work is done, the site will actually be SLOWER than before for about 1 hour. We just need a little bit of time before our Memcached servers, and the site, are back to normal.
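
For the curious, here's a toy sketch of why a cache needs time to "warm up". It assumes the slowdown comes from caches starting out empty after the work, so early requests fall through to the database. This is not our actual code or the Memcached API; `ReadThroughCache` and `slow_db_lookup` are made-up names for illustration.

```python
class ReadThroughCache:
    """Toy in-memory cache that falls back to a loader function on a miss."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: do the expensive lookup, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._store[key] = value
        return value


def slow_db_lookup(key):
    # Hypothetical stand-in for an expensive database query.
    return f"page for {key}"


cache = ReadThroughCache(slow_db_lookup)
keys = ["user1", "user2", "user3"]

# First pass: the cache is empty, so every request goes to the "database".
for k in keys:
    cache.get(k)
cold = (cache.hits, cache.misses)

# Second pass: everything is cached, so no database lookups are needed.
for k in keys:
    cache.get(k)
warm = (cache.hits, cache.misses)

print("after cold pass (hits, misses):", cold)
print("after warm pass (hits, misses):", warm)
```

In the first pass every lookup is a miss; by the second pass they are all hits, which is roughly what "back to normal" looks like.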