13 disasters for production website and their solutions

When we first went live with Pageflakes in 2005, most of us had no
experience running a high-volume, mass-consumer web application on
the Internet. In our first year of operation, we went through every
kind of difficulty a web application can face as it grows. Frequent
problems with software, hardware, and the network were part of our
daily life. We have overcome a lot of obstacles and established
ourselves as one of the top Web 2.0 applications in the world. From
a thousand-user website, we have grown to a million-user website
over the years. We have learnt how to architect a product that can
withstand more than 2 million hits per day and sudden spikes of 7
million hits in a day. We have discovered under-the-hood secrets of
ASP.NET 2.0 that solve many scalability and maintainability
problems. We have also gained experience in choosing the right
hardware and Internet infrastructure, which can make or break a
high-volume web application. In this article, you will learn about
13 disasters that can happen to any production website at any time.
These real-world stories will help you prepare so that you do not go
through the same problems we did. Being prepared for these disasters
up front will save you a lot of time and money and build credibility
with your users.

We have gone through many disasters over the years. Some of them
are:

  1. Hard drives crashed, burned out, or got corrupted several
    times
  2. A controller malfunctioned and corrupted all disks on the same
    controller
  3. RAID malfunctioned
  4. CPU overheated and burned out
  5. Firewall went down
  6. Remote Desktop stopped working after a patch installation
  7. Remote Desktop maximum connections were exceeded, so we could
    no longer log in to the servers
  8. Database got corrupted while we were moving the production
    database from one server to another over the network
  9. One developer accidentally deleted the production database
    while doing routine work
  10. Support crew at the hosting service formatted our running
    production server instead of the corrupted server that we had
    asked them to format
  11. Windows got corrupted and did not work until we reinstalled
    it
  12. DNS went down
  13. Internet backbone went down in different parts of the
    world

My article explains all of these disasters and gives you the
solutions:

http://www.codeproject.com/install/13disasters.asp

Please vote for the article if you like it.

10 thoughts on “13 disasters for production website and their solutions”

  1. If you are working on a large Web (2.0) application on which you expect very high traffic (2 million

  2. If you are working on a large Web (2.0) application on which you expect very high traffic (2 million

  3. Thanks for the 13 disaster issues! It could really help someone new to overcome or bypass them from the very beginning 🙂

  4. Hi Omar

    glad to see you posting on your blog again 🙂

    Interesting reading is that one, thanks!

    All the best!

  5. Too bad Pageflakes sucks now. It was great until they went and added all this extra fluff crap that makes it ugly and slow now.

    Way to take a great idea and ruin it.. Keep up the crappy work….

  6. Omar,

    You are just great, sharing such valuable information. I am a fan of your blog. One request I would like to make here is that you write something similar, but with ASP.NET / SQL Server 2005 in mind, for large scalable websites.

