Monday, April 11, 2005

Dubious Testing

I was reading an article: Study Finds Windows More Reliable than Linux.

One thing that caught my eye was the testing methodology:

During the test, VeriTest also initiated a series of events that broke or disabled various system services in the administrators' test environments, which remained down until they were fixed by the administrators.

and then the conclusion is that it took longer to fix the Linux system than the Windows one.

The staggering thing - to me - is the idea that systems breaking down is normal. I'm sure we must have service failures, but they're incredibly rare on my Solaris machines. In fact, so rare that I'm really having trouble trying to think of one that wasn't a direct and obvious result of hardware failure. The Linux machines seem to need a kick once in a while, but the Windows machines generate a constant stream of calls along the lines of "help - my machine's stopped working! again!".

It's not just how quickly problems can be fixed, it's how often they crop up. (Both MTBF and MTTR enter into the equation here.)
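To put rough numbers on that, here's a minimal sketch using the standard steady-state availability formula, MTBF / (MTBF + MTTR). The figures are made up purely for illustration and don't come from the study, and the function and variable names are my own.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical numbers, for illustration only (not taken from the study).
# A system that breaks often but is fixed quickly...
quick_fix_often_broken = availability(mtbf_hours=100, mttr_hours=1)
# ...versus one that takes four times as long to fix but rarely breaks.
slow_fix_rarely_broken = availability(mtbf_hours=5000, mttr_hours=4)

print(f"{quick_fix_often_broken:.4f}")  # 0.9901
print(f"{slow_fix_rarely_broken:.4f}")  # 0.9992
```

With these invented figures, the system that takes longer to repair still comes out well ahead, simply because it fails so much less often.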

That Windows gets repaired quicker may simply reflect the fact that Windows admins have more practice fixing problems...

One example from personal experience. We used to have a couple of RS/6000 machines running AIX. These were astonishingly reliable. (They were a pain in the neck because, while they were really fast, most applications we ran on them had to be ported, so we needed a dedicated person not only to do user support for those applications but also to port and test them. So when he left we had to move the applications onto Suns, where they compiled and ran without any effort. But I digress.) So reliable, in fact, that I had to log in to them maybe once a year to do some minor housekeeping. I got so little practice looking after them that I was starting afresh from my manual and course notes every time, and there was a slight delay before I found the right place in SMIT (and it's not as if the AIX commands are identical to the Solaris ones).

And, of course, if the systems in the test were running Solaris 10, the chances are that SMF would have silently fixed the problems in no time at all.
