Geeks With Blogs

My wife this weekend noted as she was reading CNET that WordPress had announced a vulnerability to a worm.

Well, actually, they announced that if you had skipped the last two releases and failed to do your updates, you would be vulnerable to the worm.

A question often comes up, not so much in actual software development shops, but rather from the common user:

How does this happen? They should find all the bugs before they ship!

Simply put, test the product before it ships, and don’t allow it to have bugs!

hmmm… let me think about that for a second.

A novel idea, but how do we actually do this? I’ve done professional software development for almost 20 years, and before that, I was a self-taught developer for 10 years. I should be the perfect candidate for building bug-free software: I can test it myself, and I understand all the tricks.

I can’t do it, and it simply comes down to numbers.

Let me tell you a story.

About 12 years ago or so, as a fairly senior software tester, I was given the task of verifying that a bug had been fixed in the software.

The test seemed pretty simple.

I had a field on a form, and the field was 16 characters long.

The bug was that if the first character in the field was not one of your US-based letters, but rather something from a non-English language like French, the letter would be ignored.

So, if the user typed in:

[ÅÅÅÅuter hansten]

The actual value would be:

[uter hansten    ]

This was pretty easy to test. I fired up my trusty GUI test engine and wrote a quick little program that looped through all the valid characters and checked to see if they worked.

Being Friday afternoon, I thought giving the program the entire weekend to run would be ample time. As I turned away from my test PC and back to my real machine, I suddenly paused.

I wondered, how many tests would this little program actually do?

Let’s see… this math looks pretty simple.

Q: How many letters are in the alphabet that need to be tested?

A: I can safely say we need all the special letters, since that was part of the problem, so 256 – 32 = 224 (this is a–z, A–Z, numbers, special characters, and the extended characters)

Q: How many characters in the field?

A: 16

Q: So, how many loops will I have to do?

A: 16^224 = 5.2 × 10^269

SCREECHING HALT… WHAT? WHAT WAS THAT NUMBER?

I happened to recall that a GOOGOL = 10^100 was a number larger than the number of subatomic particles in the visible universe.

With this number of loops, the test would NEVER complete by the end of the weekend. As a matter of fact, it would be quicker to count all those subatomic particles in the visible universe than to run this test.

I turned back to my test PC and canceled the test. (Oh, and for you code heads, I simply inspected the software and was able to reduce my test count to TWO test cases, rather than 5.2 × 10^269.)
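For the code heads, here is a small Python sketch to check the arithmetic in this story. One thing it shows: the figure quoted above matches 16 raised to the 224th power, whereas counting every possible 16-character string over 224 symbols gives 224 raised to the 16th power, which is about 4.0 × 10^37. That is a much smaller number, but the conclusion does not change. The rate of one million tests per second is purely an assumption for illustration, and far faster than any GUI test engine I have used.

```python
# Count the inputs an exhaustive test of the 16-character field would need.
ALPHABET_SIZE = 256 - 32   # 224 characters, as worked out in the post
FIELD_LENGTH = 16          # characters in the field

# The figure quoted in the post corresponds to 16 ** 224 (about 5.3e269).
quoted = 16 ** 224

# Every possible string of FIELD_LENGTH characters: 224 ** 16 (about 4.0e37).
all_strings = ALPHABET_SIZE ** FIELD_LENGTH

# Assumed test rate, not from the post: one million tests per second.
TESTS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Whole years needed to run every string at the assumed rate.
years_needed = all_strings // (TESTS_PER_SECOND * SECONDS_PER_YEAR)

print(f"quoted figure (16^224): {quoted:.1e}")
print(f"all strings   (224^16): {all_strings:.1e}")
print(f"years at 1M tests/sec : {years_needed:.1e}")
```

Either way, the run takes more than 10^24 years, so a weekend is not going to cut it.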

First off, I want the casual user to understand some very important things.

No software can be bug free. Just look at the current software you are using to view this blog. It has many places that you can type in more than 16 characters. Every one of those places needs to be tested, but there is only so much time before the end of the universe as we know it (which is somewhere in the neighborhood of 37 years).

Secondly, even testing the software and fixing every bug does not actually improve the quality of the software. Most bug fixing actually adds more bugs than it fixes.

Wait, what do you mean, when you fix a bug you have one less bug!

Yes, that bug may be fixed (and this is a maybe, since some attempts to fix a bug do not actually fix anything), but the new code written to fix the bug may itself have bugs.

I call this the defect life-cycle (it is very similar to the neutron life cycle in a nuclear reactor).

Each cycle of the software starts with a fixed number of bugs.

As the software build for the next release approaches:

Current bugs exist in the software, let us call this value Bc (Bugs current)

Attempts are made to fix bugs that are known (the Bug List) and some of those are actually removed, let us call this value Pr (planned and removed)

Review of the software during the development cycle discovers unknown bugs, and these are attempted to be fixed, let us call this value Rr (reviewed and removed)

New features are added to the software, and these new features have bugs in them, let us call this value Na (New and added)

Attempts to fix bugs that are known cause additional bugs in the software, let us call this value Pa (planned and added)

Review of the software discovers bugs, and the attempts to fix those bugs cause new bugs, let us call this value Ra (reviewed and added)

The bugs that remain in the product after a build are the result of this formula, and let us call this value Ba (Bugs after cycle)

So, for each internal build, the number of bugs remaining is:

Ba = Bc – Pr – Rr + Na + Pa + Ra

So, if the sum of the bugs added is less than the sum of the bugs removed, then the quality of the software has improved.

Unfortunately this condition is actually rare! Most cycles of the build actually increase the number of bugs, because the constant pressure of adding new features to the product, and attempts by the junior developers to fix bugs introduced by senior developers (who can’t be bothered to fix that old code), often overwhelm any bugs that are actually removed.

Please note, nowhere in this do the actual bugs found have a positive impact on the quality of the software, rather you could strongly argue that by finding bugs the team is practically guaranteed to be adding bugs and making the software worse.
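To make the defect life-cycle concrete, here is a minimal Python sketch of the formula Ba = Bc – Pr – Rr + Na + Pa + Ra. The class name, function name, and all the sample numbers are made up for illustration; they are not measurements from any real project.

```python
from dataclasses import dataclass


@dataclass
class BuildCycle:
    """Bug counts for one build cycle (names are illustrative)."""
    planned_removed: int    # Pr: known bugs that were actually fixed
    reviewed_removed: int   # Rr: bugs found in review and fixed
    new_added: int          # Na: bugs that arrived with new features
    planned_added: int      # Pa: bugs introduced while fixing known bugs
    reviewed_added: int     # Ra: bugs introduced while fixing reviewed bugs


def bugs_after_cycle(bugs_current: int, cycle: BuildCycle) -> int:
    """Return Ba = Bc - Pr - Rr + Na + Pa + Ra."""
    removed = cycle.planned_removed + cycle.reviewed_removed
    added = cycle.new_added + cycle.planned_added + cycle.reviewed_added
    return bugs_current - removed + added


# Hypothetical cycle: 25 bugs removed, 27 bugs added.
cycle = BuildCycle(planned_removed=20, reviewed_removed=5,
                   new_added=15, planned_added=8, reviewed_added=4)
ba = bugs_after_cycle(100, cycle)
print(ba)  # 102: the build ends with more bugs than it started with
```

In this made-up cycle the team fixed 25 bugs and still finished two bugs worse off, which is exactly the situation described above.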

How do we get out of this?

First off, make sure that you understand that software testing will never improve the quality of the software.

Actually, it is a measuring tool, a sounding rod.

If you are testing 10% of the product and you find 100 critical bugs, then it seems that your total critical bug count would be in the neighborhood of 1,000 bugs.
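The sounding-rod arithmetic is simple enough to sketch in Python. The function name is hypothetical, and the estimate rests on a big assumption: that bugs are spread evenly across the product, so the untested 90% is about as buggy as the tested 10%.

```python
def estimate_total_bugs(bugs_found: int, percent_tested: int) -> float:
    """Extrapolate a total bug count from the tested portion of a product.

    Assumes bugs are evenly distributed, which real products rarely honor,
    so treat the result as a rough sounding, not a measurement.
    """
    if not 0 < percent_tested <= 100:
        raise ValueError("percent_tested must be in the range 1..100")
    return bugs_found * 100 / percent_tested


# The example from the post: 100 critical bugs found in 10% of the product.
estimate = estimate_total_bugs(100, 10)
print(estimate)  # 1000.0
```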

Make a judgement call at this point: can you, as the software company, handle that support load?

Are you putting people’s lives at risk?

Many other relevant questions…

BUT at this point do not turn to your software test team and ask them… the sounding rod measures, it does not judge.
