01 May, 2007

Stress Testing

An analogy for a "stress" test that should NOT be run.

Suppose your current system is a jar containing 1 million beans. The mouth of the jar is small. Typically, each user expects to fetch 100 beans, and he does so by putting his hand in and picking up one bean at a time. Thus, each user has to put his hand into the jar 100 times. Because the mouth of the jar is small, only 2 or 3 users can put their hands in simultaneously. Other users have to wait. The first user may be unable to put his hand into the jar 100 times in rapid succession because, on occasion, someone else's hand is in the way. Therefore, some portion of his operation's time is spent "waiting".

Now, suppose the new system is still a jar containing 1 million beans. However, the mouth of the jar is now much bigger. This means that even if each user has to put his hand in 100 times, more hands can go in simultaneously. Fewer users are waiting to put their hands into the jar. The probability that the first user has to wait because the mouth of the jar doesn't have enough free space is lower. He still takes the same amount of time to put in his hand and remove a bean 100 times, BUT he spends *less* time waiting for the mouth of the jar to be free. However, if your "stress test" is such that the user expects to fetch 1,000 beans [instead of 100 beans], he now has to put his hand into the jar 1,000 times! That user will say "the system is slower", even though the new jar has the bigger mouth.


The beans in the jar are the data.
The mouth of the jar is the system capacity [throughput: the number of I/O calls that can be handled, given the storage bandwidth and response time, and the number of CPU calls that can be handled, given the number and speed of the CPUs].
The hand is the CPU or I/O call that the user makes.
The "single bean" is there because each call fetches a finite amount. For example, an Indexed Read reads 1 block at a time [if it has to read 1,000 data blocks, it has to make 1,000 {actually 2,000+} I/O calls].
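
To put rough numbers on the analogy, here is a minimal Python sketch. It is a toy model, not a measurement of any real system: the function name `simulate`, the "tick" as a unit of time, and the round-robin turn-taking are all assumptions made for illustration. In each tick, at most `slots` hands fit into the mouth of the jar, and each hand takes out one bean.

```python
from collections import deque


def simulate(users, beans_per_user, slots):
    """Toy model of the jar (hypothetical helper, for illustration only).

    users          -- number of users fetching beans at the same time
    beans_per_user -- how many beans (calls) each user needs, at least 1
    slots          -- how many hands fit into the mouth of the jar at once

    Returns a dict mapping each user to the tick at which he got his last bean.
    """
    # Users queue up in order; each entry is (user id, beans still needed).
    waiting = deque((user, beans_per_user) for user in range(users))
    finished_at = {}
    tick = 0
    while waiting:
        tick += 1
        # Only `slots` hands fit in the jar during one tick.
        in_jar = [waiting.popleft() for _ in range(min(slots, len(waiting)))]
        for user, needed in in_jar:
            if needed <= 1:
                finished_at[user] = tick  # this was his last bean
            else:
                # He got one bean and goes to the back of the queue.
                waiting.append((user, needed - 1))
    return finished_at


old_jar = max(simulate(users=6, beans_per_user=100, slots=2).values())
new_jar = max(simulate(users=6, beans_per_user=100, slots=6).values())
bad_test = max(simulate(users=6, beans_per_user=1000, slots=6).values())
print(old_jar, new_jar, bad_test)
```

In this model, 6 users fetching 100 beans each need 300 ticks with the small mouth and 100 ticks with the big mouth. Ask each user for 1,000 beans instead and the big-mouthed jar needs 1,000 ticks, which is why that user concludes the new system is "slower".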

The "stress test" should be:
a) If normally 5 users run reports, then have 10 users run reports.
b) If the user normally fetches 100 rows, then he must still fetch 100 rows. If he fetches more rows than he normally does, then whatever the system, he still has to make that many more CPU or I/O calls.
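
The two rules above can be written down as a simple check. This is only a sketch: `Workload` and `is_fair_stress_test` are hypothetical names, and "more users, same rows per user" is the only criterion it encodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A test scenario (hypothetical structure, for illustration only)."""
    users: int          # how many users run reports concurrently
    rows_per_user: int  # how many rows each user fetches


def is_fair_stress_test(normal, stress):
    """True if the stress workload adds users but keeps each user's work the same."""
    more_users = stress.users > normal.users
    same_work_per_user = stress.rows_per_user == normal.rows_per_user
    return more_users and same_work_per_user


normal = Workload(users=5, rows_per_user=100)

# Rule a): double the users, leave the rows per user alone.
print(is_fair_stress_test(normal, Workload(users=10, rows_per_user=100)))

# The test that should NOT be run: same users, ten times the rows each.
print(is_fair_stress_test(normal, Workload(users=5, rows_per_user=1000)))
```

The first scenario passes the check and the second does not, because it changes how much work each user does rather than how many users compete for the system.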
