Why is 30 the “Magic Number” for Sample Size?
It seems like whenever people learn about statistical problem solving, the sample size question comes up. Invariably, the number 30 is bandied about as a sweet spot that should get the job done. Astute learners generally want to understand why 30 seems to work. Read on to find out why.
The answer really hinges on an understanding of how confidence intervals for the standard deviation are created, and how their width depends on the sample size: the larger the sample size, the tighter the interval around the standard deviation estimate. Here’s the formula for the upper and lower confidence limits on standard deviation:
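For reference, this is the standard chi-squared interval for the standard deviation of a normally distributed population, assumed here to be the formula meant. The symbols are my own labels: n is the sample size, s is the sample standard deviation, α is 1 minus the confidence level (0.05 for 95%), and χ²(p, n−1) is the value below which a chi-squared distribution with n−1 degrees of freedom puts probability p.

```latex
\sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{1-\alpha/2,\;n-1}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{\alpha/2,\;n-1}}}
```

The left side is the lower limit and the right side is the upper limit. Both are just s multiplied by a factor that depends only on n and the confidence level, and both factors move toward 1 as n grows.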
Rather than go into a lengthy explanation of chi-squared distributions and how the formula is derived, it’s easier to visualize what’s going on. Imagine that we’re taking samples of the melting point of blue candles, and after each sample, we calculate the mean, the standard deviation, and the 95% confidence limits on the standard deviation. For the sake of argument, let’s assume that we know from previous experience that the mean melting point is 100°F, with a standard deviation of 3°F. If we start taking samples and calculating as we go, we get something like this:
With only one sample, we can’t calculate much in terms of standard deviation, but look at what happens to our best guess of the standard deviation (s) as we take each sample. It starts at 1.7, moves down to 1.3, and then jumps up to 1.9. Furthermore, look at the limits. While the lower limit isn’t changing much, the upper limit is certainly bouncing around. How long do we have to continue taking samples until the standard deviation and limits stop bouncing around?
The best way to see is to create a graph of the standard deviation and limits, calculated at each sample. Here’s the graph:
Notice how the confidence limits bounce around a lot at the beginning, then calm down after a while? This is why 30 samples is usually deemed sufficient. If we recreate our chart with some new measurements, here’s what we get:
In this case, we didn’t get as much bouncing as the first time, so we can be confident earlier on. However, it’s very hard to know beforehand how much bouncing around you’ll get, so most people stick with 30 samples, just to be sure.
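If you’d rather see the experiment in code, here is a minimal Python sketch of it, not the Excel Demo’s actual calculations. It assumes normally distributed melting points with the mean of 100 and standard deviation of 3 used above, and it uses the standard chi-squared interval for the limits. The function names (chi2_cdf, chi2_quantile, sd_limits, running_limits) and the random seed are my own choices for illustration. It uses only the standard library, so the chi-squared quantile is found numerically.

```python
import math
import random
import statistics


def chi2_cdf(x, dof):
    """Chi-squared CDF, via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half_x = dof / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))


def chi2_quantile(p, dof):
    """Value x with chi2_cdf(x, dof) == p, found by bisection."""
    lo, hi = 0.0, 4.0 * dof + 40.0   # generous upper bracket for the dof used here
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def sd_limits(s, n, confidence=0.95):
    """Lower and upper confidence limits on sigma, given sample sd s from n values."""
    alpha = 1.0 - confidence
    dof = n - 1
    lower = s * math.sqrt(dof / chi2_quantile(1.0 - alpha / 2.0, dof))
    upper = s * math.sqrt(dof / chi2_quantile(alpha / 2.0, dof))
    return lower, upper


def running_limits(samples, confidence=0.95):
    """Recalculate s and its limits after each new sample (starting at n = 2)."""
    rows = []
    for n in range(2, len(samples) + 1):
        s = statistics.stdev(samples[:n])
        rows.append((n, s, *sd_limits(s, n, confidence)))
    return rows


# Assumed scenario from the article: mean 100, standard deviation 3.
rng = random.Random(1)   # arbitrary seed; change it to draw new samples
samples = [rng.gauss(100, 3) for _ in range(50)]

for n, s, lower, upper in running_limits(samples):
    if n in (2, 3, 5, 10, 20, 30, 50):
        print(f"n={n:2d}  s={s:5.2f}  lower={lower:5.2f}  upper={upper:6.2f}")
```

Changing the seed plays the same role as pressing F9 in a spreadsheet: the early rows change a lot from run to run, the later ones much less. As a fixed point of comparison, sd_limits(3.0, 30) gives limits of roughly 2.39 and 4.03, so even at 30 samples the interval is far from tight; it has simply stopped moving much.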
But don’t just take my word for it. I’ve made an Excel Demo that you can play with. Just input your parameters, and it will calculate a sampling scenario. Pressing ‘F9’ will force Excel to choose new samples and recalculate the graph. If you’re curious about the math and the Excel functions, just unlock the worksheet and have a look (there’s no password required; I locked the worksheet only to keep things simple).
Looking for a way to calculate how many samples you need? Take a look at the software page, and see if Stats Helper is right for you.