
Chapter 6: The Normal Distribution and The Central Limit Theorem

Introduction to Chapter 6: The Normal Distribution and The Central Limit Theorem

This photo shows many different pairs of shoes in various colors. The shoes appear to be hanging from a wall by cords.
Figure 1a. If you ask enough people about their shoe size, you will find that your graphed data is shaped like a bell curve and can be described as normally distributed. (credit: Ömer Ünlü)

The normal distribution, a continuous distribution, is the most important of all the distributions. It is widely used and even more widely abused. Its graph is bell-shaped. You see the bell curve in almost all disciplines. Some of these include psychology, business, economics, the sciences, nursing, and, of course, mathematics. Some of your instructors may use the normal distribution to help determine your grade. Most IQ scores are normally distributed. Often real-estate prices fit a normal distribution. The normal distribution is extremely important, but it cannot be applied to everything in the real world.

In this chapter, you will first study the normal distribution, the standard normal distribution, and applications associated with them.

After learning about the normal distribution, we will study the mean (the average) in more depth. Why are we so concerned with the mean in statistics? Two reasons: (1) it gives us a middle ground for comparison, and (2) it is easy to calculate. In this chapter, you will study the mean and the central limit theorem.

Coins and keys in a pile. There appear to be 5 pennies, 3 quarters, 4 dimes, and 2 nickels. The key ring has a bronze whale on it and holds 11 keys.
Figure 1b. If you want to figure out the distribution of the change people carry in their pockets, the central limit theorem tells you that, assuming your samples are large enough, the distribution of the sample means is approximately normal and bell-shaped. (credit: John Lodder)

The central limit theorem (CLT for short) is one of the most powerful and useful ideas in all of statistics. There are two alternative forms of the theorem, and both are concerned with drawing finite samples of size n from a population with a known mean, μ, and a known standard deviation, σ. The first alternative says that if we collect samples of size n with a "large enough n," calculate each sample's mean, and create a histogram of those means, then the resulting histogram will tend to have an approximately normal bell shape. The second alternative says that if we again collect samples of size n that are "large enough," calculate the sum of each sample, and create a histogram of those sums, then the resulting histogram will again tend to have an approximately normal bell shape.
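In symbols, the two alternatives are often written as sketched below. The notation is an assumption introduced here, not something defined above: [latex]\bar{X}[/latex] stands for the mean of one sample, [latex]\sum X[/latex] for the sum of one sample, and N(center, spread) names a normal distribution by its mean and its standard deviation. For a large enough n, approximately:

[latex]\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)[/latex]

[latex]\sum X \sim N\left(n\mu, \sqrt{n}\,\sigma\right)[/latex]

Read this way, the sample means center on the population mean μ with a spread that shrinks as n grows, while the sample sums center on nμ with a spread that grows as n grows.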

In either case, it does not matter what the distribution of the original population is, and you do not even need to know it. The important fact is that the distributions of the sample means and of the sample sums tend to follow the normal distribution.
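One way to see both alternatives at work is a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed, clearly non-normal choice), and the sample size of 40, the 5,000 repetitions, and the function name `simulate_means_and_sums` are all made up for this example.

```python
import math
import random
import statistics

def simulate_means_and_sums(n, num_samples, seed=0):
    """Draw num_samples samples of size n and return (means, sums).

    Assumption for illustration: the population is exponential with
    mean 1 and standard deviation 1, which is far from bell-shaped.
    """
    rng = random.Random(seed)
    means, sums = [], []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(sample)
        sums.append(total)
        means.append(total / n)
    return means, sums

n = 40
means, sums = simulate_means_and_sums(n=n, num_samples=5000)

# Center and spread of the simulated sample means and sample sums.
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
mean_of_sums = statistics.fmean(sums)
sd_of_sums = statistics.stdev(sums)

# Share of sample means within one standard deviation of their center.
# For a normal distribution this share is about 0.68.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in means
) / len(means)

print(f"means: center {mean_of_means:.3f}, spread {sd_of_means:.3f} "
      f"(population sd / sqrt(n) = {1 / math.sqrt(n):.3f})")
print(f"sums:  center {mean_of_sums:.2f}, spread {sd_of_sums:.2f} "
      f"(sqrt(n) * population sd = {math.sqrt(n):.2f})")
print(f"share of means within one sd of the center: {within_one_sd:.3f}")
```

Plotting `means` or `sums` as a histogram should give a roughly bell-shaped picture even though the assumed population is heavily skewed.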

The sample size, n, that is required in order to be "large enough" depends on the original population from which the samples are drawn (as a rule of thumb, the sample size should be at least 30, or the data should come from a normal distribution). If the original population is far from normal, then more observations are needed for the sample means or sums to be approximately normal. Sampling is done with replacement. Given simple random samples of size n from a given population, with a characteristic such as the mean, proportion, or standard deviation measured for each sample, the probability distribution of all the measured values is called a sampling distribution.
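To get a feel for what "large enough" means, the sketch below measures how lopsided the histogram of sample means is for a few sample sizes. It is a rough illustration, not a formal test: the population is again assumed to be exponential with mean 1, the sample sizes 2, 10, and 50 are arbitrary choices, and `skewness` and `skewness_of_sample_means` are helper names invented for this example. Skewness here is the average cubed z-score, which is near 0 for a symmetric, bell-shaped histogram.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric histogram."""
    center = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean([((v - center) / spread) ** 3 for v in values])

def skewness_of_sample_means(n, num_samples=5000, seed=1):
    """Skewness of the means of num_samples samples of size n.

    Assumption for illustration: exponential population with mean 1.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return skewness(means)

# Larger samples should give sample means that are less skewed.
skew_by_n = {n: skewness_of_sample_means(n) for n in (2, 10, 50)}
for n, skew in skew_by_n.items():
    print(f"n = {n:>2}: skewness of the sample means = {skew:.2f}")
```

With this skewed population, the skewness of the sample means should shrink toward 0 as n grows. A population that starts out closer to normal would need fewer observations to reach the same point.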

It would be difficult to overstate the importance of the central limit theorem in statistical theory. Knowing that data, even if its distribution is not normal, behaves in a predictable way is a powerful tool.

 

"Traditional Statistics" versus Technology Note

The traditional method of calculating probabilities for the normal distribution and the central limit theorem using by-hand methods with formulas and statistical tables is covered in the main sections of the text.

Additional sections titled Using Technology at the end of the chapter give instructions for calculating normal distribution and central limit theorem probabilities with the TI-83+ and TI-84 calculators and other technology tools.

 

Collaborative Exercises

Sections 6.1 & 6.2

Collaborative Exercise

Your instructor will record the heights of both men and women in your class, separately. Draw histograms of your data. Then draw a smooth curve through each histogram. Is each curve somewhat bell-shaped? Do you think the curves would look bell-shaped if you had recorded 200 data values for men and 200 for women? Calculate the mean for each data set. Write the means on the x-axis of the appropriate graph below the peak. Shade the approximate area that represents the probability that one randomly chosen male is taller than 72 inches. Shade the approximate area that represents the probability that one randomly chosen female is shorter than 60 inches. If the total area under each curve is one, does either probability appear to be more than 0.5?
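Once you have a mean and a standard deviation for each data set, you can compare your shaded areas with the areas under a normal curve. The sketch below uses Python's standard `statistics.NormalDist` class. The means and standard deviations in it are hypothetical placeholders, not real class data, so replace them with the values from your own class.

```python
from statistics import NormalDist

# Hypothetical summaries in inches (assumptions for illustration only).
male_heights = NormalDist(mu=69.0, sigma=3.0)
female_heights = NormalDist(mu=64.0, sigma=2.5)

# Area to the right of 72 under the male curve.
p_male_taller_than_72 = 1 - male_heights.cdf(72)

# Area to the left of 60 under the female curve.
p_female_shorter_than_60 = female_heights.cdf(60)

print(f"P(male taller than 72 in.)    = {p_male_taller_than_72:.4f}")
print(f"P(female shorter than 60 in.) = {p_female_shorter_than_60:.4f}")
```

Under these assumed values, both shaded regions lie entirely on one side of the mean, so each area is less than 0.5. A tail area under a symmetric curve can exceed 0.5 only if the shaded region extends past the mean.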

 

Section 6.3

Collaborative Exercise

Suppose eight of you roll one fair die ten times, seven of you roll two fair dice ten times, nine of you roll five fair dice ten times, and 11 of you roll ten fair dice ten times.

Each time a person rolls more than one die, he or she calculates the sample mean of the faces showing. For example, one person might roll five fair dice and get 2, 2, 3, 4, 6 on one roll.

The mean is [latex]\frac{2 + 2 + 3 + 4 + 6}{5} = 3.4[/latex]. The 3.4 is one mean when five fair dice are rolled. This same person would roll the five dice nine more times and calculate nine more means for a total of ten means.

Your instructor will pass out the dice to several people. Roll your dice ten times. For each roll, record the faces, and find the mean. Round to the nearest 0.5.

Your instructor (and possibly you) will produce one graph (it might be a histogram) for one die, one graph for two dice, one graph for five dice, and one graph for ten dice. Since the "mean" when you roll one die is just the face on the die, what distribution do these means appear to be representing?

Draw the graph for the means using two dice. Do the sample means show any kind of pattern?

Draw the graph for the means using five dice. Do you see any pattern emerging?

Finally, draw the graph for the means using ten dice. Do you see any pattern to the graph? What can you conclude as you increase the number of dice?

As the number of dice rolled increases from one to two to five to ten, the following is happening:

  1. The mean of the sample means remains approximately the same.
  2. The spread of the sample means (the standard deviation of the sample means) gets smaller.
  3. The graph appears steeper and thinner.

You have just demonstrated the central limit theorem (CLT).

The central limit theorem tells you that as you increase the number of dice, the sample means tend toward a normal distribution (the sampling distribution).
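If dice are not at hand, the exercise can be approximated with a short simulation. The sketch below is an illustration under assumptions: the function name `simulate_dice_means` is made up, and each setup is repeated 10,000 times rather than the ten rolls per person used in class, so that the pattern is easier to see.

```python
import random
import statistics

def simulate_dice_means(num_dice, num_rolls, seed=0):
    """Roll num_dice fair dice num_rolls times; return the mean of each roll."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.randint(1, 6) for _ in range(num_dice)])
        for _ in range(num_rolls)
    ]

# Center and spread of the sample means for 1, 2, 5, and 10 dice.
results = {}
for num_dice in (1, 2, 5, 10):
    means = simulate_dice_means(num_dice, num_rolls=10000)
    results[num_dice] = (statistics.fmean(means), statistics.stdev(means))

for num_dice, (center, spread) in results.items():
    print(f"{num_dice:>2} dice: mean of sample means = {center:.2f}, "
          f"standard deviation of sample means = {spread:.2f}")
```

The printed centers should all sit near 3.5, the mean face of a single fair die, while the standard deviations should shrink as the number of dice increases. This matches the first two observations in the list above.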

 


License


Introductory Statistics Copyright © 2024 by LOUIS: The Louisiana Library Network is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.