Chapter 6: The Normal Distribution and The Central Limit Theorem
6.3 The Central Limit Theorem for Sample Means (Averages)
Learning Objectives
By the end of this section, the student should be able to:
- Recognize characteristics of the Central Limit Theorem for Sample Means.
- Apply and interpret the Central Limit Theorem for Sample Means and use it to solve real-world applications.
Applying the law of large numbers here, we could say that if you take larger and larger samples from a population, then the mean [latex]\overline{x}[/latex] of the sample tends to get closer and closer to [latex]\mu[/latex]. From the central limit theorem, we know that as [latex]n[/latex] gets larger and larger, the distribution of the sample means approaches a normal distribution. The larger [latex]n[/latex] gets, the smaller the standard deviation of the sample means gets. (Remember that the standard deviation for [latex]\overline{x}[/latex] is [latex]\frac{\sigma }{\sqrt{n}}[/latex].) This means that for large [latex]n[/latex], the sample mean [latex]\overline{x}[/latex] is likely to be close to the population mean [latex]\mu[/latex]. We can say that [latex]\mu[/latex] is the value that the sample means approach as [latex]n[/latex] gets larger. In this way, the central limit theorem illustrates the law of large numbers.
The size of the sample, [latex]n[/latex], that is required in order to be "large enough" depends on the original population from which the samples are drawn. As a common rule of thumb, the sample size should be at least 30, or the data should come from a normal distribution. If the original population is far from normal, then more observations are needed for the sample means or sums to be approximately normal. Sampling is done with replacement.
The following images show sampling distributions of the sample mean, each built by taking 1000 samples of a given sample size from a normal population. What pattern do you notice?
Four histograms. Top left - Histogram of Population; Top right - Histogram of Sampling Distribution of Sample Means when n = 5; Bottom left - Histogram of Sampling Distribution of Sample Means when n = 15; Bottom right - Histogram of Sampling Distribution of Sample Means when n = 30. All histograms follow typical bell-curve shape and as n increases, the shape gets more narrow around the mean.

The following images show sampling distributions of the sample mean, each built by taking 1000 samples of a given sample size from a non-normal population (in this case, an exponential population). What pattern do you notice?
Four histograms. Top left - Histogram of exponential population; Top right - Histogram of Sampling Distribution of Sample Means when n = 5; Bottom left - Histogram of Sampling Distribution of Sample Means when n = 15; Bottom right - Histogram of Sampling Distribution of Sample Means when n = 30. All histograms are skewed right and as n increases, the plot gets more narrow around the mean.

What differences do you notice when sampling from a normal population vs. non-normal?
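One way to explore this question is to repeat the experiment behind the histograms in code. The following is a minimal sketch in Python using only the standard library. The population parameters (a normal population with mean 10 and standard deviation 2, an exponential population with mean 2), the seed, and the helper names `sample_means` and `skewness` are illustrative assumptions, not the settings used to produce the figures.

```python
import random
import statistics

def sample_means(draw, n, reps=1000):
    """Return `reps` sample means, each computed from `n` draws."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(reps)]

def skewness(values):
    """Sample skewness: close to 0 for a symmetric, bell-shaped histogram."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed populations, chosen only for illustration.
populations = {
    "normal": lambda: rng.gauss(10, 2),            # mean 10, sd 2
    "exponential": lambda: rng.expovariate(0.5),   # rate 0.5, so mean 2 and sd 2
}

results = {}
for name, draw in populations.items():
    for n in (5, 15, 30):
        means = sample_means(draw, n)
        results[(name, n)] = (
            statistics.fmean(means),   # center of the sample means
            statistics.stdev(means),   # spread of the sample means
            skewness(means),           # asymmetry of the sample means
        )
        center, spread, skew = results[(name, n)]
        print(f"{name:12s} n={n:2d} mean={center:.3f} sd={spread:.3f} skew={skew:.3f}")
```

In a run of this sketch, the spread should shrink as [latex]n[/latex] grows for both populations, while the skewness column shows how quickly each set of sample means becomes symmetric.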
Example
Suppose:
- eight students roll one fair die ten times
- seven roll two fair dice ten times
- nine roll five fair dice ten times
- 11 roll ten fair dice ten times.
Each time a person rolls more than one die, he or she calculates the sample mean of the faces showing. For example, one person might roll five fair dice and get 2, 2, 3, 4, 6 on one roll.
The mean is [latex]\frac{2 + 2 + 3 + 4 + 6}{5}=3.4[/latex]. The 3.4 is one mean when five fair dice are rolled. This same person would roll the five dice nine more times and calculate nine more means for a total of ten means.
As the number of dice rolled increases from one to two to five to ten, the following would happen:
- The mean of the sample means remains approximately the same.
- The spread of the sample means (the standard deviation of the sample means) gets smaller.
- The graph appears steeper and thinner.
We have just demonstrated the idea of the central limit theorem (CLT) for means: as you increase the sample size, the sampling distribution of the sample mean tends toward a normal distribution.
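The first two observations in the dice example can also be checked exactly rather than by rolling dice. The sketch below, written in Python with only the standard library, builds the exact distribution of the total of [latex]k[/latex] fair dice and then reports the mean and standard deviation of the average face. The function names `sum_distribution` and `mean_and_sd_of_average` are hypothetical helpers introduced for this illustration.

```python
import math
from fractions import Fraction

def sum_distribution(k):
    """Exact probability distribution of the total shown by k fair dice."""
    dist = {0: Fraction(1)}
    for _ in range(k):
        nxt = {}
        for total, p in dist.items():
            for face in range(1, 7):
                # Each face has probability 1/6, independent of earlier dice.
                nxt[total + face] = nxt.get(total + face, Fraction(0)) + p / 6
        dist = nxt
    return dist

def mean_and_sd_of_average(k):
    """Mean and standard deviation of the average face when k dice are rolled."""
    dist = sum_distribution(k)
    mean = sum(Fraction(total, k) * p for total, p in dist.items())
    var = sum((Fraction(total, k) - mean) ** 2 * p for total, p in dist.items())
    return float(mean), math.sqrt(var)

summary = {k: mean_and_sd_of_average(k) for k in (1, 2, 5, 10)}
for k, (mean, sd) in summary.items():
    print(f"{k:2d} dice: mean of averages = {mean:.2f}, sd of averages = {sd:.4f}")
```

The mean stays at 3.5 for every number of dice, while the standard deviation equals the single-die standard deviation divided by [latex]\sqrt{k}[/latex].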
To summarize, the central limit theorem for sample means says that if you keep drawing larger and larger samples (such as rolling one, two, five, and finally, ten dice) and calculating their means, the sample means form their own normal distribution (the sampling distribution). The normal distribution has the same mean as the original distribution and a variance that equals the original variance divided by the sample size. Standard deviation is the square root of variance, so the standard deviation of the sampling distribution (a.k.a. standard error) is the standard deviation of the original distribution divided by the square root of [latex]n[/latex]. The variable [latex]n[/latex] is the number of values that are averaged together, not the number of times the experiment is done.
It would be difficult to overstate the importance of the central limit theorem in statistical theory. Knowing that data, even if its distribution is not normal, behaves in a predictable way is a powerful tool.
Suppose [latex]X[/latex] is a random variable with a distribution that may be known or unknown (it can be any distribution). Using a subscript that matches the random variable, suppose:
- [latex]\mu_x = \text{the mean of } X[/latex]
- [latex]\sigma_x = \text{the standard deviation of } X[/latex]
If you draw random samples of size [latex]n[/latex], then as [latex]n[/latex] increases, the random variable [latex]\overline{X}[/latex], which consists of sample means, tends to be normally distributed, and [latex]\overline{X} \sim N\left({\mu }_{x}\text{, }\frac{\sigma_x}{\sqrt{n}}\right)[/latex].
To put it more formally, if you draw random samples of size [latex]n[/latex], the distribution of the random variable [latex]\overline{X}[/latex], which consists of sample means, is called the sampling distribution of the mean. The sampling distribution of the mean approaches a normal distribution as [latex]n[/latex], the sample size, increases.
The random variable [latex]\overline{X}[/latex] has a different z-score associated with it from that of the random variable X. The mean [latex]\overline{x}[/latex] is the value of [latex]\overline{X}[/latex] in one sample.
[latex]z=\frac{\overline{x}-{\mu }_{x}}{(\frac{{\sigma }_{x}}{\sqrt{n}})}[/latex]
[latex]\mu_x[/latex] is the average of both X and [latex]\overline{X}[/latex].
[latex]\sigma_{\overline{x}} = \frac{\sigma_x}{\sqrt{n}}[/latex] = standard deviation of [latex]\overline{X}[/latex] and is called the standard error of the mean.
Using the CLT
It is important to understand when to use the central limit theorem:
- If you are being asked to find the probability of an individual value, do not use the CLT. Use the distribution of its random variable.
- If you are being asked to find the probability of the mean of a sample, then use the CLT for the mean.
The random variable [latex]\overline{X}[/latex] has a different z-score formula associated with it from that of a single observation. Remember, [latex]\overline{x}[/latex] is the mean of one sample and [latex]\mu_x[/latex] is the average, or center, of both [latex]X[/latex] (the original distribution) and [latex]\overline{X}[/latex].
[latex]z=\frac{\overline{x}-{\mu }_{x}}{(\frac{{\sigma }_{x}}{\sqrt{n}})}[/latex]
We can then standardize and use the Z table just as before (or use your technology of choice).
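The difference between the two z-score formulas can be made concrete with a short calculation. The sketch below uses Python's standard library. The numbers (a population with mean 50 and standard deviation 10, samples of size 25, and the value 53) are hypothetical, and the individual-value probability additionally assumes that [latex]X[/latex] itself is normal.

```python
import math
from statistics import NormalDist

def z_individual(x, mu, sigma):
    """z-score of a single observation x."""
    return (x - mu) / sigma

def z_sample_mean(xbar, mu, sigma, n):
    """z-score of a sample mean xbar, using the standard error sigma / sqrt(n)."""
    return (xbar - mu) / (sigma / math.sqrt(n))

# Hypothetical values chosen only to illustrate the two formulas.
mu, sigma, n = 50, 10, 25

z_x = z_individual(53, mu, sigma)          # same value 53, treated as one observation
z_xbar = z_sample_mean(53, mu, sigma, n)   # same value 53, treated as a sample mean

standard_normal = NormalDist()             # mean 0, standard deviation 1
p_x_below = standard_normal.cdf(z_x)       # valid only if X itself is normal (assumed)
p_xbar_below = standard_normal.cdf(z_xbar)

print(f"individual: z = {z_x:.2f}, P(X < 53) = {p_x_below:.4f}")
print(f"sample mean: z = {z_xbar:.2f}, P(Xbar < 53) = {p_xbar_below:.4f}")
```

The same value sits much farther out in the distribution of sample means, because the standard error is smaller than the population standard deviation.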
Note
It is important for you to understand when to use the central limit theorem. If you are being asked to find the probability of the mean, use the CLT for the mean. This also applies to percentiles for means. If you are being asked to find the probability of an individual value, do not use the CLT. Use the distribution of its random variable.
Example
An unknown distribution has a mean of 90 and a standard deviation of 15. Samples of size [latex]n = 25[/latex] are drawn randomly from the population.
- Find the probability that the sample mean is between 85 and 92.
- Find [latex]P(85 \lt \overline{x} \lt 92)[/latex]. Draw a graph.
- Find the value that is two standard deviations above the expected value, 90, of the sample mean.
Solution
- Let [latex]X =[/latex] one value from the original unknown population. The probability question asks you to find a probability for the sample mean. The standard error of the mean is [latex]\frac{\sigma_x}{\sqrt{n}} = \frac{15}{\sqrt{25}} = 3[/latex]. Recall that the standard error of the mean describes how far, on average, the sample mean will be from the population mean in repeated simple random samples of size [latex]n[/latex]. Let [latex]\overline{X} = \text{the mean of a sample of size 25}[/latex]. Since [latex]\mu_x = 90[/latex], [latex]\sigma_x = 15[/latex], and [latex]n = 25[/latex], [latex]\overline{X} \sim N \left(90, \frac{15}{\sqrt{25}} \right)[/latex].
- This is a "between" problem. You will need to find two z-scores, their corresponding probabilities, and then subtract. [latex]Z_1 = \frac{85 - 90}{\frac{15}{\sqrt{25}}} \approx -1.67[/latex]. [latex]Z_2 = \frac{92 - 90}{\frac{15}{\sqrt{25}}} \approx 0.67[/latex]. The probability that the sample mean is between 85 and 92 is [latex]0.7475 - 0.0478 = 0.6997[/latex]. (These probabilities use the unrounded z-scores; a Z table with the rounded values gives [latex]0.7486 - 0.0475 = 0.7011[/latex].) The graph is below.
- To find the value that is two standard deviations above the expected value 90, use the formula:
- [latex]\text{value}= \mu_x + (\text{\# of STDEVs}) \left(\frac{\sigma_x}{\sqrt{n}} \right)[/latex]
- [latex]\text{value}= 90 + (2) \left(\frac{15}{\sqrt{25}} \right) = 96[/latex]
- The value that is two standard deviations above the expected value is 96.
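The calculations in this solution can be reproduced with technology. The following is a minimal sketch using `statistics.NormalDist` from Python's standard library; the variable names are illustrative, and the normal model for the sample mean is the CLT approximation described above.

```python
import math
from statistics import NormalDist

mu, sigma, n = 90, 15, 25

# Standard error of the mean: sigma / sqrt(n)
se = sigma / math.sqrt(n)

# Approximate sampling distribution of the sample mean under the CLT
xbar_dist = NormalDist(mu, se)

# P(85 < xbar < 92): area to the left of 92 minus area to the left of 85
p_between = xbar_dist.cdf(92) - xbar_dist.cdf(85)

# Value two standard errors above the expected value of the sample mean
two_sd_above = mu + 2 * se

print(f"standard error = {se}")
print(f"P(85 < xbar < 92) = {p_between:.4f}")
print(f"two standard deviations above the mean = {two_sd_above}")
```

Because the z-scores are not rounded before the areas are found, the result agrees with the calculator-style answer rather than with a table lookup on rounded z-scores.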

Your Turn!
An unknown distribution has a mean of 45 and a standard deviation of eight. Samples of size [latex]n = 30[/latex] are drawn randomly from the population. Find the probability that the sample mean is between 42 and 50.
Solution
[latex]P(42 \lt \overline{x} \lt 50) = 0.9797[/latex]
Section 6.3 Review
In a population whose distribution may be known or unknown, if the size ([latex]n[/latex]) of samples is sufficiently large, the distribution of the sample means will be approximately normal. The mean of the sample means will equal the population mean. The standard deviation of the distribution of the sample means, called the standard error of the mean, is equal to the population standard deviation divided by the square root of the sample size ([latex]n[/latex]).
Formula Review
- The Central Limit Theorem for Sample Means: [latex]\overline{X} \sim N \left({\mu }_{x}, \frac{\sigma_x}{\sqrt{n}}\right)[/latex]
- The Mean of [latex]\overline{X}[/latex]: [latex]\mu_x[/latex]
- Central Limit Theorem for Sample Means z-score: [latex]z=\frac{\overline{x}-{\mu }_{x}}{\left(\frac{{\sigma }_{x}}{\sqrt{n}}\right)}[/latex]
- Standard Error of the Mean (Standard Deviation of [latex]\overline{X}[/latex]): [latex]\frac{{\sigma }_{x}}{\sqrt{n}}[/latex]
Section 6.3 Practice
Previously, De Anza statistics students estimated that the amount of change daytime statistics students carry is exponentially distributed with a mean of $0.88. Suppose that we randomly pick 25 daytime statistics students.
- In words, [latex]X = \underline{\hspace{2cm}}[/latex]
- [latex]X \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- In words, [latex]\overline{X} = \underline{\hspace{2cm}}[/latex]
- [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Find the probability that an individual had between $0.80 and $1.00. Graph the situation, and shade in the area to be determined.
- Find the probability that the average of the 25 students was between $0.80 and $1.00. Graph the situation, and shade in the area to be determined.
- Explain why there is a difference in part e and part f.
Solution
- [latex]X =[/latex] amount of change students carry
- [latex]X \sim E(0.88, 0.88)[/latex]
- [latex]\overline{X} = \text{average amount of change carried by a sample of 25 students}[/latex]
- [latex]\overline{X} \sim N(0.88, 0.176)[/latex]
- 0.0819
- 0.4276
- The distributions are different. The distribution in part e is exponential and the distribution in part f is normal.
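The two probabilities in this exercise come from different distributions, which a short calculation makes visible. The sketch below uses only Python's standard library. It relies on the fact that an exponential distribution's standard deviation equals its mean, and on the CLT normal approximation for the average of 25 students; `exponential_cdf` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

mean_change = 0.88   # population mean in dollars; for an exponential, sd = mean
n = 25

def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean."""
    return 1 - math.exp(-x / mean)

# Individual student: use the exponential distribution of X itself.
p_individual = exponential_cdf(1.00, mean_change) - exponential_cdf(0.80, mean_change)

# Average of 25 students: use the CLT, with standard error sd / sqrt(n).
xbar_dist = NormalDist(mean_change, mean_change / math.sqrt(n))
p_average = xbar_dist.cdf(1.00) - xbar_dist.cdf(0.80)

print(f"P(0.80 < X < 1.00) = {p_individual:.4f}")
print(f"P(0.80 < Xbar < 1.00) = {p_average:.4f}")
```

The average is far more likely than an individual value to land in this interval, because the sample means cluster tightly around $0.88.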
Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet. We randomly sample 49 fly balls.
- If [latex]\overline{X}[/latex] = average distance in feet for 49 fly balls, then [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- What is the probability that the 49 balls traveled an average of less than 240 feet? Sketch the graph. Scale the horizontal axis for [latex]\overline{X}[/latex]. Shade the region corresponding to the probability. Find the probability.
- Find the 80th percentile of the distribution of the average of 49 fly balls.
Solution
- [latex]N(250, \frac{50}{7})[/latex]
- 0.0808
- 256.01 feet
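The probability and the percentile for the fly ball averages can be checked with technology. This sketch uses `statistics.NormalDist` from Python's standard library, where `cdf` gives the area to the left of a value and `inv_cdf` gives the value with a given area to its left; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 250, 50, 49

# Sampling distribution of the average distance of 49 fly balls
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

# Probability that the average is less than 240 feet
p_below_240 = xbar_dist.cdf(240)

# 80th percentile: the average distance with 80% of the area to its left
percentile_80 = xbar_dist.inv_cdf(0.80)

print(f"P(Xbar < 240) = {p_below_240:.4f}")
print(f"80th percentile = {percentile_80:.2f} feet")
```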
According to the Internal Revenue Service, the average length of time for an individual to complete (keep records for, learn, prepare, copy, assemble, and send) IRS Form 1040 is 10.53 hours (without any attached schedules). The distribution is unknown. Let us assume that the standard deviation is two hours. Suppose we randomly sample 36 taxpayers.
- In words, [latex]X = \underline{\hspace{2cm}}[/latex]
- In words, [latex]\overline{X} = \underline{\hspace{2cm}}[/latex]
- [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Would you be surprised if the 36 taxpayers finished their Form 1040s in an average of more than 12 hours? Explain why or why not in complete sentences.
- Would you be surprised if one taxpayer finished his or her Form 1040 in more than 12 hours? In a complete sentence, explain why.
Solution
- length of time for an individual to complete IRS Form 1040, in hours.
- mean length of time for a sample of 36 taxpayers to complete IRS Form 1040, in hours.
- [latex]N\left(10.53, \frac{1}{3}\right)[/latex]
- Yes. I would be surprised, because the probability is almost 0.
- No. I would not be totally surprised because, assuming individual completion times are approximately normal, the probability is 0.2312.
In 1940 the average size of a U.S. farm was 174 acres. Let’s say that the standard deviation was 55 acres. Suppose we randomly survey 38 farmers from 1940.
- In words, [latex]X = \underline{\hspace{2cm}}[/latex]
- In words, [latex]\overline{X} = \underline{\hspace{2cm}}[/latex]
- [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- The IQR for [latex]\overline{X}[/latex] is from [latex]\underline{\hspace{2cm}}[/latex] acres to [latex]\underline{\hspace{2cm}}[/latex] acres.
Solution
- the size of a U.S. farm in 1940
- the average size, in acres, of a sample of 38 U.S. farms in 1940
- [latex]N(174, 8.9222)[/latex]
- 168.0, 180.0
Determine which of the following are true and which are false. Then, in complete sentences, justify your answers.
- When the sample size is large, the mean of [latex]\overline{X}[/latex] is approximately equal to the mean of [latex]X[/latex].
- When the sample size is large, [latex]\overline{X}[/latex] is approximately normally distributed.
- When the sample size is large, the standard deviation of [latex]\overline{X}[/latex] is approximately the same as the standard deviation of [latex]X[/latex].
Solution
- True. The mean of a sampling distribution of the means is approximately the mean of the data distribution.
- True. According to the central limit theorem, the larger the sample, the closer the sampling distribution of the means comes to a normal distribution.
- False. The standard deviation of [latex]\overline{X}[/latex] is [latex]\frac{\sigma_x}{\sqrt{n}}[/latex], so it decreases as the sample size increases and becomes much smaller than the standard deviation of [latex]X[/latex], not approximately the same.
The percentage of fat calories that a person in America consumes each day is normally distributed with a mean of about 36 and a standard deviation of about ten. Suppose that 16 individuals are randomly chosen. Let [latex]\overline{X}[/latex] = average percent of fat calories.
- [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- For the group of 16, find the probability that the average percent of fat calories consumed is more than five. Graph the situation and shade in the area to be determined.
- Find the first quartile for the average percent of fat calories.
Solution
- [latex]N(36, 2.5)[/latex]
- 1
- 34.31
The distribution of income in some Third World countries is considered wedge shaped (many very poor people, very few middle income people, and even fewer wealthy people). Suppose we pick a country with a wedge shaped distribution. Let the average salary be $2,000 per year with a standard deviation of $8,000. We randomly survey 1,000 residents of that country.
- In words, [latex]X = \underline{\hspace{2cm}}[/latex]
- In words, [latex]\overline{X} = \underline{\hspace{2cm}}[/latex]
- [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- How is it possible for the standard deviation to be greater than the average?
- Why is it more likely that the average of the 1,000 residents will be from $2,000 to $2,100 than from $2,100 to $2,200?
Solution
- X = the yearly income of someone in a third world country
- the average salary from samples of 1,000 residents of a third world country
- [latex]\overline{X} \sim N \left(2000, \frac{8000}{\sqrt{1000}}\right)[/latex]
- Very wide differences in data values can have averages smaller than standard deviations.
- The distribution of the sample mean will have higher probabilities closer to the population mean.
[latex]P(2000 \lt \overline{X} \lt 2100) = 0.1537[/latex]; [latex]P(2100 \lt \overline{X} \lt 2200) = 0.1317[/latex]
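The comparison between the two intervals can be verified numerically. The sketch below uses Python's standard library and the CLT normal approximation for the average of 1,000 residents; `prob_between` is a hypothetical helper introduced for this illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2000, 8000, 1000

# Sampling distribution of the average salary for samples of 1,000 residents
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

def prob_between(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

p_near = prob_between(xbar_dist, 2000, 2100)   # interval that starts at the mean
p_far = prob_between(xbar_dist, 2100, 2200)    # equally wide interval farther out

print(f"P(2000 < Xbar < 2100) = {p_near:.4f}")
print(f"P(2100 < Xbar < 2200) = {p_far:.4f}")
```

Both intervals are $100 wide, but the one closer to the population mean covers more area under the normal curve.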
Which of the following is NOT TRUE about the distribution for averages?
- The mean, median, and mode are equal.
- The area under the curve is one.
- The curve never touches the x-axis.
- The curve is skewed to the right.
Solution
The curve is skewed to the right.
The cost of unleaded gasoline in the Bay Area once followed an unknown distribution with a mean of $4.59 and a standard deviation of $0.10. Sixteen gas stations from the Bay Area are randomly chosen. We are interested in the average cost of gasoline for the 16 gas stations.
The distribution to use for the average cost of gasoline for the 16 gas stations is:
- [latex]\overline{X} \sim N(4.59, 0.10)[/latex]
- [latex]\overline{X} \sim N \left(4.59, \frac{0.10}{\sqrt{16}}\right)[/latex]
- [latex]\overline{X} \sim N \left(4.59, \frac{16}{0.10}\right)[/latex]
- [latex]\overline{X} \sim N \left(4.59, \frac{\sqrt{16}}{0.10}\right)[/latex]
Solution
b. [latex]\overline{X} \sim N \left(4.59, \frac{0.10}{\sqrt{16}}\right)[/latex]
References
Baran, Daya. “20 Percent of Americans Have Never Used Email.” WebGuild, 2010. Available online at http://www.webguild.org/20080519/20-percent-of-americans-have-never-used-email (accessed May 17, 2013).
Data from The Flurry Blog, 2013. Available online at http://blog.flurry.com (accessed May 17, 2013).
Data from the United States Department of Agriculture.
Law of Large Numbers: As the number of trials in a probability experiment increases, the difference between the theoretical probability of an event and the relative frequency probability approaches zero.
Central Limit Theorem: Given a random variable (RV) with known mean [latex]\mu[/latex] and known standard deviation [latex]\sigma[/latex], suppose we are sampling with size [latex]n[/latex] and are interested in two new RVs: the sample mean, [latex]\overline{X}[/latex], and the sample sum, [latex]\Sigma X[/latex]. If the size [latex]n[/latex] of the sample is sufficiently large, then [latex]\overline{X} \sim N\left(\mu \text{,}\frac{\sigma }{\sqrt{n}}\right)[/latex] and [latex]\Sigma X \sim N\left(n\mu ,\sqrt{n}\sigma \right)[/latex]. In that case, the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. The mean of the sample means will equal the population mean and the mean of the sample sums will equal [latex]n[/latex] times the population mean. The standard deviation of the distribution of the sample means, [latex]\frac{\sigma }{\sqrt{n}}[/latex], is called the standard error of the mean.
Normal Distribution: a continuous random variable (RV) with pdf [latex]f\text{(}x\text{)}=\frac{1}{\sigma \sqrt{2\pi }}{e}^{-{\left(x-\mu \right)}^{2}/\left(2{\sigma }^{2}\right)}[/latex], where [latex]\mu[/latex] is the mean of the distribution and [latex]\sigma[/latex] is the standard deviation; notation: [latex]X \sim N(\mu, \sigma)[/latex]. If [latex]\mu = 0[/latex] and [latex]\sigma = 1[/latex], the RV is called the standard normal distribution.
Average: a number that describes the central tendency of the data; there are a number of specialized averages, including the arithmetic mean, weighted mean, median, mode, and geometric mean.
Standard Error of the Mean: the standard deviation of the distribution of the sample means, or [latex]\frac{\sigma }{\sqrt{n}}[/latex].