Chapter 6: The Normal Distribution and The Central Limit Theorem
6.2 Using the Normal Distribution
Learning Objectives
By the end of this section, the student should be able to:
- Find areas under the normal distribution curve.
- Use the inverse normal to identify values for a given area under the normal curve.
- Compute probabilities and percentiles for real-world applications using the normal distribution.
Finding Normal Probabilities
The shaded area in the following graph indicates the area to the left of [latex]x[/latex]. This area is represented by the probability [latex]P(X \lt x)[/latex], which is the value of the cumulative distribution function at [latex]x[/latex]. It is calculated by a calculator or computer, or it is looked up in a table. Technology has made the tables virtually obsolete, but we use them in this section to build an understanding of how these probabilities are found.

The area to the right is then [latex]P(X > x) = 1 - P(X \lt x)[/latex]. Remember, [latex]P(X \lt x) = \text{Area to the left of the vertical line through x}[/latex], and [latex]P(X > x) = 1 - P(X \lt x)= \text{Area to the right of the vertical line through x}[/latex]. For continuous distributions, [latex]P(X \lt x)[/latex] is the same as [latex]P(X \le x)[/latex], and [latex]P(X > x)[/latex] is the same as [latex]P(X \ge x)[/latex].
Note
If the area to the left is 0.0228, then the area to the right is [latex]1 - 0.0228 = 0.9772[/latex].
Your Turn!
If the area to the left of [latex]x[/latex] is 0.012, then what is the area to the right?
Solution
[latex]1 - 0.012 = 0.988[/latex]
There are 3 main ways we can find probabilities associated with the Normal Distribution. These include:
- Math (via Calculus Integration)
- The Standardizing Process
- Technology
We would like to avoid complicated math if possible, so option 1 is out. Also, Calculus Integration is beyond the scope of this textbook.
In order to avoid the math, a process called "Standardizing" can be used (option 2). This involves Z-scores, the Standard Normal Distribution, and Tables. Although this tried-and-true process is now somewhat antiquated, it is a great place to start.
There are many technologies such as calculators and various statistical software that let us skip the entire standardizing process and instantaneously provide us with a probability. Although we typically have these at our disposal to use in practice, it is good to understand the process going on behind the scenes to make sure we apply our technology correctly.
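As one illustration of the technology option, the sketch below uses the `NormalDist` class from Python's standard `statistics` module to find an area to the left and then applies the complement rule. The z-score of -2 is a value chosen only for this illustration; any other software or calculator with a normal CDF function would serve equally well.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

# Area to the left of z = -2 (an illustrative value chosen for this sketch).
area_left = Z.cdf(-2)

# Complement rule: the area to the right is one minus the area to the left.
area_right = 1 - area_left

print(round(area_left, 4))   # 0.0228
print(round(area_right, 4))  # 0.9772
```

The two printed values add to one, which is the same relationship used in the Note above.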
The Standardizing Process
The standard normal distribution (SND) is the simplest form of the normal distribution you can think of. Recall some important facts from the previous section:
- The mean for the standard normal distribution is zero, and the standard deviation is one.
- If [latex]X[/latex] is a normally distributed random variable and [latex]X \sim N( \mu, \sigma)[/latex], then the z-score is: [latex]z=\frac{x-\mu }{\sigma }[/latex]. The transformation [latex]z = \frac{x-\mu }{\sigma }[/latex] produces the distribution [latex]Z \sim N(0, 1)[/latex]. The value x in the given equation comes from a normal distribution with mean [latex]\mu[/latex] and standard deviation [latex]\sigma[/latex].
- A z-score is measured in units of the standard deviation.
- A z-score tells you how many standard deviations the value [latex]x[/latex] is above (to the right of) or below (to the left of) the mean, [latex]\mu[/latex]. Values of [latex]x[/latex] that are larger than the mean have positive z-scores, and values of [latex]x[/latex] that are smaller than the mean have negative z-scores. If [latex]x[/latex] equals the mean, then [latex]x[/latex] has a z-score of zero.
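The facts above can be sketched in a few lines of Python. The function name `z_score` and the sample values (a mean of 5 and a standard deviation of 2) are assumptions made only for this illustration.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies above (+) or below (-) mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical distribution with mean 5 and standard deviation 2.
above = z_score(11, mu=5, sigma=2)    # value larger than the mean
at_mean = z_score(5, mu=5, sigma=2)   # value equal to the mean
below = z_score(1, mu=5, sigma=2)     # value smaller than the mean

print(above, at_mean, below)          # 3.0 0.0 -2.0
```

A value above the mean gives a positive z-score, a value at the mean gives zero, and a value below the mean gives a negative z-score.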
We have the Z table at our disposal with probabilities already calculated and organized. Note that some Z tables (like in Figure 2) give us the left-tailed CDF, or "less than" probability. For example, using Figure 6.4 below, the area to the left of a z-score of -3.37 is [latex]P(Z \le -3.37) = 0.0004[/latex].

We can then use these CDF values, [latex]P(Z \le z)[/latex], and some probability rules to find greater than [latex]P(Z \ge z) = 1 - P(Z \le z)[/latex] or in between [latex]P(a \le Z \le b) = P(Z \le b) - P(Z \le a)[/latex] probabilities.
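These two rules can be written as small helper functions. The sketch below assumes Python's standard `statistics.NormalDist` as a stand-in for the Z table; the function names are hypothetical, and the z-scores of -2 and 2 in the last line are chosen only for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def prob_less(z):
    """P(Z <= z): the left-tail value a left-tailed Z table lists."""
    return Z.cdf(z)

def prob_greater(z):
    """P(Z >= z) = 1 - P(Z <= z)."""
    return 1 - prob_less(z)

def prob_between(a, b):
    """P(a <= Z <= b) = P(Z <= b) - P(Z <= a), assuming a <= b."""
    return prob_less(b) - prob_less(a)

print(round(prob_less(-3.37), 4))     # 0.0004
print(round(prob_greater(-3.37), 4))  # 0.9996
print(round(prob_between(-2, 2), 4))  # 0.9545
```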
Here is a link to another Z table example: Normal Table (Source: NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/, January 3, 2009). This table gives positive z-scores and the area in the table is the area under the normal curve from 0 (the mean) to that z-score. Using this table takes some practice and understanding symmetry and area under the normal curve.
A few examples are provided on the table link (Source: NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/, January 3, 2009):
- The area to the left of a Z score of 1.53 can be found by going to 1.5 in the X column, go right to the 0.03 column to find the value 0.43699. Now add 0.5 (for the probability less than zero) to obtain the final result of 0.93699. [latex]P(Z \le 1.53) = 0.93699[/latex].
- The probability that z is less than or equal to -1.53 (also called the area to the left of -1.53) can also be found using this table. Remember that the curve is symmetrical, so [latex]P(Z \ge z) = P(Z \le -z)[/latex]. Since we found [latex]P(Z \le 1.53) = 0.93699[/latex], then [latex]P(Z \ge 1.53) = 1 - 0.93699 = 0.06301 = P(Z \le -1.53)[/latex].
- Finding probabilities (area) between values is also possible. To find the probability that z is between -1 and 0.5, look up the values for 0.5 ([latex]0.5 + 0.19146 = 0.69146[/latex]) and for -1 ([latex]1 - (0.5 + 0.34134) = 0.15866[/latex]). Then subtract the results ([latex]0.69146 - 0.15866[/latex]) to obtain the result 0.5328.
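The three table lookups above can be checked with a short sketch. It assumes Python's standard `statistics.NormalDist` in place of the printed table, and the helper name `area_zero_to` is hypothetical; it mimics a table that lists the area from 0 (the mean) to a positive z-score.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

def area_zero_to(z):
    """Area under the standard normal curve between 0 and a positive z."""
    return Z.cdf(z) - 0.5

# Left-tail area for a positive z: table entry plus the 0.5 below the mean.
left_of_1_53 = 0.5 + area_zero_to(1.53)

# Left-tail area for the matching negative z, using symmetry.
left_of_neg_1_53 = 1 - left_of_1_53

# Area between -1 and 0.5: left tail at 0.5 minus left tail at -1.
left_of_0_5 = 0.5 + area_zero_to(0.5)
left_of_neg_1 = 1 - (0.5 + area_zero_to(1))
between = left_of_0_5 - left_of_neg_1

print(round(area_zero_to(1.53), 5))  # 0.43699
print(round(left_of_neg_1_53, 5))    # 0.06301
print(round(between, 4))             # 0.5328
```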
To use any of these Z tables with a non-standard normal distribution (either the location parameter is not 0 or the scale parameter is not 1), standardize your value by subtracting the mean and dividing the result by the standard deviation. Then look up the value for this standardized value. (Source: NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/, January 3, 2009)
Another table provided in this book is found in the Back Matter - Statistics Tables.
Example
Use the Z table (Normal Table) or the one in this book found in the Back Matter - Statistics Tables, to find the following probabilities:
- [latex]P(Z \le -0.54)[/latex]
- [latex]P(Z \ge 1.2)[/latex]
- [latex]P(-1.5 \le Z \le 0.84)[/latex]
Solution
- The area between 0 and 0.54 can be found by going to 0.5 in the X column, then right to the 0.04 column, to find the value 0.20540. Adding 0.5 to that value ([latex]0.5 + 0.20540 = 0.7054[/latex]) gives the entire area to the left of 0.54, so [latex]P(Z \le 0.54) = 0.7054[/latex]. By symmetry, the area to the left of -0.54 equals the area to the right of 0.54, so [latex]P(Z \le -0.54) = 1 - 0.7054 = 0.2946[/latex].
- The area between 0 and 1.2 can be found by going to 1.2 in the X column, then right to the 0.00 column, to find the value 0.38493. Since the area to the right of 0 equals 0.5, the area to the right of 1.2 must be [latex]P(Z \ge 1.2) = 0.5 - 0.38493 = 0.11507[/latex].
- The area between 0 and 0.84 is found by going to 0.8 in the X column, then right to the 0.04 column, to find the value 0.29955. The area between -1.5 and 0 is the same as the area between 0 and 1.5, which the table gives as 0.43319. Adding the two areas gives [latex]P(-1.5 \le Z \le 0.84) = 0.29955 + 0.43319 = 0.73274[/latex].
Your Turn!
Use the Z table (found in the Back Matter - Statistics Tables) to find the following probabilities:
- [latex]P(Z \le 1)[/latex]
- [latex]P(Z \ge 1)[/latex]
- [latex]P(-1 \le Z \le 1)[/latex]
Solution
- 0.8413
- 0.1587
- 0.6826
So far, we have seen the idea that we can convert any normal distribution with any mean and standard deviation to the standard normal distribution in units of z-scores. We also have the associated probabilities in our Z table. Essentially, the work has been done for us if we know how to standardize and look up the associated probability in the table. The general process is: [latex]X \sim N(\mu, \sigma) \rightarrow Z \sim N(0, 1) \rightarrow \text{Probability from Z table}[/latex]
This process, while maybe outdated in our technology age, is good for beginners to understand and useful when we do not have access to technology.
Example
Height and weight are two measurements used to track a child’s development. The World Health Organization measures child development by comparing the weights of children who are the same height and the same gender. In 2009, weights for all 80 cm girls in the reference population had a mean µ = 10.2 kg and standard deviation σ = 0.8 kg. Weights are normally distributed. X ~ N(10.2, 0.8). Calculate the z-scores that correspond to the following weights, then find the associated probabilities.
- The probability that a child weighs less than 11 kg
- The probability that a child weighs more than 7.9 kg
- The probability that a child weighs between 11.2 and 12.2 kg
Solution
- [latex]\frac{11 - 10.2}{0.8} = 1[/latex]. A child who weighs 11 kg is one standard deviation above the mean of 10.2 kg. [latex]P(Z \le 1) = 0.8413[/latex]
- [latex]\frac{7.9 - 10.2}{0.8} = -2.875[/latex]. A child who weighs 7.9 kg is 2.875 standard deviations below the mean of 10.2 kg. [latex]P(Z \ge -2.88) = 1- P(Z \le -2.88) = 1 - 0.002 = 0.998[/latex]
- [latex]z_1 = \frac{11.2 - 10.2}{0.8} = 1.25[/latex] and [latex]z_2 = \frac{12.2 - 10.2}{0.8} = 2.5[/latex]. A child who weighs 11.2 kg is 1.25 standard deviations above the mean of 10.2 kg, and a child who weighs 12.2 kg is 2.5 standard deviations above the mean. [latex]P( 1.25 \le Z \le 2.5) = P(Z \le 2.5) - P(Z \le 1.25) = 0.9938 - 0.8944 = 0.0994[/latex]
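The solution above can be reproduced in code. This sketch assumes Python's standard `statistics.NormalDist` in place of the Z table; the helper name `standardize` is hypothetical. It also shows that asking the original distribution directly gives the same answer as standardizing first.

```python
from statistics import NormalDist

MU, SIGMA = 10.2, 0.8          # mean and standard deviation of weights, in kg
Z = NormalDist()               # standard normal, N(0, 1)
X = NormalDist(MU, SIGMA)      # original distribution, N(10.2, 0.8)

def standardize(x):
    """Convert a weight in kg to a z-score."""
    return (x - MU) / SIGMA

# Route 1: standardize first, then use the standard normal CDF.
p_less_11 = Z.cdf(standardize(11))
p_more_7_9 = 1 - Z.cdf(standardize(7.9))
p_between = Z.cdf(standardize(12.2)) - Z.cdf(standardize(11.2))

# Route 2: skip standardizing and use the original distribution directly.
p_less_11_direct = X.cdf(11)

print(round(p_less_11, 4))    # 0.8413
print(round(p_more_7_9, 4))   # 0.998
print(round(p_between, 4))    # 0.0994
```

The second probability uses the unrounded z-score of -2.875 rather than the table value for -2.88, and the result still agrees with the table-based answer to three decimal places.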
Your Turn!
The golf scores for a school team were normally distributed with a mean of 68 and a standard deviation of three.
- Find the probability that a randomly selected golfer scored less than 65.
- Find the probability that a golfer scored between 66 and 70.
Solution
- 0.1587
- 0.4950
The “Un-Standardizing” Process
Sometimes we may be given a percentile or z-score and want to work backwards through the standardizing process to find a value on the original distribution. You could call this “un-standardizing” or finding a normal quantile. The process looks like this:
[latex]\text{Probability from Z table} \rightarrow Z \sim N(0, 1) \rightarrow X \sim N(\mu, \sigma)[/latex]
For example, if the mean of a normal distribution is five and the standard deviation is two, what value is three standard deviations above (or to the right of) the mean (z-score = three)? Rearranging the z-score formula, the calculation is as follows:
[latex]x = \mu + (z)(\sigma) = 5 + (3)(2) = 11[/latex]
Often we are given a percentile to find on the original distribution. For example, what if we want to know a value on the previous distribution that corresponds to the 90th percentile? We can look up a probability of 0.9 in the Z table and find a corresponding z-score of approximately 1.28.
[latex]x = \mu + (z)(\sigma) = 5 + (1.28)(2) = 7.56[/latex]
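Both calculations can be sketched in Python. The sketch assumes the standard `statistics.NormalDist` class, whose `inv_cdf` method plays the role of reading the Z table backwards; the function name `unstandardize` is hypothetical.

```python
from statistics import NormalDist

def unstandardize(z, mu, sigma):
    """Convert a z-score back to a value on the original N(mu, sigma) scale."""
    return mu + z * sigma

# Three standard deviations above the mean of N(5, 2).
x_three_sd = unstandardize(3, mu=5, sigma=2)

# 90th percentile: find the z-score with area 0.90 to its left, then convert.
z_90 = NormalDist().inv_cdf(0.90)
x_90 = unstandardize(z_90, mu=5, sigma=2)

# The same percentile taken directly from the original distribution.
x_90_direct = NormalDist(5, 2).inv_cdf(0.90)

print(x_three_sd)         # 11
print(round(z_90, 2))     # 1.28
print(round(x_90, 2))     # 7.56
```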
Your Turn!
A citrus farmer who grows mandarin oranges finds that the diameters of mandarin oranges harvested on his farm follow a normal distribution with a mean diameter of 5.85 cm and a standard deviation of 0.24 cm.
- Find the 90th percentile for the diameters of mandarin oranges.
- The middle 20% of mandarin oranges from this farm have diameters between [latex]\underline{\hspace{2cm}}[/latex] and [latex]\underline{\hspace{2cm}}[/latex].
Solution
- 6.16
- Between 5.79 and 5.91
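A "middle percent" question is really two percentile questions, one for each tail. The sketch below shows one way to organize that, assuming Python's standard `statistics.NormalDist`; the helper names `percentile` and `middle_interval` are hypothetical.

```python
from statistics import NormalDist

def percentile(p, mu, sigma):
    """Value below which a fraction p of an N(mu, sigma) distribution falls."""
    return NormalDist(mu, sigma).inv_cdf(p)

def middle_interval(fraction, mu, sigma):
    """Endpoints enclosing the central `fraction` of an N(mu, sigma) distribution."""
    tail = (1 - fraction) / 2          # area left over in each tail
    return percentile(tail, mu, sigma), percentile(1 - tail, mu, sigma)

MU, SIGMA = 5.85, 0.24                 # mandarin orange diameters, in cm

p90 = percentile(0.90, MU, SIGMA)
low, high = middle_interval(0.20, MU, SIGMA)

print(round(p90, 2))                   # 6.16
print(round(low, 2), round(high, 2))   # 5.79 5.91
```

For the middle 20%, each tail holds 40% of the area, so the endpoints are the 40th and 60th percentiles.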
Videos
Below are helpful videos for the content covered in this section. Videos are provided from YouTube.
- Normal Distribution: Give an Area to Left/Right, Find the Area to the Right/Left
- Complete a Normal Distribution Graph Given Mean and Standard Deviation
- Normal Distribution: Find a Z Score and a Data Value (General)
- Normal Distribution: Find Probabilities Given Z-scores Using Table (Left of Z-score)
- Normal Distribution: Find Probabilities Given Data Values Using Table (Left of Z-score)
- Normal Distribution: Find Probability Using With Z-scores Using Tables
- Normal Distribution: Find Probability of Data Values Using Tables
- Ex 1: Find a Z-score Given the Probability of Z Being Less Than a Given Value
- Ex 2: Find a Z-score Given the Probability of Z Being Greater Than a Given Value
Section 6.2 Review
The normal distribution, which is continuous, is the most important of all the probability distributions. Its graph is bell-shaped. This bell-shaped curve is used in almost all disciplines. Since it is a continuous distribution, the total area under the curve is one. The parameters of the normal distribution are the mean [latex]\mu[/latex] and the standard deviation [latex]\sigma[/latex]. A special normal distribution, called the standard normal distribution, is the distribution of z-scores. Its mean is zero, and its standard deviation is one.
Formula Review
- Normal Distribution: [latex]X \sim N(\mu, \sigma)[/latex] where [latex]\mu[/latex] is the mean and [latex]\sigma[/latex] is the standard deviation.
- Standard Normal Distribution: [latex]Z \sim N(0,1)[/latex].
Section 6.2 Practice
How would you represent the area to the left of one in a probability statement?

Solution
[latex]P(x \lt 1)[/latex]
What is the area to the right of one?

Solution
[latex]1 - P(x \lt 1)[/latex] or [latex]P(x > 1)[/latex]
Is [latex]P(x \lt 1)[/latex] equal to [latex]P(x \le 1)[/latex]? Why?
Solution
Yes, because they are the same in a continuous distribution: [latex]P(x = 1) = 0[/latex]
How would you represent the area to the left of three in a probability statement?

Solution
[latex]P(x \lt 3)[/latex]
What is the area to the right of three?

Solution
[latex]1 - P(x \lt 3)[/latex] or [latex]P(x > 3)[/latex]
If the area to the left of x in a normal distribution is 0.123, what is the area to the right of [latex]x[/latex]?
Solution
[latex]1 - 0.123 = 0.877[/latex]
If the area to the right of x in a normal distribution is 0.543, what is the area to the left of [latex]x[/latex]?
Solution
[latex]1 - 0.543 = 0.457[/latex]
[latex]X \sim N(54, 8)[/latex]
- Find the probability that [latex]x > 56[/latex].
- Find the probability that [latex]x \lt 30[/latex].
- Find the 80th percentile.
- Find the 60th percentile.
Solution
- 0.4013
- 0.0013
- 60.73
- 56.03
[latex]X \sim N(6, 2)[/latex]
Find the probability that [latex]x[/latex] is between three and nine.
Solution
0.8664
[latex]X \sim N(-3, 4)[/latex]
Find the probability that [latex]x[/latex] is between one and four.
Solution
0.1186
[latex]X \sim N(4, 5)[/latex]
Find the maximum of x in the bottom quartile.
Solution
0.6276
The life of Sunshine CD players is normally distributed with a mean of 4.1 years and a standard deviation of 1.3 years. A CD player is guaranteed for three years. We are interested in the length of time a CD player lasts.
- Find the probability that a CD player will break down during the guarantee period. It might be helpful to sketch the situation, label the axes and shade the region corresponding to the probability. Use a normal distribution curve like the one below.
- Find the probability that a CD player will last between 2.8 and six years. It might be helpful to sketch the situation, label the axes and shade the region corresponding to the probability. Use a normal distribution curve like the one below.
- Find the 70th percentile of the distribution for the time a CD player lasts. It might be helpful to sketch the situation, label the axes and shade the region corresponding to the percentile. Use a normal distribution curve like the one below.

Solution
- [latex]P(0 \lt x \lt 3) = 0.1979[/latex] (Note: Use zero for the minimum value of [latex]x[/latex].)
- [latex]P(2.8 \lt x \lt 6) = 0.7694[/latex]
- [latex]P(x \lt k) = 0.70[/latex]. Therefore, [latex]k = 4.78[/latex] years.
The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days.
- What is the probability of spending more than two days in recovery?
- 0.0580
- 0.8447
- 0.0553
- 0.9420
- The 90th percentile for recovery times is?
- 8.89
- 7.07
- 7.99
- 4.32
Solution
- 0.9420
- 7.99
The length of time it takes to find a parking space follows a normal distribution with a mean of five minutes and a standard deviation of two minutes.
- Based upon the given information and numerically justified, would you be surprised if it took less than one minute to find a parking space?
- Yes
- No
- Unable to determine
- Find the probability that it takes at least eight minutes to find a parking space.
- 0.0001
- 0.9270
- 0.1862
- 0.0668
- Seventy percent of the time, it takes more than how many minutes to find a parking space?
- 1.24
- 2.41
- 3.95
- 6.05
Solution
- Yes
- 0.0668
- 3.95
According to a study done by De Anza students, the height for Asian adult males is normally distributed with an average of 66 inches and a standard deviation of 2.5 inches. Suppose one Asian adult male is randomly chosen. Let [latex]X =[/latex] height of the individual.
- [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Find the probability that the person is between 65 and 69 inches. Include a sketch of the graph, and write a probability statement.
- Would you expect to meet many Asian adult males over 72 inches? Explain why or why not, and justify your answer numerically.
Solution
- [latex]X \sim N(66, 2.5)[/latex]
- 0.5404
- No, the probability that an Asian male is over 72 inches tall is 0.0082
IQ is normally distributed with a mean of 100 and a standard deviation of 15. Suppose one individual is randomly chosen. Let [latex]X =[/latex] IQ of an individual.
- [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Find the probability that the person has an IQ greater than 120. Include a sketch of the graph, and write a probability statement.
- MENSA is an organization whose members have the top 2% of all IQs. Find the minimum IQ needed to qualify for the MENSA organization. Sketch the graph, and write the probability statement.
- The middle 50% of IQs fall between what two values? Sketch the graph and write the probability statement.
Solution
- [latex]X \sim N(100, 15)[/latex]
- The probability that a person has an IQ greater than 120 is 0.0918.
- A person has to have an IQ over 130 to qualify for MENSA.
- The middle 50% of IQ scores falls between 89.95 and 110.05.
The percent of fat calories that a person in America consumes each day is normally distributed with a mean of about 36 and a standard deviation of 10. Suppose that one individual is randomly chosen. Let [latex]X =[/latex] percent of fat calories.
- [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Find the probability that the percent of fat calories a person consumes is more than 40. Graph the situation. Shade in the area to be determined.
- Find the maximum number for the lower quarter of percent of fat calories. Sketch the graph and write the probability statement.
Solution
- [latex]X \sim N(36, 10)[/latex]
- The probability that a person consumes more than 40% of their calories as fat is 0.3446.
- Approximately 25% of people consume less than 29.26% of their calories as fat.
Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet.
- If [latex]X =[/latex] distance in feet for a fly ball, then [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- If one fly ball is randomly chosen from this distribution, what is the probability that this ball traveled fewer than 220 feet? Sketch the graph. Scale the horizontal axis X. Shade the region corresponding to the probability. Find the probability.
- Find the 80th percentile of the distribution of fly balls. Sketch the graph, and write the probability statement.
Solution
- [latex]X \sim N(250, 50)[/latex]
- The probability that a fly ball travels less than 220 feet is 0.2743.
- Eighty percent of the fly balls will travel less than 292 feet.
In China, four-year-olds average three hours a day unsupervised. Most of the unsupervised children live in rural areas, considered safe. Suppose that the standard deviation is 1.5 hours and the amount of time spent alone is normally distributed. We randomly select one Chinese four-year-old living in a rural area. We are interested in the amount of time the child spends alone per day.
- In words, define the random variable [latex]X[/latex].
- [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Find the probability that the child spends less than one hour per day unsupervised. Sketch the graph, and write the probability statement.
- What percent of the children spend over ten hours per day unsupervised?
- Seventy percent of the children spend at least how long per day unsupervised?
Solution
- [latex]X =[/latex] number of hours that a Chinese four-year-old in a rural area is unsupervised during the day.
- [latex]X \sim N(3, 1.5)[/latex]
- The probability that the child spends less than one hour a day unsupervised is 0.0918.
- The probability that a child spends over ten hours a day unsupervised is less than 0.0001.
- 2.21 hours
In the 1992 presidential election, Alaska’s 40 election districts averaged 1,956.8 votes per district for President Clinton. The standard deviation was 572.3. (There are only 40 election districts in Alaska.) The distribution of the votes per district for President Clinton was bell-shaped. Let [latex]X =[/latex] number of votes for President Clinton for an election district.
- State the approximate distribution of [latex]X[/latex].
- Is 1,956.8 a population mean or a sample mean? How do you know?
- Find the probability that a randomly selected district had fewer than 1,600 votes for President Clinton. Sketch the graph and write the probability statement.
- Find the probability that a randomly selected district had between 1,800 and 2,000 votes for President Clinton.
- Find the third quartile for votes for President Clinton.
Solution
- [latex]X \sim N(1956.8, 572.3)[/latex]
- This is a population mean, because all election districts are included.
- The probability that a district had less than 1,600 votes for President Clinton is 0.2676.
- 0.1380
- Seventy-five percent of the districts had fewer than 2,340 votes for President Clinton.
Suppose that the duration of a particular type of criminal trial is known to be normally distributed with a mean of 21 days and a standard deviation of 7 days.
- In words, define the random variable [latex]X[/latex].
- [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- If one of the trials is randomly chosen, find the probability that it lasted at least 24 days. Sketch the graph and write the probability statement.
- Sixty percent of all trials of this type are completed within how many days?
Solution
- [latex]X =[/latex] the number of days a particular type of criminal trial will take
- [latex]X \sim N(21, 7)[/latex]
- The probability that a randomly selected trial will last more than 24 days is 0.3336.
- 22.77
Terri Vogel, an amateur motorcycle racer, averages 129.71 seconds per 2.5 mile lap (in a seven-lap race) with a standard deviation of 2.28 seconds. The distribution of her race times is normally distributed. We are interested in one of her randomly selected laps.
- In words, define the random variable [latex]X[/latex].
- [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Find the percent of her laps that are completed in less than 130 seconds.
- The fastest 3% of her laps are under [latex]\underline{\hspace{2cm}}[/latex].
- The middle 80% of her laps are from [latex]\underline{\hspace{2cm}}[/latex] seconds to [latex]\underline{\hspace{2cm}}[/latex] seconds.
Solution
- [latex]X =[/latex] the time (in seconds) of one of Terri Vogel's randomly selected laps
- [latex]X \sim N(129.71, 2.28)[/latex]
- Terri completes 55.17% of her laps in less than 130 seconds.
- The fastest 3% of her laps are under approximately 125.42 seconds.
- 126.79 and 132.63
Thuy Dau, Ngoc Bui, Sam Su, and Lan Voung conducted a survey as to how long customers at Lucky claimed to wait in the checkout line until their turn. Let X = time in line. The ordered real data (in minutes) is below.
0.50, 1.75, 2, 2.25, 2.25, 2.5, 2.75, 3.25, 3.75, 3.75, 4.25, 4.25, 4.25, 4.25, 4.5, 4.75, 4.75, 4.75, 5, 5, 5, 5.25, 5.25, 5.5, 5.5, 5.5, 5.75, 5.75, 6, 6, 6, 6, 6.25, 6.25, 6.5, 6.5, 6.5, 6.75, 6.75, 6.75, 7.25, 7.25, 7.25, 7.75, 8, 8.25, 9.5, 9.5, 9.75, 10.75
- Calculate the sample mean and the sample standard deviation.
- Construct a histogram.
- Draw a smooth curve through the midpoints of the tops of the bars.
- In words, describe the shape of your histogram and smooth curve.
- Let the sample mean approximate [latex]\mu[/latex] and the sample standard deviation approximate [latex]\sigma[/latex]. The distribution of [latex]X[/latex] can then be approximated by [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
- Use the distribution in part e to calculate the probability that a person will wait fewer than 6.1 minutes.
- Determine the cumulative relative frequency for waiting less than 6.1 minutes.
- Why aren’t the answers to part f and part g exactly the same?
- Why are the answers to part f and part g as close as they are?
- If only ten customers have been surveyed rather than 50, do you think the answers to part f and part g would have been closer together or farther apart? Explain your conclusion.
Solution
- mean = 5.51, s = 2.15
- Check student's solution.
- Check student's solution.
- Check student's solution.
- [latex]X \sim N(5.51, 2.15)[/latex]
- 0.6029
- The cumulative relative frequency for less than 6.1 minutes is 0.64.
- The answers to part f and part g are not exactly the same, because the normal distribution is only an approximation to the real one.
- The answers to part f and part g are close, because a normal distribution is an excellent approximation when the sample size is greater than 30.
- The approximation would have been less accurate, because the smaller sample size means that the data does not fit the normal curve as well.
Suppose that Ricardo and Anita attend different colleges. Ricardo’s GPA is the same as the average GPA at his school. Anita’s GPA is 0.70 standard deviations above her school average. In complete sentences, explain why each of the following statements may be false.
- Ricardo’s actual GPA is lower than Anita’s actual GPA.
- Ricardo is not passing because his z-score is zero.
- Anita is in the 70th percentile of students at her college.
Solution
- If the average GPA is less at Anita’s school than it is at Ricardo’s, then Ricardo’s actual score could be higher.
- Passing can be defined differently at different schools. Also, since Ricardo’s z-score is 0, his GPA is actually the average for his school, which is typically a passing GPA.
- Anita’s percentile is higher than the 70th percentile.
The data below shows a sample of the maximum capacity (maximum number of spectators) of sports stadiums. It does not include horse-racing or motor-racing stadiums.
40,000; 40,000; 45,050; 45,500; 46,249; 48,134; 49,133; 50,071; 50,096; 50,466; 50,832; 51,100; 51,500; 51,900; 52,000; 52,132; 52,200; 52,530; 52,692; 53,864; 54,000; 55,000; 55,000; 55,000; 55,000; 55,000; 55,000; 55,082; 57,000; 58,008; 59,680; 60,000; 60,000; 60,492; 60,580; 62,380; 62,872; 64,035; 65,000; 65,050; 65,647; 66,000; 66,161; 67,428; 68,349; 68,976; 69,372; 70,107; 70,585; 71,594; 72,000; 72,922; 73,379; 74,500; 75,025; 76,212; 78,000; 80,000; 80,000; 82,300
- Calculate the sample mean and the sample standard deviation for the maximum capacity of sports stadiums (the data).
- Construct a histogram.
- Draw a smooth curve through the midpoints of the tops of the bars of the histogram.
- In words, describe the shape of your histogram and smooth curve.
- Let the sample mean approximate μ and the sample standard deviation approximate σ. The distribution of X can then be approximated by [latex]X \sim \underline{\hspace{2cm}}( \underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex].
- Use the distribution in part e to calculate the probability that the maximum capacity of sports stadiums is less than 67,000 spectators.
- Determine the cumulative relative frequency that the maximum capacity of sports stadiums is less than 67,000 spectators. Hint: Order the data and count the sports stadiums that have a maximum capacity less than 67,000. Divide by the total number of sports stadiums in the sample.
- Why aren’t the answers to part f and part g exactly the same?
Solution
- mean = 60,136; s = 10,468
- Answers will vary.
- Answers will vary.
- Answers will vary.
- [latex]X \sim N(60136, 10468)[/latex]
- 0.7440
- The cumulative relative frequency is [latex]\frac{43}{60} = 0.717[/latex].
- The answers for part f and part g are not the same, because the normal distribution is only an approximation.
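The comparison between the normal model and the data itself can be sketched as follows. This is a minimal illustration that assumes Python's standard `statistics` module; the variable names are hypothetical, and the list simply retypes the sample given in the exercise.

```python
from statistics import NormalDist, mean, stdev

# Maximum capacities from the exercise (60 stadiums).
capacities = [
    40000, 40000, 45050, 45500, 46249, 48134, 49133, 50071, 50096, 50466,
    50832, 51100, 51500, 51900, 52000, 52132, 52200, 52530, 52692, 53864,
    54000, 55000, 55000, 55000, 55000, 55000, 55000, 55082, 57000, 58008,
    59680, 60000, 60000, 60492, 60580, 62380, 62872, 64035, 65000, 65050,
    65647, 66000, 66161, 67428, 68349, 68976, 69372, 70107, 70585, 71594,
    72000, 72922, 73379, 74500, 75025, 76212, 78000, 80000, 80000, 82300,
]

x_bar = mean(capacities)   # sample mean
s = stdev(capacities)      # sample standard deviation (n - 1 in the denominator)

# Normal model with the sample statistics standing in for mu and sigma.
model = NormalDist(x_bar, s)
p_model = model.cdf(67000)

# Cumulative relative frequency observed directly in the data.
count_below = sum(c < 67000 for c in capacities)
p_observed = count_below / len(capacities)

print(round(x_bar), round(s))
print(round(p_model, 4), round(p_observed, 3))
```

The model probability comes from a smooth curve fitted to the sample, while the observed value is a simple count, so the two are close but not identical.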
An expert witness for a paternity lawsuit testifies that the length of a pregnancy is normally distributed with a mean of 280 days and a standard deviation of 13 days. An alleged father was out of the country from 240 to 306 days before the birth of the child, so the pregnancy would have been less than 240 days or more than 306 days long if he was the father. The birth was uncomplicated, and the child needed no medical intervention.
What is the probability that he was NOT the father? What is the probability that he could be the father? Calculate the z-scores first, and then use those to calculate the probability.
Solution
For [latex]x = 240[/latex], [latex]z = -3.0769[/latex]
For [latex]x = 306[/latex], [latex]z = 2[/latex]
[latex]P(240 \lt x \lt 306) = P(-3.0769 \lt z \lt 2) = \text{normalcdf}(-3.0769,2,0,1) = 0.9762[/latex]
According to the scenario given, this means that there is a 97.62% chance that he is not the father.
To answer the second part of the question, there is a [latex]1 - 0.9762 = 0.0238 = 2.38 \%[/latex] chance that he is the father.
A NUMMI assembly line, which has been operating since 1984, has built an average of 6,000 cars and trucks a week. Generally, 10% of the cars were defective coming off the assembly line. Suppose we draw a random sample of [latex]n = 100[/latex] cars. Let [latex]X[/latex] represent the number of defective cars in the sample.
What can we say about [latex]X[/latex] in regard to the 68-95-99.7 empirical rule (one standard deviation, two standard deviations and three standard deviations from the mean are being referred to)? Assume a normal distribution for the defective cars in the sample.
Solution
- [latex]n = 100; p = 0.1; q = 0.9[/latex]
- [latex]\mu = np = (100)(0.10) = 10[/latex]
- [latex]\sigma = \sqrt{npq} = \sqrt{(100)(0.1)(0.9)} = 3[/latex]
- [latex]z = \pm 1 : x_1 = \mu + z \sigma = 10 + 1(3) = 13[/latex] and [latex]x_2 = \mu - z\sigma = 10 - 1(3) = 7[/latex]; about 68% of the time, the number of defective cars in the sample will fall between seven and 13.
- [latex]z = \pm 2 : x_1 = \mu + z \sigma = 10 + 2(3) = 16[/latex] and [latex]x_2 = \mu - z\sigma = 10 - 2(3) = 4[/latex]; about 95% of the time, the number of defective cars in the sample will fall between four and 16.
- [latex]z = \pm 3 : x_1 = \mu + z \sigma = 10 + 3(3) = 19[/latex] and [latex]x_2 = \mu - z\sigma = 10 - 3(3) = 1[/latex]; about 99.7% of the time, the number of defective cars in the sample will fall between one and 19.
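The steps in this solution follow one pattern: compute [latex]\mu = np[/latex] and [latex]\sigma = \sqrt{npq}[/latex], then step out one, two, and three standard deviations from the mean. A minimal sketch of that pattern, with the hypothetical function name `empirical_rule_intervals`, might look like this:

```python
from math import sqrt

def empirical_rule_intervals(n, p):
    """Return (mu, sigma, intervals) for the number of successes in n trials.

    intervals maps k = 1, 2, 3 to the pair (mu - k*sigma, mu + k*sigma),
    the ranges tied to roughly 68%, 95%, and 99.7% by the empirical rule.
    """
    q = 1 - p
    mu = n * p
    sigma = sqrt(n * p * q)
    intervals = {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}
    return mu, sigma, intervals

# Sample of 100 cars with a 10% defect rate, as in the exercise.
mu, sigma, intervals = empirical_rule_intervals(n=100, p=0.1)

for k, (low, high) in intervals.items():
    print(k, round(low, 2), round(high, 2))
```

The same function applies to any exercise of this type once [latex]n[/latex] and [latex]p[/latex] are identified.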
We flip a coin 100 times ([latex]n = 100[/latex]) and note that it only comes up heads 20% ([latex]p = 0.20[/latex]) of the time. The mean and standard deviation for the number of times the coin lands on heads is [latex]\mu = 20[/latex] and [latex]\sigma = 4[/latex] (verify the mean and standard deviation). Solve the following:
- There is about a 68% chance that the number of heads will be somewhere between [latex]\underline{\hspace{2cm}}[/latex] and [latex]\underline{\hspace{2cm}}[/latex].
- There is about a [latex]\underline{\hspace{2cm}}[/latex] chance that the number of heads will be somewhere between 12 and 28.
- There is about a [latex]\underline{\hspace{2cm}}[/latex] chance that the number of heads will be somewhere between eight and 32.
Solution
- There is about a 68% chance that the number of heads will be somewhere between 16 and 24. [latex]z = \pm 1 : x_1 = \mu + z \sigma = 20 + 1(4) = 24[/latex] and [latex]x_2 = \mu - z\sigma = 20 - 1(4) = 16[/latex].
- There is about a 95% chance that the number of heads will be somewhere between 12 and 28. For this problem: normalcdf(12,28,20,4) = 0.9545 = 95.45%.
- There is about a 99.73% chance that the number of heads will be somewhere between eight and 32. For this problem: normalcdf(8,32,20,4) = 0.9973 = 99.73%.
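For readers without a graphing calculator, the same areas can be found in Python. This is a minimal sketch, assuming Python 3.8 or later, where `statistics.NormalDist` is available in the standard library; the function `normalcdf` below is a hypothetical stand-in written here to mirror the calculator command, not a built-in.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between lower and upper (hypothetical helper)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Coin example: mu = 20, sigma = 4
p_two_sd = normalcdf(12, 28, 20, 4)    # within two standard deviations
p_three_sd = normalcdf(8, 32, 20, 4)   # within three standard deviations
print(round(p_two_sd, 4))    # 0.9545
print(round(p_three_sd, 4))  # 0.9973
```

The area is computed as a difference of two cumulative probabilities, which is the same idea as subtracting two left-tail areas read from a table.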
A $1 scratch-off lotto ticket will be a winner one out of five times. Out of a shipment of [latex]n = 190[/latex] lotto tickets, find the probability that there are
- somewhere between 34 and 54 prizes.
- somewhere between 54 and 64 prizes.
- more than 64 prizes.
Solution
- [latex]n = 190; p = \frac{1}{5} = 0.2; q = 0.8[/latex]
- [latex]\mu = np = (190)(0.2) = 38[/latex]
- [latex]\sigma = \sqrt{npq} = \sqrt{(190)(0.2)(0.8)} = 5.5136[/latex]
- For this problem: [latex]P(34 \lt x \lt 54) = \text{normalcdf}(34,54,38,5.5136) = 0.7641[/latex]
- For this problem: [latex]P(54 \lt x \lt 64) = \text{normalcdf}(54,64,38,5.5136) = 0.0018[/latex]
- For this problem: [latex]P(x > 64) = \text{normalcdf}(64,10^{99},38,5.5136) = 0.0000012[/latex] (approximately 0)
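The lotto calculations can be reproduced as follows. This is a sketch under the same normal approximation, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative only. The results should agree with the calculator values above, possibly differing in the last displayed digit because of rounding.

```python
from math import sqrt
from statistics import NormalDist

n, p = 190, 0.2
mu = n * p                       # np = 38
sigma = sqrt(n * p * (1 - p))    # sqrt(npq), about 5.5136
prizes = NormalDist(mu, sigma)

# Each probability is a difference of cumulative (left-tail) areas
p_34_to_54 = prizes.cdf(54) - prizes.cdf(34)
p_54_to_64 = prizes.cdf(64) - prizes.cdf(54)
p_over_64 = 1 - prizes.cdf(64)   # right tail, so no 10^99 upper bound is needed

print(f"P(34 < x < 54) = {p_34_to_54:.4f}")
print(f"P(54 < x < 64) = {p_54_to_64:.4f}")
print(f"P(x > 64)      = {p_over_64:.7f}")
```

Using the complement for the right tail replaces the calculator convention of entering a very large upper bound.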
Facebook provides a variety of statistics on its website that detail the growth and popularity of the site.
On average, 28 percent of 18- to 34-year-olds check their Facebook profiles before getting out of bed in the morning. Suppose this percentage follows a normal distribution with a standard deviation of 5 percent.
- Find the probability that the percent of 18- to 34-year-olds who check Facebook before getting out of bed in the morning is at least 30.
- Find the 95th percentile, and express it in a sentence.
Solution
[latex]X =[/latex] the percent of 18- to 34-year-olds who check Facebook before getting out of bed in the morning.
[latex]X \sim N(28, 5)[/latex]
- [latex]P(x \ge 30) = 0.3446[/latex]; normalcdf(30,1E99,28,5) = 0.3446
- invNorm(0.95,28,5) = 36.22. About 95% of the time, the percent of 18- to 34-year-olds who check Facebook before getting out of bed in the morning is at most 36.22%.
[latex]P(25 \lt x \lt 55) = \text{normalcdf}(25,55,28,5) = 0.7257[/latex]; [latex](0.7257)(400) = 290.28[/latex]
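The Facebook results can be checked without a calculator. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` provides `cdf` for areas and `inv_cdf` for percentiles (the counterpart of invNorm); the variable names are illustrative.

```python
from statistics import NormalDist

# X ~ N(28, 5), measured in percent
facebook = NormalDist(mu=28, sigma=5)

p_at_least_30 = 1 - facebook.cdf(30)               # right-tail area
pct_95 = facebook.inv_cdf(0.95)                    # 95th percentile
p_25_to_55 = facebook.cdf(55) - facebook.cdf(25)   # area between 25 and 55

print(round(p_at_least_30, 4))  # 0.3446
print(round(pct_95, 2))         # 36.22
print(round(p_25_to_55, 4))     # 0.7257
```

Note that the mean and standard deviation are entered in the same units as [latex]X[/latex] (percent), so the percentile comes back in percent as well.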
References
“Naegele’s rule.” Wikipedia. Available online at http://en.wikipedia.org/wiki/Naegele's_rule (accessed May 14, 2013).
“403: NUMMI.” Chicago Public Media & Ira Glass, 2013. Available online at http://www.thisamericanlife.org/radio-archives/episode/403/nummi (accessed May 14, 2013).
“Scratch-Off Lottery Ticket Playing Tips.” WinAtTheLottery.com, 2013. Available online at http://www.winatthelottery.com/public/department40.cfm (accessed May 14, 2013).
“Smart Phone Users, By The Numbers.” Visual.ly, 2013. Available online at http://visual.ly/smart-phone-users-numbers (accessed May 14, 2013).
“Facebook Statistics.” Statistics Brain. Available online at http://www.statisticbrain.com/facebook-statistics/ (accessed May 14, 2013).
Media Attributions
- Private: Figure 5.12 © Significant Statistics by John Morgan Russell is licensed under a CC BY-SA (Attribution ShareAlike) license
- Private: Figure 5.13 © Significant Statistics by John Morgan Russell is licensed under a CC BY-SA (Attribution ShareAlike) license
Standard Normal Distribution: a continuous random variable (RV) with a mean of 0 and a standard deviation of 1 which z-scores follow: X ~ N(0, 1); when X follows the standard normal distribution, it is often noted as Z ~ N(0, 1).
z-score: the linear transformation of the form [latex]z = \frac{x - \mu}{\sigma}[/latex]; if this transformation is applied to any normal distribution X ~ N(μ, σ), the result is the standard normal distribution Z ~ N(0, 1). If this transformation is applied to any specific value x of the RV with mean μ and standard deviation σ, the result is called the z-score of x. The z-score allows us to compare data that are normally distributed but scaled differently.