Chapter 6: The Normal Distribution and The Central Limit Theorem

6.7 Using Technology for the Central Limit Theorem

Learning Objectives

By the end of this section, the student should be able to:

  • use technology to solve problems involving the Central Limit Theorem

This section introduces technology tools that help solve problems involving the Central Limit Theorem.

Graphing Calculator

As in the other technology sections, the main tool introduced is the TI-83, 83+, 84, or 84+ graphing calculator. The calculator can find both probabilities and percentiles for the Central Limit Theorem for Sample Means and for Sums. The calculator functions are the same ones used in 6.6 Using Technology for the Normal Distribution; what changes is the mean and standard deviation you enter:

The Central Limit Theorem for Sample Means (Averages): [latex]\overline{X} \sim N({\mu }_{x}\text{, }\frac{\sigma_x}{\sqrt{n}})[/latex], where:

  • [latex]\mu_x[/latex] is the average of both X and [latex]\overline{X}[/latex]
  • [latex]\sigma_{\overline{x}} = \frac{\sigma_x}{\sqrt{n}}[/latex] = standard deviation of [latex]\overline{X}[/latex] and is called the standard error of the mean

The Central Limit Theorem for Sums: [latex]\Sigma{X} \sim N \left((n)(\mu_x), (\sqrt{n})(\sigma_X) \right)[/latex], where:

  • [latex](n)(\mu_X) = \text{the mean of } \Sigma{X}[/latex]
  • [latex](\sqrt{n})({\sigma }_{X}) = \text{standard deviation of }\Sigma X[/latex]
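
The two parameter pairs above can also be computed outside the calculator. The following is a minimal Python sketch, not part of the calculator workflow; the function names are hypothetical helpers chosen for illustration, and the inputs (mean 90, standard deviation 15) are sample values only.

```python
import math

def clt_mean_params(mu, sigma, n):
    """Return (mean, standard deviation) of the sample mean for samples of size n."""
    # The standard deviation of the sample mean is the standard error sigma / sqrt(n).
    return mu, sigma / math.sqrt(n)

def clt_sum_params(mu, sigma, n):
    """Return (mean, standard deviation) of the sample sum for samples of size n."""
    # The sum has mean n * mu and standard deviation sqrt(n) * sigma.
    return n * mu, math.sqrt(n) * sigma

# Sample values: population mean 90, population standard deviation 15.
print(clt_mean_params(90, 15, 25))   # parameters for the mean of 25 values
print(clt_sum_params(90, 15, 80))    # parameters for the sum of 80 values
```

Note that the standard deviation shrinks by a factor of the square root of n for means but grows by the same factor for sums.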

 

Note

It is important to know when the central limit theorem applies and to distinguish the CLT for the mean from the CLT for the sum (or total). If you are asked to find a probability for a sample mean, use the CLT for the mean; if you are asked to find a probability for a sum or total, use the CLT for sums. The same applies to percentiles for means and sums.

Using the TI-83, 83+, 84, 84+ Calculator

To find probabilities for means on the calculator, follow these steps.

Access [DISTR] by pressing 2nd key, vars key. (Note: DISTR stands for "Distributions")

2: normalcdf()

[latex]\text{normalcdf(lower value of the area, upper value of the area, mean, }\frac{\text{standard deviation}}{\sqrt{\text{sample size}}})[/latex]

where:

  • mean is the mean of the original distribution
  • standard deviation is the standard deviation of the original distribution
  • sample size = n

Let's see some examples.

Example

An unknown distribution has a mean of 90 and a standard deviation of 15. Samples of size [latex]n = 25[/latex] are drawn randomly from the population. Find the probability that the sample mean is between 85 and 92.

Solution

Let [latex]X =[/latex] one value from the original unknown population. The probability question asks you to find a probability for the sample mean.

Let [latex]\overline{X}[/latex] = the mean of a sample of size 25. Since [latex]\mu_X = 90[/latex], [latex]\sigma_X= 15[/latex], and [latex]n = 25[/latex], [latex]\overline{X} \sim N \left(90, \frac{15}{\sqrt{25}} \right)[/latex].

Find [latex]P(85 \lt \overline{x} \lt 92)[/latex]. Draw a graph.

A normal distribution curve with the peak at 90 and points 85 and 92 have the area between them shaded.
Figure 1. P(85< x-bar < 92) is the shaded region

[latex]\text{normalcdf}(\text{lower value, upper value, mean, standard error of the mean})[/latex]

The parameter list is abbreviated (lower value, upper value, [latex]\mu[/latex], [latex]\frac{\sigma }{\sqrt{n}}[/latex])

[latex]\text{normalcdf} \left(85, 92, 90, \frac{15}{\sqrt{25}} \right) = 0.6997[/latex]
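
The calculator result above can be cross-checked in software. The sketch below assumes Python 3.8 or later and uses the standard library's statistics.NormalDist class; the helper name normalcdf_mean is hypothetical and simply mimics the calculator's argument order, with the division by the square root of n done inside the function.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf_mean(lower, upper, mean, sd, n):
    """P(lower < sample mean < upper) using the CLT normal approximation."""
    # Distribution of the sample mean: N(mean, sd / sqrt(n)).
    xbar = NormalDist(mu=mean, sigma=sd / sqrt(n))
    return xbar.cdf(upper) - xbar.cdf(lower)

# Population mean 90, population standard deviation 15, samples of size 25.
p = normalcdf_mean(85, 92, 90, 15, 25)
print(round(p, 4))  # about 0.6997, in agreement with normalcdf on the calculator
```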

Your Turn!

An unknown distribution has a mean of 45 and a standard deviation of eight. Samples of size [latex]n = 30[/latex] are drawn randomly from the population. Find the probability that the sample mean is between 42 and 50.

Solution

[latex]P(42 \lt \overline{x} \lt 50) = \text{normalcdf}\left(42, 50, 45, \frac{8}{\sqrt{30}}\right) = 0.9797[/latex]

Example

The length of time, in hours, it takes an "over 40" group of people to play one soccer match is normally distributed with a mean of two hours and a standard deviation of 0.5 hours. A sample of size [latex]n = 50[/latex] is drawn randomly from the population. Find the probability that the sample mean is between 1.8 hours and 2.3 hours.

Solution

Let X = the time, in hours, it takes to play one soccer match.

The probability question asks you to find a probability for the sample mean time, in hours, it takes to play one soccer match.

Let [latex]\overline{X}[/latex] = the mean time, in hours, it takes to play one soccer match.

If [latex]\mu_X = \underline{\hspace{2cm}}[/latex], [latex]\sigma_X = \underline{\hspace{2cm}}[/latex], and [latex]n = \underline{\hspace{2cm}}[/latex], then [latex]\overline{X} \sim N(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex] by the central limit theorem for means.

[latex]\mu_X = 2[/latex], [latex]\sigma_X = 0.5[/latex], [latex]n = 50[/latex], and [latex]\overline{X} \sim N \left(2, \frac{0.5}{\sqrt{50}} \right)[/latex]

Find [latex]P(1.8 \lt \overline{x} \lt 2.3)[/latex]. Draw a graph.

[latex]P(1.8 \lt \overline{x} \lt 2.3) = 0.9977[/latex]

[latex]\text{normalcdf} \left(1.8, 2.3, 2, \frac{0.5}{\sqrt{50}} \right) = 0.9977[/latex]

The probability that the mean time is between 1.8 hours and 2.3 hours is 0.9977.

Your Turn!

The length of time taken on the SAT for a group of students is normally distributed with a mean of 2.5 hours and a standard deviation of 0.25 hours. A sample of size [latex]n = 60[/latex] is drawn randomly from the population. Find the probability that the sample mean is between two hours and three hours.

Solution

[latex]P(2 \lt \overline{x} \lt 3)= \text{normalcdf}\left(2, 3, 2.5, \frac{0.25}{\sqrt{60}}\right) \approx 1[/latex]

Using the TI-83, 83+, 84, 84+ Calculator

To find percentiles for means on the calculator, follow these steps.

Access [DISTR] by pressing 2nd key, vars key. (Note: DISTR stands for "Distributions")

3: invNorm()

[latex]k = \text{invNorm} \left( \text{area to the left of } k \text{, mean, } \frac{\text{standard deviation}}{\sqrt{\text{sample size}}} \right)[/latex]

where:

  • k = the kth percentile
  • mean is the mean of the original distribution
  • standard deviation is the standard deviation of the original distribution
  • sample size = n

Example

In a recent study reported Oct. 29, 2012 on the Flurry Blog, the mean age of tablet users is 34 years. Suppose the standard deviation is 15 years. Take a sample of size [latex]n = 100[/latex].

  1. What are the mean and standard deviation for the sample mean ages of tablet users?
  2. What does the distribution look like?
  3. Find the probability that the sample mean age is more than 30 years.
  4. Find the 95th percentile for the sample mean age (to one decimal place).

Solution

  1. Since the sample mean tends to target the population mean, we have [latex]\mu_{\overline{x}} = \mu = 34[/latex]. The standard deviation of the sample mean is given by [latex]\sigma_{\overline{x}} = \frac{\sigma }{\sqrt{n}} = \frac{15}{\sqrt{100}} = \frac{15}{10} = 1.5[/latex].
  2. The central limit theorem states that for large sample sizes (n), the sampling distribution of the sample mean is approximately normal.
  3. The probability that the sample mean age is more than 30 is given by [latex]P(\overline{x} > 30) = \text{normalcdf}(30, 1\text{E}99, 34, 1.5) = 0.9962[/latex]
  4. Let k = the 95th percentile. [latex]k = \text{invNorm}\left(0.95, 34, \frac{15}{\sqrt{100}}\right) = 36.5[/latex]
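
The upper-tail probability and the percentile in this example can be reproduced without a calculator. This is a sketch only, assuming Python 3.8 or later; it uses the standard library's statistics.NormalDist, whose cdf and inv_cdf methods play roughly the roles of normalcdf and invNorm. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 34, 15, 100          # population mean, standard deviation, sample size

# Sampling distribution of the sample mean: N(mu, sigma / sqrt(n)).
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(sample mean > 30) is the complement of the left-tail area at 30,
# so no large upper bound such as 1E99 is needed.
p_more_than_30 = 1 - xbar.cdf(30)

# The 95th percentile is the value with area 0.95 to its left.
k95 = xbar.inv_cdf(0.95)

print(round(p_more_than_30, 4))  # about 0.9962
print(round(k95, 1))             # about 36.5
```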

Your Turn!

In an article on Flurry Blog, a gaming marketing gap for men between the ages of 30 and 40 is identified. You are researching a startup game targeted at the 35-year-old demographic. Your idea is to develop a strategy game that can be played by men from their late 20s through their late 30s. Based on the article’s data, industry research shows that the average strategy player is 28 years old with a standard deviation of 4.8 years. You take a sample of 100 randomly selected gamers. If your target market is 29- to 35-year-olds, should you continue with your development strategy?

Solution

You need to determine the probability that the mean age of a sample of 100 strategy gamers is between 29 and 35 years.

[latex]P(29 \lt \overline{x} \lt 35) = \text{normalcdf} \left(29, 35, 28, \frac{4.8}{\sqrt{100}}\right) = 0.0186[/latex]

You can conclude that there is only about a 1.9% chance that the mean age of a sample of 100 strategy gamers is between 29 and 35.

Example

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of 60.

  1. What are the mean and standard deviation for the sample mean number of minutes of app engagement by a tablet user?
  2. What is the standard error of the mean?
  3. Find the 90th percentile for the sample mean time for app engagement for a tablet user. Interpret this value in a complete sentence.
  4. Find the probability that the sample mean is between eight minutes and 8.5 minutes.

Solution

  1. [latex]\mu_{\overline{x}} = \mu = 8.2[/latex] minutes and [latex]\sigma_{\overline{x}} = \frac{\sigma }{\sqrt{n}} = \frac{1}{\sqrt{60}} = 0.13[/latex] minutes
  2. The standard error of the mean is [latex]\frac{\sigma }{\sqrt{n}} = \frac{1}{\sqrt{60}} = 0.13[/latex] minutes. It allows us to calculate the probability that sample means fall within a particular distance of the population mean, in repeated samples of size 60.
  3. Let k = the 90th percentile. [latex]k = \text{invNorm}\left(0.90, 8.2, \frac{1}{\sqrt{60}}\right) = 8.37[/latex]. This value indicates that 90 percent of the mean app engagement times for samples of 60 tablet users are less than 8.37 minutes.
  4. [latex]P(8 \lt \overline{x} \lt 8.5) = \text{normalcdf} \left(8, 8.5, 8.2, \frac{1}{\sqrt{60}}\right) = 0.9293[/latex]

Your Turn!

Cans of a cola beverage claim to contain 16 ounces. The amounts in a sample are measured and the statistics are [latex]n = 34[/latex], [latex]\overline{x} = 16.01 \text{ ounces}[/latex]. If the cans are filled so that [latex]\mu = 16.00 \text{ ounces}[/latex] (as labeled) and [latex]\sigma = 0.143 \text{ ounces}[/latex], find the probability that a sample of 34 cans will have an average amount greater than 16.01 ounces. Do the results suggest that cans are filled with an amount greater than 16 ounces?

Solution

We have [latex]P(\overline{x} > 16.01) = \text{normalcdf} \left(16.01, 1\text{E}99, 16, \frac{0.143}{\sqrt{34}}\right) = 0.3417[/latex]. If the cans are filled to a mean of 16.00 ounces, a sample mean greater than 16.01 ounces occurs about 34.17% of the time, so the observed sample mean of 16.01 ounces is not unusual. The results do not suggest that the cans are filled with an amount greater than 16 ounces.

Example

Suppose that a market research analyst for a cell phone company conducts a study of their customers who exceed the time allowance included on their basic cell phone contract; the analyst finds that for those people who exceed the time included in their basic contract, the excess time used follows an exponential distribution with a mean of 22 minutes.

Consider a random sample of 80 customers who exceed the time allowance included in their basic cell phone contract.

Let [latex]X =[/latex] the excess time used by one INDIVIDUAL cell phone customer who exceeds his contracted time allowance.

[latex]X \sim \text{Exp}\left(\frac{1}{22}\right)[/latex]. From previous chapters, we know that [latex]\mu = 22[/latex] and [latex]\sigma =  22[/latex].

Let [latex]\overline{X}[/latex] = the mean excess time used by a sample of [latex]n = 80[/latex] customers who exceed their contracted time allowance.

[latex]\overline{X} \sim N \left(22, \frac{22}{\sqrt{80}}\right)[/latex] by the central limit theorem for sample means

  1. Find the probability that the mean excess time used by the 80 customers in the sample is longer than 20 minutes. This is asking us to find [latex]P(\overline{x} > 20)[/latex]. Draw the graph.
  2. Suppose that one customer who exceeds the time limit for his cell phone contract is randomly selected. Find the probability that this individual customer's excess time is longer than 20 minutes. This is asking us to find [latex]P(x > 20)[/latex].
  3. Explain why the probabilities in parts 1 and 2 are different.
  4. Find the 95th percentile for the sample mean excess time for samples of 80 customers who exceed their basic contract time allowances. Draw a graph.

Solution

  1. Find [latex]P(\overline{x} > 20)[/latex]. [latex]P(\overline{x} > 20) = 0.7919[/latex] using [latex]\text{normalcdf}\left(20, 1\text{E}99, 22, \frac{22}{\sqrt{80}}\right)[/latex]. The probability is 0.7919 that the mean excess time used is more than 20 minutes, for a sample of 80 customers who exceed their contracted time allowance. Reminder: [latex]1\text{E}99 = 10^{99}[/latex] and [latex]-1\text{E}99 = -10^{99}[/latex]. Press the [latex]\text{EE}[/latex] key for E, or just use [latex]10^{99}[/latex] instead of [latex]1\text{E}99[/latex].
  2. Find [latex]P(x > 20)[/latex]. Remember to use the exponential distribution for an individual: [latex]X \sim \text{Exp}\left(\frac{1}{22}\right)[/latex]. [latex]P(x > 20) = {e}^{\left(\frac{-1}{22}\right)(20)}[/latex] or [latex]e^{(-0.04545)(20)} = 0.4029[/latex]
  3. The two probabilities differ for the following reasons:
    • [latex]P(x > 20) = 0.4029[/latex] but [latex]P(\overline{x} > 20) = 0.7919[/latex]
    • The probabilities are not equal because we use different distributions to calculate the probability for individuals and for means.
    • When asked to find the probability of an individual value, use the stated distribution of its random variable; do not use the central limit theorem. Use the central limit theorem with the normal distribution when you are asked to find the probability for a mean.
  4. Let k = the 95th percentile. Find k where [latex]P(\overline{x} \lt k) = 0.95[/latex]. [latex]k = 26.0[/latex] using [latex]\text{invNorm}\left(0.95, 22, \frac{22}{\sqrt{80}}\right)[/latex]. The 95th percentile for the sample mean excess time used is about 26.0 minutes for random samples of 80 customers who exceed their contracted time allowance. Ninety-five percent of such samples would have means under 26 minutes; only five percent of such samples would have means above 26 minutes.
A normal curve with a peak at 22. The value 20 has a shaded region to the right of it, representing P(x-bar > 20)
Figure 2. Graph of [latex]P(\overline{x} > 20)[/latex]

 

A normal distribution curve with peak at 22 and a point, k, to the right of 22. The area under the curve to the left of k is shaded showing P(x-bar < k) = 0.95
Figure 3.  Graph of 95th percentile
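
The contrast between the individual probability and the sample-mean probability in this example can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later; the variable names are illustrative. It uses the exponential survival function [latex]P(x > t) = e^{-t/\mu}[/latex] for one customer and the standard library's statistics.NormalDist for the mean of 80 customers.

```python
from math import exp, sqrt
from statistics import NormalDist

mu = 22.0      # mean excess time in minutes (for an exponential, sigma equals mu)
sigma = 22.0
n = 80

# Individual customer: use the exponential distribution directly, not the CLT.
p_individual = exp(-20 / mu)

# Mean of 80 customers: CLT normal approximation N(mu, sigma / sqrt(n)).
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))
p_mean = 1 - xbar.cdf(20)

# 95th percentile of the sample mean.
k95 = xbar.inv_cdf(0.95)

print(round(p_individual, 4))  # about 0.4029
print(round(p_mean, 4))        # about 0.7919
print(round(k95, 1))           # about 26.0
```

The sample mean is far less variable than a single observation, which is why the two probabilities differ so much even though both refer to the same cutoff of 20 minutes.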

Example

Based on data from the National Health Survey, women between the ages of 18 and 24 have an average systolic blood pressure (in mmHg) of 114.8 with a standard deviation of 13.1. Systolic blood pressure for women between the ages of 18 and 24 follows a normal distribution.

  1. If one woman from this population is randomly selected, find the probability that her systolic blood pressure is greater than 120.
  2. If 40 women from this population are randomly selected, find the probability that their mean systolic blood pressure is greater than 120.
  3. If the sample were four women between the ages of 18 to 24 and we did not know the original distribution, could the central limit theorem be used?

Solution

  1. [latex]P(x > 120) = \text{normalcdf}(120, 99999, 114.8, 13.1) = 0.3457[/latex]. There is about a 35% chance that the randomly selected woman will have systolic blood pressure greater than 120.
  2. [latex]P(\overline{x} > 120) = \text{normalcdf} \left(120, 1\text{E}99, 114.8, \frac{13.1}{\sqrt{40}} \right) = 0.006[/latex]. There is only a 0.6% chance that the average systolic blood pressure for the randomly selected group is greater than 120.
  3. The central limit theorem could not be used if the sample size were four and we did not know the original distribution was normal. The sample size would be too small.

Your Turn!

According to Boeing data, the 757 airliner carries 200 passengers and has doors with a mean height of 72 inches. Assume for a certain population of men we have a mean of 69.0 inches and a standard deviation of 2.8 inches.

  1. What mean doorway height would allow 95% of men to enter the aircraft without bending?
  2. Assume that half of the 200 passengers are men. What mean doorway height satisfies the condition that there is a 0.95 probability that this height is greater than the mean height of 100 men?
  3. For engineers designing the 757, which result is more relevant: the height from part 1 or part 2? Why?

Solution

  1. We know that [latex]\mu_{X} = \mu = 69[/latex] and we have [latex]\sigma_{X} = 2.8[/latex]. The height of the doorway is found to be [latex]\text{invNorm}(0.95, 69, 2.8) = 73.61[/latex]
  2. For the mean height of 100 men, we know that [latex]\mu_{\overline{x}} = \mu = 69[/latex] and we have [latex]\sigma_{\overline{x}} = \frac{2.8}{\sqrt{100}} = 0.28[/latex]. So, [latex]\text{invNorm}(0.95, 69, 0.28) = 69.46[/latex].
  3. When designing the doorway heights, we need to incorporate as much variability as possible in order to accommodate as many passengers as possible. Therefore, we need to use the result based on part 1.

Using the TI-83, 83+, 84, 84+ Calculator

To find probabilities for sums on the calculator, follow these steps.

Access [DISTR] by pressing 2nd key, vars key. (Note: DISTR stands for "Distributions")

2: normalcdf()

[latex]\text{normalcdf} \left(\text{lower value of the area, upper value of the area, }(n)(\text{mean}), (\sqrt{n})(\text{standard deviation}) \right)[/latex]

where:

  • mean is the mean of the original distribution
  • standard deviation is the standard deviation of the original distribution
  • sample size = n

Example

An unknown distribution has a mean of 90 and a standard deviation of 15. A sample of size 80 is drawn randomly from the population.

  1. Find the probability that the sum of the 80 values (or the total of the 80 values) is more than 7,500.
  2. Find the sum that is 1.5 standard deviations above the mean of the sums.

Solution

Let [latex]X =[/latex] one value from the original unknown population. The probability question asks you to find a probability for the sum (or total of) 80 values.

[latex]\Sigma{X} = \text{the sum or total of 80 values}[/latex]. Since [latex]\mu_X = 90[/latex], [latex]\sigma_X = 15[/latex], and [latex]n = 80[/latex], [latex]\Sigma X \sim N \left((80)(90), (\sqrt{80})(15) \right)[/latex]

  • [latex]\text{mean of the sums} = (n)(\mu_X) = (80)(90) = 7,200[/latex]
  • [latex]\text{standard deviation of the sums} = (\sqrt{n})(\sigma _{X}) = (\sqrt{80})(15)[/latex]
  • [latex]\text{sum of 80 values} = \Sigma{X} = 7,500[/latex]
  1. Find [latex]P(\Sigma{x} > 7,500)[/latex]. [latex]P(\Sigma{x} > 7,500) = 0.0127[/latex]. [latex]\text{normalcdf(lower value, upper value, mean of sums, stdev of sums)}[/latex]. The parameter list is abbreviated [latex]\left(\text{lower, upper, }(n)(\mu_X), (\sqrt{n})(\sigma_X)\right)[/latex]. [latex]\text{normalcdf} \left(7500, 1\text{E}99, (80)(90), (\sqrt{80})(15) \right) = 0.0127[/latex].
  2. Find [latex]\Sigma{x}[/latex] where [latex]z = 1.5[/latex]. [latex]\Sigma{x} = (n)(\mu_x)+(z)(\sqrt{n})(\sigma_x) = (80)(90) + (1.5)(\sqrt{80})(15) = 7,401.2[/latex]
A normal distribution curve with the peak at 7200 and 7500 to the right. The area to the right of 7500 below the curve is shaded.
Figure 4. The shaded area represents the probability of the sum being greater than 7500
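
Both parts of this example can be verified in software. The sketch below is an illustration only, assuming Python 3.8 or later and the standard library's statistics.NormalDist; the variable names are not from the calculator. The key step is entering the mean and standard deviation of the sum, [latex](n)(\mu_X)[/latex] and [latex](\sqrt{n})(\sigma_X)[/latex], rather than those of a single value.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 90, 15, 80

# Distribution of the sum of n values: N(n * mu, sqrt(n) * sigma).
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)

# Part 1: probability that the sum exceeds 7,500 (upper-tail area).
p_over_7500 = 1 - total.cdf(7500)

# Part 2: the sum lying 1.5 standard deviations above the mean of the sums.
z = 1.5
sum_at_z = total.mean + z * total.stdev

print(round(p_over_7500, 4))  # about 0.0127
print(round(sum_at_z, 1))     # about 7401.2
```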

Your Turn!

An unknown distribution has a mean of 45 and a standard deviation of eight. A sample of size 50 is drawn randomly from the population. Find the probability that the sum of the 50 values is more than 2,400.

Solution

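[latex]P(\Sigma{x} > 2,400) = \text{normalcdf} \left(2400, 1\text{E}99, (50)(45), (\sqrt{50})(8) \right) = 0.0040[/latex]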

Using the TI-83, 83+, 84, 84+ Calculator

To find percentiles for sums on the calculator, follow these steps.

Access [DISTR] by pressing 2nd key, vars key. (Note: DISTR stands for "Distributions")

3: invNorm()

[latex]k = \text{invNorm} \left(\text{area to the left of }k, (n)(\text{mean}), (\sqrt{n})(\text{standard deviation}) \right)[/latex]

where:

  • k is the kth percentile
  • mean is the mean of the original distribution
  • standard deviation is the standard deviation of the original distribution
  • sample size = n

Example

In an October 29, 2012, study reported on the Flurry Blog, the mean age of tablet users is 34 years. Suppose the standard deviation is 15 years. The sample size is 50.

  1. What are the mean and standard deviation for the sum of the ages of tablet users? What is the distribution?
  2. Find the probability that the sum of the ages is between 1,500 and 1,800 years.
  3. Find the 80th percentile for the sum of the 50 ages.

Solution

  1. [latex]\mu_{\Sigma{x}} = n \mu_x = 50(34) = 1,700[/latex] and [latex]\sigma_{\Sigma{x}} = (\sqrt{n}) (\sigma_x) = (\sqrt{50})(15) = 106.07[/latex]
    The distribution is normal for sums by the central limit theorem.
  2. [latex]P(1500 \lt \Sigma{x} \lt 1800) = \text{normalcdf} \left(1500, 1800, (50)(34), (\sqrt{50})(15) \right) = 0.7974[/latex]
  3. Let [latex]k =[/latex] the 80th percentile.
    [latex]k = \text{invNorm} \left(0.80, (50)(34), (\sqrt{50})(15) \right) = 1,789.3[/latex]
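
A percentile for a sum works the same way as a percentile for a mean, with the sum's parameters substituted. The following sketch assumes Python 3.8 or later and the standard library's statistics.NormalDist; the helper name invnorm_sum is hypothetical and simply mirrors the invNorm argument pattern for sums.

```python
from math import sqrt
from statistics import NormalDist

def invnorm_sum(area_left, mean, sd, n):
    """Value k such that P(sum of n values < k) = area_left under the CLT."""
    # Distribution of the sum: N(n * mean, sqrt(n) * sd).
    total = NormalDist(mu=n * mean, sigma=sqrt(n) * sd)
    return total.inv_cdf(area_left)

# Tablet-user ages: mean 34, standard deviation 15, sample size 50.
total_ages = NormalDist(mu=50 * 34, sigma=sqrt(50) * 15)
p_between = total_ages.cdf(1800) - total_ages.cdf(1500)
k80 = invnorm_sum(0.80, 34, 15, 50)

print(round(p_between, 4))  # about 0.7974
print(round(k80, 1))        # about 1789.3
```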

Your Turn!

In an October 29, 2012, study reported on the Flurry Blog, the mean age of tablet users is 35 years. Suppose the standard deviation is ten years. The sample size is 39.

  1. What are the mean and standard deviation for the sum of the ages of tablet users? What is the distribution?
  2. Find the probability that the sum of the ages is between 1,400 and 1,500 years.
  3. Find the 90th percentile for the sum of the 39 ages.

Solution

  1. [latex]\mu_{\Sigma{x}} = n \mu_x  = 1,365[/latex] and [latex]\sigma_{\Sigma{x}} = (\sqrt{n}) (\sigma_x) = 62.4[/latex]
    The distribution is normal for sums by the central limit theorem.
  2. [latex]P(1400 \lt \Sigma{x} \lt 1500) = \text{normalcdf}(1400, 1500, (39)(35), (\sqrt{39})(10)) = 0.2723[/latex]
  3. Let k = the 90th percentile.
    [latex]k = \text{invNorm} \left(0.90, (39)(35), (\sqrt{39})(10) \right) = 1,445.0[/latex]

Example

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of size 70.

  1. What are the mean and standard deviation for the sums?
  2. Find the 95th percentile for the sum of the sample. Interpret this value in a complete sentence.
  3. Find the probability that the sum of the sample is at least ten hours.

Solution

  1. [latex]\mu_{\Sigma{x}} = n \mu_x  = 70(8.2) = 574 \text{ minutes}[/latex] and [latex]\sigma_{\Sigma{x}} = (\sqrt{n}) (\sigma_x) = (\sqrt{70})(1) = 8.37 \text{ minutes}[/latex]
  2. Let k = the 95th percentile.
    [latex]k = \text{invNorm}(0.95, (70)(8.2), (\sqrt{70})(1)) = 587.76 \text{ minutes}[/latex]
    Ninety-five percent of the sums of app engagement times for samples of 70 tablet users are at most 587.76 minutes.
  3. ten hours = 600 minutes
    [latex]P(\Sigma{x} \ge 600) = \text{normalcdf} \left(600, 1\text{E}99, (70)(8.2), (\sqrt{70})(1) \right) = 0.0009[/latex]

Your Turn!

The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of size 70.

  1. What is the probability that the sum of the sample is between seven hours and ten hours? What does this mean in context of the problem?
  2. Find the 84th and 16th percentiles for the sum of the sample. Interpret these values in context.

Solution

  1. 7 hours = 420 minutes and 10 hours = 600 minutes
    [latex]P\left(420\le \Sigma x\le 600\right) = \text{normalcdf} \left(420, 600, (70)(8.2), \sqrt{70}(1) \right)=0.9991[/latex]. This means that there is a 99.9% chance that the sum of usage minutes for a sample of 70 will be between 420 minutes and 600 minutes.
  2. [latex]\text{invNorm}\left(0.84,\left(70\right)\left(8.2\right),\sqrt{70}\left(1\right)\right)=582.32[/latex]. [latex]\text{invNorm}\left(0.16,\left(70\right)\left(8.2\right),\sqrt{70}\left(1\right)\right)=565.68[/latex]. Since 84% of the sums of app engagement times are at most 582.32 minutes and 16% are at most 565.68 minutes, we may state that 68% of the sums of app engagement times for samples of 70 are between 565.68 minutes and 582.32 minutes.

Example

A study involving stress is conducted among the students on a college campus. The stress scores follow a uniform distribution with the lowest stress score equal to one and the highest equal to five. Using a sample of 75 students, find:

  1. The probability that the mean stress score for the 75 students is less than two.
  2. The 90th percentile for the mean stress score for the 75 students.
  3. The probability that the total of the 75 stress scores is less than 200.
  4. The 90th percentile for the total stress score for the 75 students.

Solution

Let X = one stress score. Problems 1 and 2 ask you to find a probability or a percentile for a mean. Problems 3 and 4 ask you to find a probability or a percentile for a total or sum.

The sample size, n, is equal to 75. Since the individual stress scores follow a uniform distribution, [latex]X \sim U(1, 5)[/latex] where [latex]a = 1[/latex] and [latex]b = 5[/latex].

[latex]\mu_X = \frac{a+b}{2} = \frac{1 + 5}{2} = 3[/latex]

[latex]\sigma_X = \sqrt{\frac{{(b-a)}^{2}}{12}} = \sqrt{\frac{{(5-1)}^2}{12}} = 1.15[/latex]

For problems 1 and 2, let [latex]\overline{X} = \text{the mean stress score for the 75 students}[/latex]. Then, [latex]\overline{X} \sim N \left(3, \frac{1.15}{\sqrt{75}}\right)[/latex] where [latex]n = 75[/latex].

For problems 3 and 4, let [latex]\Sigma{X} = \text{the sum of the 75 stress scores}[/latex]. Then, [latex]\Sigma{X} \sim N \left[(75)(3), (\sqrt{75})(1.15) \right][/latex]

  1. Find [latex]P(\overline{x} \lt 2)[/latex]. [latex]P(\overline{x} \lt 2) = 0[/latex]. The probability that the mean stress score is less than two is about zero. [latex]\text{normalcdf}\left(1, 2, 3, \frac{1.15}{\sqrt{75}}\right) = 0[/latex]. Reminder: The smallest stress score is one.
  2. Let [latex]k =[/latex] the 90th percentile. Find [latex]k[/latex], where [latex]P(\overline{x} \lt k) = 0.90[/latex]. [latex]k = 3.2[/latex] using [latex]\text{invNorm}\left(0.90, 3, \frac{1.15}{\sqrt{75}}\right) = 3.2[/latex]. The 90th percentile for the mean of 75 scores is about 3.2. This tells us that 90% of all the means of 75 stress scores are at most 3.2, and that 10% are at least 3.2.
  3. Find [latex]P(\Sigma{x} \lt 200)[/latex]. Draw the graph. The mean of the sum of 75 stress scores is [latex](75)(3) = 225[/latex]. The standard deviation of the sum of 75 stress scores is [latex](\sqrt{75})(1.15) = 9.96[/latex]. [latex]\text{normalcdf}\left(75, 200, (75)(3), (\sqrt{75})(1.15)\right) = 0.006[/latex]. The probability that the total of 75 scores is less than 200 is about 0.006, which is close to zero. Reminder: The smallest total of 75 stress scores is 75, because the smallest single score is one.
  4. Find [latex]k[/latex] where [latex]P(\Sigma{x} \lt k) = 0.90[/latex]. [latex]k = 237.8[/latex] using [latex]\text{invNorm}(0.90, (75)(3), (\sqrt{75})(1.15)) = 237.8[/latex]. The 90th percentile for the sum of 75 scores is about 237.8. This tells us that 90% of all the sums of 75 scores are no more than 237.8 and 10% are no less than 237.8.
A normal distribution curve with the peak at 3. A point, 2, is marked at the left edge of the curve.
Figure 5. Curve for example to find P(x-bar < 2) = 0
A normal distribution curve with the peak at 3 and k labeled to the right of 3. The area under the curve to the left of k is shaded.
Figure 6. The shaded area shows that P(x-bar < k) = 0.90
A normal distribution curve with the peak at 225. A point, 200, is marked at the left edge of the curve.
Figure 7. Curve to represent P(Sum of x < 200) = 0

 

A normal distribution curve with a peak at 225 and k to the right of 225. The area under the curve to the left of k is shaded.
Figure 8. The shaded area shows that P(sum of x < k) = 0.90
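
All four answers in the stress-score example follow from the uniform distribution's mean and standard deviation. The sketch below is an illustration under stated assumptions: Python 3.8 or later, the standard library's statistics.NormalDist, and illustrative variable names. It keeps the unrounded standard deviation rather than 1.15, and it uses left-tail areas directly instead of the lower bounds 1 and 75, so the last digits can differ slightly from the calculator values.

```python
from math import sqrt
from statistics import NormalDist

a, b, n = 1, 5, 75

# Mean and standard deviation of a uniform distribution on [a, b].
mu = (a + b) / 2
sigma = sqrt((b - a) ** 2 / 12)

# CLT approximations for the sample mean and the sample sum.
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)

p_mean_below_2 = xbar.cdf(2)       # problem 1: essentially zero
k90_mean = xbar.inv_cdf(0.90)      # problem 2: about 3.2
p_sum_below_200 = total.cdf(200)   # problem 3: about 0.006
k90_sum = total.inv_cdf(0.90)      # problem 4: about 237.8

print(p_mean_below_2, round(k90_mean, 1))
print(round(p_sum_below_200, 3), round(k90_sum, 1))
```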

Example

In the United States, someone is sexually assaulted every two minutes, on average, according to a number of studies. Suppose the standard deviation is 0.5 minutes and the sample size is 100.

  1. Find the median, the first quartile, and the third quartile for the sample mean time of sexual assaults in the United States.
  2. Find the median, the first quartile, and the third quartile for the sum of sample times of sexual assaults in the United States.
  3. Find the probability that a sexual assault occurs on the average between 1.75 and 1.85 minutes.
  4. Find the value that is two standard deviations above the sample mean.
  5. Find the IQR for the sum of the sample times.

Solution

  1. We have [latex]\mu_{\overline{x}} = \mu = 2[/latex] and [latex]\sigma_{\overline{x}} = \frac{\sigma }{\sqrt{n}} = \frac{0.5}{10} = 0.05[/latex]. Therefore:
    • [latex]50^{\text{th}} \text{ percentile} = \mu_{\overline{x}} = \mu = 2[/latex]
    • [latex]25^{\text{th}} \text{ percentile} = \text{invNorm}(0.25, 2, 0.05) = 1.97[/latex]
    • [latex]75^{\text{th}} \text{ percentile} = \text{invNorm}(0.75, 2, 0.05) = 2.03[/latex]
  2. We have [latex]\mu_{\Sigma{X}} = n(\mu_x) = 100(2) = 200[/latex] and [latex]\sigma_{\Sigma{X}} = (\sqrt{n})(\sigma_x) = (10)(0.5) = 5[/latex]. Therefore:
    • [latex]50^{\text{th}} \text{ percentile} = \mu_{\Sigma{X}} = n(\mu_x) = 100(2) = 200[/latex]
    • [latex]25^{\text{th}} \text{ percentile} = \text{invNorm}(0.25, 200, 5) = 196.63[/latex]
    • [latex]75^{\text{th}} \text{ percentile} = \text{invNorm}(0.75, 200, 5) = 203.37[/latex]
  3. [latex]P( 1.75 \lt \overline{x} \lt 1.85) = \text{normalcdf}(1.75, 1.85, 2, 0.05) = 0.0013[/latex]
  4. Using the z-score equation, [latex]z = \frac{\overline{x} - {\mu }_{\overline{x}}}{{\sigma }_{\overline{x}}}[/latex], and solving for [latex]\overline{x}[/latex] with [latex]z = 2[/latex], we have [latex]\overline{x} = 2(0.05) + 2 = 2.1[/latex].
  5. The IQR is [latex]75^{\text{th}} \text{ percentile} - 25^{\text{th}} \text{ percentile} = 203.37 - 196.63 = 6.74[/latex]
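
The quartiles and IQR above can be reproduced with inverse-normal calls. This is a minimal sketch, assuming Python 3.8 or later and the standard library's statistics.NormalDist; the helper name quartiles is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def quartiles(dist):
    """Return (Q1, median, Q3) of a normal distribution via its inverse CDF."""
    return dist.inv_cdf(0.25), dist.inv_cdf(0.50), dist.inv_cdf(0.75)

mu, sigma, n = 2, 0.5, 100

# Sampling distributions of the mean and of the sum for samples of size n.
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)

q1_mean, med_mean, q3_mean = quartiles(xbar)
q1_sum, med_sum, q3_sum = quartiles(total)
iqr_sum = q3_sum - q1_sum

print(round(q1_mean, 2), round(med_mean, 2), round(q3_mean, 2))  # about 1.97, 2.0, 2.03
print(round(q1_sum, 2), round(med_sum, 2), round(q3_sum, 2))     # about 196.63, 200.0, 203.37
print(round(iqr_sum, 2))                                         # about 6.74
```

Because a normal distribution is symmetric, the median equals the mean and the two quartiles sit the same distance on either side of it.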

Your Turn!

A study was done about violence against prostitutes and the symptoms of the posttraumatic stress that they developed. The age range of the prostitutes was 14 to 61. The mean age was 30.9 years with a standard deviation of nine years.

  1. In a sample of 25 prostitutes, what is the probability that the mean age of the prostitutes is less than 35?
  2. Is it likely that the mean age of the sample group could be more than 50 years? Interpret the results.
  3. In a sample of 49 prostitutes, what is the probability that the sum of the ages is no less than 1,600?
  4. Is it likely that the sum of the ages of the 49 prostitutes is at most 1,595? Interpret the results.
  5. Find the 95th percentile for the sample mean age of 65 prostitutes. Interpret the results.
  6. Find the 90th percentile for the sum of the ages of 65 prostitutes. Interpret the results.

Solution

  1. [latex]P(\overline{x} \lt 35) = \text{normalcdf}(-1 \text{E}99, 35, 30.9, 1.8) = 0.9886[/latex]
  2. [latex]P(\overline{x} > 50) = \text{normalcdf}(50, 1\text{E}99, 30.9, 1.8) \approx 0[/latex]. For this sample group, it is almost impossible for the group’s average age to be more than 50. However, it is still possible for an individual in this group to have an age greater than 50.
  3. [latex]P(\Sigma{x} \ge 1,600) = \text{normalcdf}(1600, 1 \text{E}99, 1514.10, 63) = 0.0864[/latex]
  4. [latex]P(\Sigma{x} \le 1,595) = \text{normalcdf}(-1 \text{E}99, 1595, 1514.10, 63) = 0.9005[/latex]. This means that there is about a 90% chance that the sum of the ages for a sample of [latex]n = 49[/latex] is at most 1,595.
  5. The [latex]95^{\text{th}} \text{ percentile} = \text{invNorm}(0.95, 30.9, 1.1) = 32.7[/latex]. This indicates that 95% of all samples of 65 prostitutes have a mean age below 32.7 years.
  6. The [latex]90^{\text{th}} \text{ percentile} = \text{invNorm}(0.90, 2008.5, 72.56) = 2101.5[/latex]. This indicates that 90% of all samples of 65 prostitutes have a sum of ages less than 2,101.5 years.
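
The six calculator entries above can also be reproduced in software. The following is a minimal Python sketch using the standard library's `statistics.NormalDist`. The helper names `normalcdf` and `inv_norm` are hypothetical wrappers written here to mirror the calculator's argument order; they are not built-in Python functions.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between lower and upper (calculator argument order)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(area, mu, sigma):
    """Value with the given area to its left under N(mu, sigma)."""
    return NormalDist(mu, sigma).inv_cdf(area)

mu, sigma = 30.9, 9  # population mean and standard deviation of age

# Parts 1 and 2: sample mean with n = 25, standard error 9 / sqrt(25) = 1.8
p1 = normalcdf(-1e99, 35, mu, sigma / sqrt(25))
p2 = normalcdf(50, 1e99, mu, sigma / sqrt(25))

# Parts 3 and 4: sum with n = 49, mean 49 * 30.9 and standard deviation sqrt(49) * 9
p3 = normalcdf(1600, 1e99, 49 * mu, sqrt(49) * sigma)
p4 = normalcdf(-1e99, 1595, 49 * mu, sqrt(49) * sigma)

# Parts 5 and 6: percentiles with n = 65
pct95_mean = inv_norm(0.95, mu, sigma / sqrt(65))
pct90_sum = inv_norm(0.90, 65 * mu, sqrt(65) * sigma)

print(round(p1, 4), p2, round(p3, 4), round(p4, 4))
print(round(pct95_mean, 1), round(pct90_sum, 1))
```

Note that the sketch keeps the unrounded standard error [latex]9/\sqrt{65}[/latex] in part 5, so digits beyond the first decimal place can differ slightly from a calculation that uses the rounded value 1.1.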

 

Section Practice

Suppose that a category of world-class runners is known to run a marathon (26 miles) in an average of 145 minutes with a standard deviation of 14 minutes. Consider 49 of the races. Let [latex]\overline{X} =[/latex] the average of the 49 races.

  1. [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
  2. Find the probability that the runner will average between 142 and 146 minutes in these 49 marathons.
  3. Find the 80th percentile for the average of these 49 marathons.
  4. Find the median of the average running times.
Solution
  1. [latex]N(145, 2)[/latex]
  2. 0.6247
  3. 146.68
  4. 145 minutes
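
As a check on these answers, here is a short Python sketch that assumes only the standard library's `statistics.NormalDist`. It builds the distribution of the sample mean from the standard error [latex]\frac{14}{\sqrt{49}} = 2[/latex] and then evaluates the probability, the percentile, and the median.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 145, 14, 49

# Distribution of the sample mean: N(mu, sigma / sqrt(n)) = N(145, 2)
xbar = NormalDist(mu, sigma / sqrt(n))

# Probability that the average time is between 142 and 146 minutes
p_between = xbar.cdf(146) - xbar.cdf(142)

# 80th percentile and median (50th percentile) of the average time
pct80 = xbar.inv_cdf(0.80)
median = xbar.inv_cdf(0.50)

print(xbar.stdev, round(p_between, 4), round(pct80, 2), median)
```

Because a normal distribution is symmetric, its median equals its mean, which is why part 4 needs no calculation.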

The length of songs in a collector’s iTunes album collection is uniformly distributed from two to 3.5 minutes. Suppose we randomly pick five albums from the collection. There are a total of 43 songs on the five albums.

  1. In words, [latex]X = \underline{\hspace{2cm}}[/latex]
  2. [latex]X \sim \underline{\hspace{2cm}}[/latex]
  3. In words, [latex]\overline{X} = \underline{\hspace{2cm}}[/latex]
  4. [latex]\overline{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
  5. Find the first quartile for the average song length.
  6. The IQR (interquartile range) for the average song length is [latex]\underline{\hspace{2cm}}[/latex].
Solution
  1. the length of a song, in minutes, in the collection
  2. [latex]U(2, 3.5)[/latex]
  3. the average length, in minutes, of the songs from a sample of five albums from the collection
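  4. [latex]N(2.75, 0.0660)[/latex], since the standard deviation of [latex]U(2, 3.5)[/latex] is [latex]\frac{3.5 - 2}{\sqrt{12}} = 0.4330[/latex] and the standard error is [latex]\frac{0.4330}{\sqrt{43}} = 0.0660[/latex]
  5. 2.71 minutes
  6. 0.09 minutes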
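
The individual song lengths here are uniform rather than normal, so the standard deviation has to be computed before the Central Limit Theorem can be applied. The sketch below, again assuming Python's standard `statistics.NormalDist`, uses the uniform-distribution formulas [latex]\mu = \frac{a + b}{2}[/latex] and [latex]\sigma = \frac{b - a}{\sqrt{12}}[/latex], then finds the quartiles of the sample mean for [latex]n = 43[/latex] songs.

```python
from math import sqrt
from statistics import NormalDist

a, b, n = 2.0, 3.5, 43  # uniform endpoints (minutes) and number of songs

# Mean and standard deviation of a single song length, X ~ U(a, b)
mu = (a + b) / 2
sigma = (b - a) / sqrt(12)

# Standard error of the mean for n songs
se = sigma / sqrt(n)

# The CLT approximates the average song length as N(mu, se)
xbar = NormalDist(mu, se)
q1 = xbar.inv_cdf(0.25)
q3 = xbar.inv_cdf(0.75)
iqr = q3 - q1

print(round(se, 4), round(q1, 2), round(q3, 2), round(iqr, 2))
```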

The distribution of results from a cholesterol test has a mean of 180 and a standard deviation of 20. A sample size of 40 is drawn randomly.

  1. Find the probability that the sum of the 40 values is greater than 7,500.
  2. Find the probability that the sum of the 40 values is less than 7,000.
  3. Find the sum that is one standard deviation above the mean of the sums.
  4. Find the sum that is 1.5 standard deviations below the mean of the sums.
  5. Find the percentage of sums between 1.5 standard deviations below the mean of the sums and one standard deviation above the mean of the sums.
Solution
  1. 0.0089
  2. 0.0569
  3. 7,326.49
  4. 7,010.26
  5. 77.45%
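
These answers all come from the distribution of the sum, [latex]\Sigma{X} \sim N \left((40)(180), (\sqrt{40})(20) \right)[/latex]. As an illustration, the following Python sketch (standard library only) sets up that distribution once and reuses it for each part.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 180, 20, 40

# Distribution of the sum: mean n * mu = 7200, standard deviation sqrt(n) * sigma
sums = NormalDist(n * mu, sqrt(n) * sigma)

# Parts 1 and 2: tail probabilities for the sum
p_gt_7500 = 1 - sums.cdf(7500)
p_lt_7000 = sums.cdf(7000)

# Parts 3 and 4: sums located a given number of standard deviations from the mean
one_sd_above = sums.mean + 1.0 * sums.stdev
one_half_sd_below = sums.mean - 1.5 * sums.stdev

# Part 5: proportion of sums between those two values
p_between = sums.cdf(one_sd_above) - sums.cdf(one_half_sd_below)

print(round(p_gt_7500, 4), round(p_lt_7000, 4))
print(round(one_sd_above, 2), round(one_half_sd_below, 2), round(p_between, 4))
```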

Salaries for teachers in a particular elementary school district are normally distributed with a mean of $44,000 and a standard deviation of $6,500. We randomly survey ten teachers from that district.

  1. In words, [latex]X = \underline{\hspace{2cm}}[/latex]
  2. [latex]X \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
  3. In words, [latex]\Sigma{X} = \underline{\hspace{2cm}}[/latex]
  4. [latex]\Sigma{X} \sim \underline{\hspace{2cm}}(\underline{\hspace{2cm}}, \underline{\hspace{2cm}})[/latex]
  5. Find the probability that the teachers earn a total of over $400,000.
  6. Find the 90th percentile for an individual teacher's salary.
  7. Find the 90th percentile for the sum of ten teachers' salaries.
  8. If we surveyed 70 teachers instead of ten, graphically, how would that change the distribution in part 4?
  9. If each of the 70 teachers received a $3,000 raise, graphically, how would that change the distribution in part 2?
Solution
  1. [latex]X =[/latex] the salary of one elementary school teacher in the district
  2. [latex]X \sim N(44,000, 6,500)[/latex]
  3. [latex]\Sigma{X} =[/latex] the sum of the salaries of ten elementary school teachers in the sample
  4. [latex]\Sigma{X} \sim N(440000, 20554.80)[/latex]
  5. 0.9742
  6. $52,330.09
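  7. $466,342.04
  8. Sampling 70 teachers instead of ten would shift the distribution of the sum to the right, centering it at [latex](70)(44,000) = 3,080,000[/latex], and would make it more spread out, since the standard deviation of the sum grows to [latex](\sqrt{70})(6,500) \approx 54,382.90[/latex]. The curve would still be normal.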
  9. If every teacher received a $3,000 raise, the distribution of X would shift to the right by $3,000. In other words, it would have a mean of $47,000.
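
The sketch below is one way to reproduce these results in Python with the standard library's `statistics.NormalDist`. It keeps the individual salary distribution separate from the distribution of the sum, since parts 5 and 7 use the sum while part 6 uses a single salary. Adding a constant to a `NormalDist` object shifts its mean and leaves its standard deviation unchanged, which models the raise in part 9.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 44_000, 6_500

# Part 2: salary of one teacher
individual = NormalDist(mu, sigma)

# Part 4: sum of ten salaries, N(10 * mu, sqrt(10) * sigma)
sum10 = NormalDist(10 * mu, sqrt(10) * sigma)

# Part 5: probability that ten teachers earn more than $400,000 in total
p_over_400k = 1 - sum10.cdf(400_000)

# Parts 6 and 7: 90th percentiles for one salary and for the sum of ten
pct90_individual = individual.inv_cdf(0.90)
pct90_sum = sum10.inv_cdf(0.90)

# Part 8: with 70 teachers the sum has a larger mean and a larger spread
sum70 = NormalDist(70 * mu, sqrt(70) * sigma)

# Part 9: a $3,000 raise shifts the individual distribution to the right
raised = individual + 3_000

print(round(sum10.stdev, 2), round(p_over_400k, 4))
print(round(pct90_individual, 2), round(pct90_sum, 2))
print(sum70.mean, round(sum70.stdev, 2), raised.mean, raised.stdev)
```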

License

Introductory Statistics Copyright © 2024 by LOUIS: The Louisiana Library Network is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.