
Chapter 9: Hypothesis Testing with Two Samples

9.3 Comparing Two Independent Population Proportions

Learning Objectives

By the end of this section, the student should be able to

  • Conduct and interpret hypothesis tests for two population proportions

Hypothesis tests for two population proportions

When conducting a hypothesis test that compares two independent population proportions, the following characteristics should be present:

  1. The two samples are simple random samples that are independent of each other.
  2. The number of successes is at least five, and the number of failures is at least five, for each of the samples.
  3. A growing body of literature states that each population must be at least 10 or 20 times the size of its sample. This keeps each population from being over-sampled, which can cause incorrect results.
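
The checklist above can be sketched in code. This is only an illustration: the function name `check_conditions`, its parameters, and the default ratio of 10 are assumptions made here, and randomness and independence of the samples cannot be verified from counts alone.

```python
def check_conditions(x, n, population_size=None, min_count=5, ratio=10):
    """Rule-of-thumb checks for one sample (hypothetical helper).

    x is the number of successes and n is the sample size.
    population_size is optional; when it is given, the population
    should be at least `ratio` times the size of the sample.
    """
    checks = {
        "successes_ok": x >= min_count,
        "failures_ok": (n - x) >= min_count,
    }
    if population_size is not None:
        checks["population_ok"] = population_size >= ratio * n
    return checks


# Illustrative counts: 20 successes in a sample of 200.
print(check_conditions(20, 200))

# A sample of 50 with only 3 successes, drawn from a population of 400.
print(check_conditions(3, 50, population_size=400))
```

Run the check once per sample. If any entry is False, the normal approximation used in this section may not be appropriate.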

Comparing two proportions, like comparing two means, is common. If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance. A hypothesis test can help determine if a difference in the estimated proportions reflects a difference in the population proportions.

The difference of two proportions follows an approximate normal distribution. Generally, the null hypothesis states that the two proportions are the same. That is, H0: pA = pB. To conduct the test, we use a pooled proportion, pc.

The pooled proportion is calculated as follows:

[latex]{p}_{c}=\frac{{x}_{A}+{x}_{B}}{{n}_{A}+{n}_{B}}[/latex]

The distribution for the differences is:

[latex]{{P}^{\prime }}_{A}-{{P}^{\prime }}_{B}\sim N\left[0,\sqrt{{p}_{c}\left(1-{p}_{c}\right)\left(\frac{1}{{n}_{A}}+\frac{1}{{n}_{B}}\right)}\right][/latex]

The test statistic (z-score) is:

[latex]z=\frac{\left({{p}^{\prime }}_{A}-{{p}^{\prime }}_{B}\right)-\left({p}_{A}-{p}_{B}\right)}{\sqrt{{p}_{c}\left(1-{p}_{c}\right)\left(\frac{1}{{n}_{A}}+\frac{1}{{n}_{B}}\right)}}[/latex]
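
The three formulas above translate almost line for line into code. The following is a minimal sketch using only the Python standard library; the function name `two_prop_z_test` and its `tail` argument are assumptions made for illustration, and the counts in the usage lines are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x_a, n_a, x_b, n_b, tail="two"):
    """Pooled two-proportion z-test of H0: pA = pB (illustrative sketch).

    tail is "two" (Ha: pA != pB), "left" (Ha: pA < pB),
    or "right" (Ha: pA > pB).
    Returns (pooled proportion, z statistic, p-value).
    """
    p_a, p_b = x_a / n_a, x_b / n_b            # estimated proportions
    p_c = (x_a + x_b) / (n_a + n_b)            # pooled proportion
    se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se                       # pA - pB = 0 under H0
    cdf = NormalDist().cdf                     # standard normal CDF
    if tail == "left":
        p_value = cdf(z)
    elif tail == "right":
        p_value = 1 - cdf(z)
    else:
        p_value = 2 * (1 - cdf(abs(z)))
    return p_c, z, p_value


# Made-up counts: 30 of 100 successes in sample A, 20 of 100 in sample B.
p_c, z, p_value = two_prop_z_test(30, 100, 20, 100)
print(f"pooled = {p_c:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

The standard error uses the pooled proportion because the null hypothesis assumes both populations share one proportion.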

Example

Two types of medication for hives are being tested to determine if there is a difference in the proportions of adult patient reactions. Twenty out of a random sample of 200 adults given medication A still had hives 30 minutes after taking the medication. Twelve out of another random sample of 200 adults given medication B still had hives 30 minutes after taking the medication. Test at a 1% level of significance.

Solution

The problem asks for a difference in proportions, making it a test of two proportions.

Let A and B be the subscripts for medication A and medication B, respectively. Then pA and pB are the desired population proportions.

Random Variable: P′A – P′B = difference in the proportions of adult patients who did not react after 30 minutes to medication A and to medication B.

H0: pA = pB

pA – pB = 0

Ha: pA ≠ pB

pA – pB ≠ 0

The words “is a difference” tell you the test is two-tailed.

Distribution for the test: Since this is a test of two binomial population proportions, the distribution is normal:

[latex]{p}_{c}=\frac{{x}_{A}+{x}_{B}}{{n}_{A}+{n}_{B}}=\frac{20+12}{200+200}=0.08[/latex]

[latex]1-{p}_{c}=0.92[/latex]

[latex]{{P}^{\prime }}_{A}-{{P}^{\prime }}_{B}\sim N\left[0,\sqrt{\left(0.08\right)\left(0.92\right)\left(\frac{1}{200}+\frac{1}{200}\right)}\right][/latex]

P′A – P′B follows an approximate normal distribution.

Calculate the p-value using the normal distribution: p-value = 0.1404.

Estimated proportion for group A: [latex]{{p}^{\prime }}_{A}=\frac{{x}_{A}}{{n}_{A}}=\frac{20}{200}=0.1[/latex]

Estimated proportion for group B: [latex]{{p}^{\prime }}_{B}=\frac{{x}_{B}}{{n}_{B}}=\frac{12}{200}=0.06[/latex]

Graph:

Normal distribution curve of the difference in the percentages of adult patients who don't react to medication A and B after 30 minutes. The mean is equal to zero, and the values -0.04, 0, and 0.04 are labeled on the horizontal axis. Two vertical lines extend from -0.04 and 0.04 to the curve. The region to the left of -0.04 and the region to the right of 0.04 are each shaded to represent 1/2(p-value) = 0.0702.

P′A – P′B = 0.1 – 0.06 = 0.04.

Half the p-value is below –0.04, and half is above 0.04.

Compare α and the p-value: α = 0.01 and the p-value = 0.1404. α < p-value.

Make a decision: Since α < p-value, do not reject H0.

Conclusion: At a 1% level of significance, from the sample data, there is not sufficient evidence to conclude that there is a difference in the proportions of adult patients who did not react after 30 minutes to medication A and medication B.

Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 20 for x1, 200 for n1, 12 for x2, and 200 for n2. Arrow down to p1: and arrow to not equal p2. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is p = 0.1404 and the test statistic is z = 1.47. Do the procedure again, but instead of Calculate do Draw.
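
For readers without a graphing calculator, the same numbers can be reproduced in a few lines of Python. This is a sketch that follows the steps of the solution; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

x_a, n_a = 20, 200   # medication A: patients who still had hives
x_b, n_b = 12, 200   # medication B: patients who still had hives
alpha = 0.01         # 1% level of significance

p_a, p_b = x_a / n_a, x_b / n_b
p_c = (x_a + x_b) / (n_a + n_b)                    # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n_a + 1 / n_b))   # standard error under H0
z = (p_a - p_b) / se

# Two-tailed test: double the area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"pooled = {p_c:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The printed values should agree, up to rounding, with the pooled proportion, test statistic, and p-value reported in the solution.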

Your Turn!

Two types of valves are being tested to determine if there is a difference in pressure tolerances. Fifteen out of a random sample of 100 of Valve A cracked under 4,500 psi. Six out of a random sample of 100 of Valve B cracked under 4,500 psi. Test at a 5% level of significance.

Solution

The p-value is 0.0379, so we can reject the null hypothesis. At the 5% significance level, the data support that there is a difference in the pressure tolerances between the two valves.
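
The arithmetic behind that p-value can be sketched as follows. The subscripts A and B for the two valves are a labeling chosen here, and intermediate values are rounded.

```latex
\begin{align*}
p'_A &= \frac{15}{100} = 0.15, \qquad p'_B = \frac{6}{100} = 0.06 \\
p_c &= \frac{15 + 6}{100 + 100} = 0.105 \\
z &= \frac{0.15 - 0.06}{\sqrt{(0.105)(0.895)\left(\frac{1}{100} + \frac{1}{100}\right)}}
   \approx \frac{0.09}{0.04335} \approx 2.076 \\
\text{p-value} &= 2\,P(Z > 2.076) \approx 0.0379
\end{align*}
```

Because the test is two-tailed, the upper-tail area is doubled. Since 0.0379 is less than α = 0.05, the null hypothesis is rejected.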

Example

A research study was conducted about gender differences in “sexting.” The researcher believed that the proportion of girls involved in “sexting” is less than the proportion of boys involved. The data, collected in the spring of 2010 from a random sample of middle and high school students in a large school district in the southern United States, are summarized in the table below. Is the proportion of girls sending “sexts” less than the proportion of boys sending “sexts”? Test at a 1% level of significance.

|  | Males | Females |
| --- | --- | --- |
| Sent “sexts” | 183 | 156 |
| Total number surveyed | 2231 | 2169 |

Solution

This is a test of two population proportions. Let M and F be the subscripts for males and females. Then pM and pF are the desired population proportions.

Random variable: p′F – p′M = difference in the proportions of females and males who sent “sexts.”

H0: pF = pM

pF – pM = 0

Ha: pF < pM

pF – pM < 0

The words “less than” tell you the test is left-tailed.

Distribution for the test: Since this is a test of two population proportions, the distribution is normal:

[latex]{p}_{c}=\frac{{x}_{F}+{x}_{M}}{{n}_{F}+{n}_{M}}=\frac{156+183}{2169+2231}=0.077[/latex]

[latex]1-{p}_{c}=0.923[/latex]

Therefore,

[latex]{{p}^{\prime }}_{F}-{{p}^{\prime }}_{M}\sim N\left(0,\sqrt{\left(0.077\right)\left(0.923\right)\left(\frac{1}{2169}+\frac{1}{2231}\right)}\right)[/latex]

p′F – p′M follows an approximate normal distribution.

Calculate the p-value using the normal distribution:

p-value = 0.1045

Estimated proportion for females: 0.0719

Estimated proportion for males: 0.082

Graph:

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.1045.

Decision: Since α < p-value, do not reject H0.

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that the proportion of girls sending “sexts” is less than the proportion of boys sending “sexts.”

Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 156 for x1, 2169 for n1, 183 for x2, and 2231 for n2. Arrow down to p1: and arrow to less than p2. Press ENTER. Arrow down to Calculate and press ENTER. The p-value is P = 0.1045 and the test statistic is z = -1.256.
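
As a cross-check on the calculator output, here is a short Python sketch of the same left-tailed test. The variable names are illustrative; the counts come from the table in this example.

```python
from math import sqrt
from statistics import NormalDist

x_f, n_f = 156, 2169   # females: sent "sexts", total surveyed
x_m, n_m = 183, 2231   # males: sent "sexts", total surveyed
alpha = 0.01           # 1% level of significance

p_f, p_m = x_f / n_f, x_m / n_m
p_c = (x_f + x_m) / (n_f + n_m)                    # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n_f + 1 / n_m))   # standard error under H0
z = (p_f - p_m) / se

# Left-tailed test (Ha: pF < pM): area to the left of z.
p_value = NormalDist().cdf(z)

print(f"p'F = {p_f:.4f}, p'M = {p_m:.4f}")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

For a left-tailed test the p-value is the area to the left of z, so no doubling is needed.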

Example

Researchers conducted a study of smartphone use among adults. A cell phone company claimed that iPhone smartphones are more popular with whites (non-Hispanic) than with African Americans. The results of the survey indicate that of the 232 African American cell phone owners randomly sampled, 5% have an iPhone. Of the 1,343 white cell phone owners randomly sampled, 10% own an iPhone. Test at the 5% level of significance. Is the proportion of white iPhone owners greater than the proportion of African American iPhone owners?

Solution

This is a test of two population proportions. Let W and A be the subscripts for the whites and African Americans. Then pW and pA are the desired population proportions.

Random variable: p′W – p′A = difference in the proportions of white and African American cell phone owners who have an iPhone.

H0: pW = pA

pW – pA = 0

Ha: pW > pA

pW – pA > 0

The words “more popular” indicate that the test is right-tailed.

Distribution for the test: The distribution is approximately normal:

[latex]{p}_{c}=\frac{{x}_{W}+{x}_{A}}{{n}_{W}+{n}_{A}}=\frac{134+12}{1343+232}=0.0927[/latex]

[latex]1-{p}_{c}=0.9073[/latex]

Therefore,

[latex]{{p}^{\prime }}_{W}-{{p}^{\prime }}_{A}\sim N\left(0,\sqrt{\left(0.0927\right)\left(0.9073\right)\left(\frac{1}{1343}+\frac{1}{232}\right)}\right)[/latex]

[latex]{{p}^{\prime }}_{W}-{{p}^{\prime }}_{A}[/latex] follows an approximate normal distribution.

Calculate the p-value using the normal distribution:

p-value = 0.0077

Estimated proportion for whites: p′W = 0.10

Estimated proportion for African Americans: p′A = 0.05

Graph:

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.0077.

Decision: Since α > p-value, reject H0.

Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that a larger proportion of white cell phone owners use iPhones than African Americans.

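TI-83+ and TI-84: Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 134 for x1, 1343 for n1, 12 for x2, and 232 for n2. Arrow down to p1: and arrow to greater than p2. Press ENTER. Arrow down to Calculate and press ENTER. The test statistic is Z = 2.33 and the p-value is P = 0.0099. This differs slightly from the 0.0077 found above because the calculator works from the whole-number counts rather than the rounded proportions 0.10 and 0.05; the decision is the same either way.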
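
The survey reports percentages rather than counts, so the result depends a little on how the inputs are rounded. The sketch below compares two reasonable choices. It assumes whole-number counts of 134 and 12 (10% of 1,343 and 5% of 232, rounded), the same counts used in the pooled proportion in the solution; the variable and function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_w, n_a = 1343, 232   # whites, African Americans sampled
x_w, x_a = 134, 12     # assumed counts: 10% and 5% of the samples, rounded
alpha = 0.05           # 5% level of significance

p_c = (x_w + x_a) / (n_w + n_a)                    # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n_w + 1 / n_a))   # standard error under H0


def upper_tail(z):
    """Area to the right of z under the standard normal curve."""
    return 1 - NormalDist().cdf(z)


# (1) Use the reported, rounded proportions 0.10 and 0.05.
z_rounded = (0.10 - 0.05) / se
p_rounded = upper_tail(z_rounded)

# (2) Recompute the proportions from the whole-number counts.
z_counts = (x_w / n_w - x_a / n_a) / se
p_counts = upper_tail(z_counts)

print(f"rounded proportions: z = {z_rounded:.2f}, p-value = {p_rounded:.4f}")
print(f"whole-number counts: z = {z_counts:.2f}, p-value = {p_counts:.4f}")
```

Both p-values fall below α = 0.05, so the decision to reject H0 does not depend on the rounding choice.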

Your Turn!

A concerned group of citizens wanted to know if the proportion of forcible rapes in Texas was different in 2011 than in 2010. Their research showed that of the 113,231 violent crimes in Texas in 2010, 7,622 of them were forcible rapes. In 2011, 7,439 of the 104,873 violent crimes were in the forcible rape category. Test at a 5% significance level. Answer the following questions:

a. Is this a test of two means or two proportions?

Solution

a. two proportions

b. Which distribution do you use to perform the test?

Solution

b. normal for two proportions

c. What is the random variable?

Solution

c. Subscripts: 1 = 2010, 2 = 2011

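P′1 – P′2 = difference in the proportions of violent crimes that were forcible rapes in 2010 and in 2011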

d. What are the null and alternative hypotheses? Write the null and alternative hypothesis in symbols.

Solution

d. Subscripts: 1 = 2010, 2 = 2011

H0: p1 = p2

p1 – p2 = 0

Ha: p1 ≠ p2

p1 – p2 ≠ 0

e. Is this test right-, left-, or two-tailed?

Solution

e. two-tailed

f. What is the p-value?

Solution

f. p-value = 0.00086

This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0004.

g. Do you reject or not reject the null hypothesis?

Solution

g. Reject the H0.

h. At the ___ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ____________.

Solution

h. At the 5% significance level, from the sample data, there is sufficient evidence to conclude that there is a difference between the proportion of forcible rapes in 2011 and 2010.
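
The answers to parts f and g can be checked with a short Python sketch. The variable names are illustrative; subscript 1 stands for 2010 and subscript 2 for 2011, as in the solution.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 7622, 113231   # 2010: forcible rapes, violent crimes
x2, n2 = 7439, 104873   # 2011: forcible rapes, violent crimes
alpha = 0.05            # 5% level of significance

p1, p2 = x1 / n1, x2 / n2
p_c = (x1 + x2) / (n1 + n2)                      # pooled proportion
se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))   # standard error under H0
z = (p1 - p2) / se

# Two-tailed test: twice the area beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"p'1 = {p1:.4f}, p'2 = {p2:.4f}, z = {z:.2f}")
print(f"p-value = {p_value:.5f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

With samples this large, a difference of less than half a percentage point between the two proportions is enough to give a very small p-value.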

 



License


Introductory Statistics Copyright © 2024 by LOUIS: The Louisiana Library Network is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.