Chapter 11: The Chi-Square Distribution
11.4 Test for Homogeneity
Learning Objectives
By the end of this section, the student should be able to:
- Conduct and interpret chi-square homogeneity hypothesis tests.
The goodness-of-fit test can be used to decide whether a population fits a given distribution, but it will not suffice to decide whether two populations follow the same unknown distribution. A different test, called the test for homogeneity, can be used to draw a conclusion about whether two populations have the same distribution. To calculate the test statistic for a test for homogeneity, follow the same procedure as with the test of independence.
Note
The expected value for each cell needs to be at least five in order for you to use this test.
Hypotheses:
- [latex]H_0:[/latex] The distributions of the two populations are the same.
- [latex]H_a:[/latex] The distributions of the two populations are not the same.
Test Statistic:
Use a [latex]{\chi }^{2}[/latex] test statistic. It is computed in the same way as the test for independence.
Degrees of Freedom: [latex]df = \text{number of columns} - 1[/latex]
Requirements: All expected values in the table must be greater than or equal to five.
Common Uses: Comparing two populations. For example: men vs. women, before vs. after, east vs. west. The variable is categorical with more than two possible response values.
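The procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name `homogeneity_test_statistic` and the sample counts are hypothetical. It relies on the standard expected-count rule from the test of independence, expected = (row total)(column total) / (grand total).

```python
def homogeneity_test_statistic(observed):
    """Return (chi2, df, expected) for a table with one row per population."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under H0: (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # The test requires every expected count to be at least five.
    if min(min(row) for row in expected) < 5:
        raise ValueError("an expected count is below five")

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # With two populations (two rows) this reduces to number of columns - 1.
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
chi2, df, expected = homogeneity_test_statistic([[10, 20, 30], [30, 20, 10]])
print(chi2, df)
```

In this made-up table every expected count is 20, so the four cells that are off by 10 each contribute 100/20 = 5 to the statistic.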
Example
Do male and female college students have the same distribution of living arrangements? Suppose that 250 randomly selected male college students and 300 randomly selected female college students were asked about their living arrangements: dormitory, apartment, with parents, other. The results are shown in the table below. Test at a level of significance of 0.05.
| | Dormitory | Apartment | With Parents | Other |
|---|---|---|---|---|
| Males | 72 | 84 | 49 | 45 |
| Females | 91 | 86 | 88 | 35 |
Solution
[latex]H_0:[/latex] The distribution of living arrangements for male college students is the same as the distribution of living arrangements for female college students.
[latex]H_a:[/latex] The distribution of living arrangements for male college students is not the same as the distribution of living arrangements for female college students.
Degrees of Freedom: [latex]df = \text{number of columns} - 1 = 4 - 1 = 3[/latex]
Distribution for the test: [latex]{\chi }_{3}^{2}[/latex]
Calculate the test statistic: [latex]\chi^2 = 10.1287[/latex]
Probability statement: [latex]\text{p-value} = P(\chi^2 >10.1287) = 0.0175[/latex]
Use the probability table found in the Back Matter - Statistics Tables where needed.
Compare [latex]\alpha[/latex] and the [latex]\text{p-value}[/latex]: [latex]\alpha = 0.05[/latex] and the [latex]\text{p-value} = 0.0175[/latex], so [latex]\alpha > \text{p-value}[/latex].
Make a decision: Since [latex]\alpha > \text{p-value}[/latex], reject [latex]H_0[/latex]. The data suggest that the distributions are not the same.
Conclusion: At a 5% level of significance, from the data, there is sufficient evidence to conclude that the distributions of living arrangements for male and female college students are not the same.
Notice that the conclusion is only that the distributions are not the same. We cannot use the test for homogeneity to draw any conclusions about how they differ.
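The test statistic and p-value in this example can be reproduced with a short script. The sketch below is one possible way to check the arithmetic, not the method the solution prescribes. The variable names are illustrative, and the p-value line uses a closed form that holds only for three degrees of freedom.

```python
import math

males = [72, 84, 49, 45]      # dormitory, apartment, with parents, other
females = [91, 86, 88, 35]

col_totals = [m + f for m, f in zip(males, females)]
grand_total = sum(col_totals)

chi2 = 0.0
for row in (males, females):
    row_total = sum(row)
    for observed, col_total in zip(row, col_totals):
        # Expected count if both groups shared one distribution.
        expected = row_total * col_total / grand_total
        chi2 += (observed - expected) ** 2 / expected

# Right-tail area of a chi-square distribution with df = 3 (closed form for 3 df only).
half = chi2 / 2
p_value = math.erfc(math.sqrt(half)) + 2 * math.sqrt(half / math.pi) * math.exp(-half)

print(round(chi2, 4), round(p_value, 4))
```

The printed values should agree with the test statistic and p-value reported in the solution, up to rounding.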
Your Turn!
Do families and singles have the same distribution of cars? Suppose that 100 randomly selected families and 200 randomly selected singles were asked what type of car they drove: sport, sedan, hatchback, truck, van/SUV. The results are shown in the table below. Test at a level of significance of 0.05.
| | Sport | Sedan | Hatchback | Truck | Van/SUV |
|---|---|---|---|---|---|
| Family | 5 | 15 | 35 | 17 | 28 |
| Single | 45 | 65 | 37 | 46 | 7 |
Solution
With a p-value of almost zero, we reject the null hypothesis. The data show that the distribution of cars is not the same for families and singles.
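To see where a p-value of almost zero comes from, the following sketch first checks the expected-count requirement (one observed count is exactly 5, but the requirement concerns expected counts) and then computes the statistic. Names are illustrative, and the p-value formula is a closed form that holds only for four degrees of freedom.

```python
import math

family = [5, 15, 35, 17, 28]   # sport, sedan, hatchback, truck, van/SUV
single = [45, 65, 37, 46, 7]
observed = [family, single]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
smallest_expected = min(min(row) for row in expected)

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Right-tail area for df = 5 - 1 = 4 (closed form that holds for 4 df only).
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(smallest_expected >= 5, round(chi2, 2), p_value)
```

The smallest expected count is well above five, and the statistic is large enough that the right-tail area is far below 0.05.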
Example
Both before and after a recent earthquake, surveys were conducted asking voters which of the three candidates they planned on voting for in the upcoming city council election. The following table shows the results of the survey. Has there been a change in the distribution of voter preferences since the earthquake? Use a level of significance of 0.05.
| | Perez | Chung | Stevens |
|---|---|---|---|
| Before | 167 | 128 | 135 |
| After | 214 | 197 | 225 |
Solution
[latex]H_0[/latex]: The distribution of voter preferences was the same before and after the earthquake.
[latex]H_a[/latex]: The distribution of voter preferences was not the same before and after the earthquake.
Degrees of Freedom: [latex]df = \text{number of columns} - 1 = 3 - 1 = 2[/latex]
Distribution for the test: [latex]{\chi }_{2}^{2}[/latex]
Calculate the test statistic: [latex]\chi^2 = 3.2603[/latex]
Probability statement: [latex]\text{p-value}=P(\chi^2 > 3.2603) = 0.1959[/latex]
Compare [latex]\alpha[/latex] and the p-value: [latex]\alpha = 0.05[/latex] and the [latex]\text{p-value} = 0.1959[/latex], so [latex]\alpha \lt \text{p-value}[/latex].
Make a decision: Since [latex]\alpha \lt \text{p-value}[/latex], do not reject [latex]H_0[/latex].
Conclusion: At a 5% level of significance, from the data, there is insufficient evidence to conclude that the distribution of voter preferences was not the same before and after the earthquake.
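The same decision can be reached by comparing the test statistic with a critical value instead of comparing the p-value with [latex]\alpha[/latex]. The sketch below is an alternative check, not part of the solution above. It relies on the fact that, for two degrees of freedom only, the right-tail area is [latex]e^{-x/2}[/latex]; the variable names are illustrative.

```python
import math

chi2 = 3.2603   # test statistic from the example
alpha = 0.05

# For df = 2 only, the right-tail area is P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Inverting that formula gives the critical value for df = 2.
critical_value = -2 * math.log(alpha)

reject_h0 = chi2 > critical_value
print(round(p_value, 4), round(critical_value, 4), reject_h0)
# prints: 0.1959 5.9915 False
```

Because the test is right-tailed, a statistic below the critical value and a p-value above [latex]\alpha[/latex] are two views of the same decision: do not reject [latex]H_0[/latex].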
Your Turn!
Ivy League schools receive many applications, but only some can be accepted. At the schools listed in the table below, two types of applications are accepted: regular and early decision.
| Application Type Accepted | Brown | Columbia | Cornell | Dartmouth | Penn | Yale |
|---|---|---|---|---|---|---|
| Regular | 2,115 | 1,792 | 5,306 | 1,734 | 2,685 | 1,245 |
| Early Decision | 577 | 627 | 1,228 | 444 | 1,195 | 761 |
We want to know if the number of regular applications accepted follows the same distribution as the number of early applications accepted. State the null and alternative hypotheses, the degrees of freedom and the test statistic, sketch the graph of the p-value, and draw a conclusion about the test of homogeneity.
Solution
[latex]H_0:[/latex] The distribution of regular applications accepted is the same as the distribution of early applications accepted.
[latex]H_a[/latex]: The distribution of regular applications accepted is not the same as the distribution of early applications accepted.
[latex]df = 5[/latex]
[latex]\chi^2 \text{ test statistic} = 430.06[/latex]
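The statistic above can be checked, and a p-value attached to it, with the sketch below. The helper `chi_square_right_tail` is a hypothetical name. It builds the right-tail area for any whole number of degrees of freedom by starting from the known one- or two-degree case and stepping up two degrees at a time, which is one of several ways to evaluate that area.

```python
import math


def chi_square_right_tail(x, df):
    """P(chi-square with df degrees of freedom > x), for x > 0 and integer df >= 1."""
    half = x / 2
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1   # exact tail for df = 1
    else:
        tail, k = math.exp(-half), 2              # exact tail for df = 2
    # Step up two degrees of freedom at a time until df is reached.
    while k < df:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


# Brown, Columbia, Cornell, Dartmouth, Penn, Yale
regular = [2115, 1792, 5306, 1734, 2685, 1245]
early = [577, 627, 1228, 444, 1195, 761]
observed = [regular, early]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = sum(
    (o - r * c / grand_total) ** 2 / (r * c / grand_total)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, col_totals)
)
df = (len(observed) - 1) * (len(col_totals) - 1)
p_value = chi_square_right_tail(chi2, df)

print(round(chi2, 2), df, p_value)
```

With a statistic this large on five degrees of freedom, the right-tail area is effectively zero, so at any conventional significance level the null hypothesis would be rejected.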

Section 11.4 Review
To assess whether two data sets are derived from the same distribution (which need not be known), you can apply the test for homogeneity that uses the chi-square distribution. The null hypothesis for this test states that the populations of the two data sets come from the same distribution. The test compares the observed values against the values that would be expected if the two populations followed the same distribution. The test is right-tailed. Each cell must have an expected value of at least five.
Formula Review
- Homogeneity test statistic: [latex]\sum_{i \cdot j}\frac{(O-E)^{2}}{E}[/latex], where:
  - [latex]O =[/latex] observed values
  - [latex]E =[/latex] expected values
  - [latex]i =[/latex] number of rows in data contingency table
  - [latex]j =[/latex] number of columns in data contingency table
- Degrees of freedom: [latex]df = (i - 1)(j - 1)[/latex]
Section 11.4 Practice
A math teacher wants to see if two of her classes have the same distribution of test scores. What test should she use?
Solution
test for homogeneity
A market researcher wants to see if two different stores have the same distribution of sales throughout the year. What type of test should he use?
Solution
test for homogeneity
A meteorologist wants to know if East and West Australia have the same distribution of storms. What type of test should she use?
What condition must be met to use the test for homogeneity?
Solution
All expected values in the table must be greater than or equal to five.
Do private practice doctors and hospital doctors have the same distribution of working hours? Suppose that samples of 100 private practice doctors and 150 hospital doctors are selected at random and asked how many hours a week they work. The results are shown in the table below.
| | 20–30 | 30–40 | 40–50 | 50–60 |
|---|---|---|---|---|
| Private Practice | 16 | 40 | 38 | 6 |
| Hospital | 8 | 44 | 59 | 39 |
- State the null and alternative hypotheses.
- [latex]df = \underline{\hspace{2cm}}[/latex]
- What is the test statistic?
- What is the p-value?
- What can you conclude at the 5% significance level?
A psychologist is interested in testing whether there is a difference in the distribution of personality types for business majors and social science majors. The results of the study are shown in the following table. Conduct a test of homogeneity. Test at a 5% level of significance.
| | Open | Conscientious | Extrovert | Agreeable | Neurotic |
|---|---|---|---|---|---|
| Business | 41 | 52 | 46 | 61 | 58 |
| Social Science | 72 | 75 | 63 | 80 | 65 |
Use the Chi-Square Distribution - Solution Sheet on the Introduction to Chapter 11 page. (Round expected frequency to two decimal places.)
Solution
- [latex]H_0:[/latex] The distribution for personality types is the same for both majors.
- [latex]H_a:[/latex] The distribution for personality types is not the same for both majors.
- [latex]df = 4[/latex]
- chi-square with [latex]df = 4[/latex]
- test statistic = 3.01
- p-value = 0.5568
- Check student’s solution.
- Correct decision (“reject” or “do not reject” the null hypothesis) and appropriate conclusions:
  - Alpha: 0.05
  - Decision: Do not reject the null hypothesis.
  - Reason for decision: p-value > alpha
  - Conclusion: There is insufficient evidence to conclude that the distribution of personality types is different for business and social science majors.
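The exercise asks for expected frequencies rounded to two decimal places. The sketch below follows that instruction to check the solution's figures. It is one way to verify the arithmetic, with illustrative variable names and a p-value formula that holds only for four degrees of freedom.

```python
import math

# open, conscientious, extrovert, agreeable, neurotic
business = [41, 52, 46, 61, 58]
social_science = [72, 75, 63, 80, 65]
observed = [business, social_science]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequencies rounded to two decimals, as the exercise asks.
expected = [
    [round(r * c / grand_total, 2) for c in col_totals] for r in row_totals
]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Right-tail area for df = 4 (closed form that holds for 4 df only).
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(round(chi2, 2), round(p_value, 4))
```

Small differences in the last digit of the p-value can arise from rounding the expected frequencies before summing.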
Do men and women select different breakfasts? The breakfasts ordered by randomly selected men and women at a popular breakfast place are shown in the table below. Conduct a test for homogeneity at a 5% level of significance.
| | French Toast | Pancakes | Waffles | Omelets |
|---|---|---|---|---|
| Men | 47 | 35 | 28 | 53 |
| Women | 65 | 59 | 55 | 60 |
Use the Chi-Square Distribution - Solution Sheet on the Introduction to Chapter 11 page. (Round expected frequency to two decimal places.)
A fisherman is interested in whether the distribution of fish caught in Green Valley Lake is the same as the distribution of fish caught in Echo Lake. Of the 191 randomly selected fish caught in Green Valley Lake, 105 were rainbow trout, 27 were other trout, 35 were bass, and 24 were catfish. Of the 293 randomly selected fish caught in Echo Lake, 115 were rainbow trout, 58 were other trout, 67 were bass, and 53 were catfish. Perform a test for homogeneity at a 5% level of significance.
Use the Chi-Square Distribution - Solution Sheet on the Introduction to Chapter 11 page. (Round expected frequency to two decimal places.)
Solution
- [latex]H_0:[/latex] The distribution for fish caught is the same in Green Valley Lake and in Echo Lake.
- [latex]H_a:[/latex] The distribution for fish caught is not the same in Green Valley Lake and in Echo Lake.
- [latex]df = 3[/latex]
- chi-square with [latex]df = 3[/latex]
- test statistic = 11.75
- p-value = 0.0083
- Check student’s solution.
- Correct decision (“reject” or “do not reject” the null hypothesis) and appropriate conclusions:
  - Alpha: 0.05
  - Decision: Reject the null hypothesis.
  - Reason for decision: p-value < alpha
  - Conclusion: There is sufficient evidence to conclude that the distribution of fish caught is different in Green Valley Lake and in Echo Lake.