4.1 Probability Distribution Function (PDF) for a Discrete Random Variable

Jared Eusea; Phyllis Okwan; Rachid Belmasrour; Stephan Patterson; Stephen Andrus

Chapter 4: Discrete Random Variables

Learning Objectives

By the end of this section, you should be able to:

Validate the properties of a discrete probability distribution function
Write a probability distribution function for given discrete data

Random Variables

Random variables are probability models that quantify the outcomes of random situations. Upper case letters such as [latex]X[/latex] or [latex]Y[/latex] denote a random variable. Lower case letters like [latex]x[/latex] or [latex]y[/latex] denote the value of a random variable. If [latex]X[/latex] is a random variable, then [latex]X[/latex] is described in words, and [latex]x[/latex] is given as a number.

For example, let [latex]X[/latex] = the number of heads you get when you toss three fair coins. The sample space for the toss of three fair coins is TTT; THH; HTH; HHT; HTT; THT; TTH; HHH. Then, in these outcomes, we can have no heads (when TTT occurs) or up to three heads (when HHH occurs). Therefore [latex]x = 0, 1, 2,[/latex] and [latex]3[/latex].

Observe again that [latex]X[/latex] is in words and [latex]x[/latex] is a number. For this example, [latex]x[/latex] can take only four possible values, so the outcomes are countable. Because you can count the possible values that [latex]X[/latex] can take on and the outcomes are random, [latex]X[/latex] is a discrete random variable.
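The three-coin example can be sketched in a few lines of Python. This is only an illustration: the names `sample_space`, `heads_by_outcome`, and `possible_values` are made up here and do not come from the text.

```python
from itertools import product

# Enumerate every outcome of tossing three fair coins (H = heads, T = tails).
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

# The random variable X maps each outcome to its number of heads.
heads_by_outcome = {outcome: outcome.count("H") for outcome in sample_space}

# The distinct values x that X can take on.
possible_values = sorted(set(heads_by_outcome.values()))

print(sample_space)      # the eight outcomes
print(possible_values)   # [0, 1, 2, 3]
```

Listing the outcomes this way shows why the values are countable: there are eight outcomes, and each maps to one of four whole numbers.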

We have seen the word discrete before associated with types of data: discrete means we have a countable number of outcomes. So a discrete random variable is a random variable that models a process or experiment that produces discrete data.

There are both continuous and discrete random variables. We will begin with discrete random variables and examine continuous random variables in the next chapter.

Characteristics and Notation

The probability mass function (PMF) of a discrete random variable tells you the probability of each value the random variable can take. The probability that random variable [latex]X[/latex] takes on value [latex]x[/latex] is represented by [latex]P(X = x)[/latex] or just [latex]P(x)[/latex]. The PMF is also sometimes called the probability distribution function (PDF).

The distribution of a discrete random variable is often pictured in a table, but may also be represented by a graph or formula. The two main characteristics the PMF should exhibit are:

Each probability is between zero and one, inclusive.
The sum of the probabilities is one.
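The two characteristics above can be checked mechanically. The following is a minimal sketch, assuming the distribution is stored as a dictionary that maps each value x to P(x). The function name `is_valid_pmf` and both sample distributions are hypothetical and exist only for illustration.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability is in [0, 1] and they sum to exactly 1."""
    probabilities = list(pmf.values())
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    sums_to_one = sum(probabilities) == 1
    return each_in_range and sums_to_one

# Hypothetical distributions, made up for illustration only.
valid_example = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
invalid_example = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 2)}

print(is_valid_pmf(valid_example))    # True
print(is_valid_pmf(invalid_example))  # False: the probabilities sum to 5/4
```

Exact fractions are used so that the sum can be compared to 1 without rounding error. With decimal probabilities, a tolerance-based comparison such as `math.isclose` is usually the safer choice.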

Example

A child psychologist is interested in the number of times a newborn baby's crying wakes one of their parents after midnight. For a random sample of 50 families, the following information was obtained. Let X = the number of times per week a newborn baby's crying wakes one of their parents after midnight. In the sample, the largest number of times observed during a week was 5, so x = 0, 1, 2, 3, 4, 5.

P(x) = probability that X takes on a value x.

Figure 4.2: Newborn Baby Crying
x	P(x)
0	P(0) = [latex]\frac{2}{50}[/latex]
1	P(1) = [latex]\frac{11}{50}[/latex]
2	P(2) = [latex]\frac{23}{50}[/latex]
3	P(3) = [latex]\frac{9}{50}[/latex]
4	P(4) = [latex]\frac{4}{50}[/latex]
5	P(5) = [latex]\frac{1}{50}[/latex]

Is this a valid discrete probability distribution?

Solution

[latex]X[/latex] takes on the values 0, 1, 2, 3, 4, 5. This is a valid discrete PDF because:

Each [latex]P(x)[/latex] is between zero and one, inclusive.
The sum of the probabilities is one, that is,

[latex]\frac{2}{50} + \frac{11}{50} + \frac{23}{50} + \frac{9}{50} + \frac{4}{50} + \frac{1}{50} = 1[/latex]

Your Turn!

A hospital researcher is interested in the number of times the average post-op patient will ring the nurse during a 12-hour shift. For a random sample of 50 patients, the following information was obtained. Let X = the number of times a patient rings the nurse during a 12-hour shift. The largest number of patient calls was 5, so x = 0, 1, 2, 3, 4, 5.

Figure 4.3: Post-Op Patients
x	P(x)
0	P(0) = [latex]\frac{4}{50}[/latex]
1	P(1) = [latex]\frac{8}{50}[/latex]
2	P(2) = [latex]\frac{16}{50}[/latex]
3	P(3) = [latex]\frac{14}{50}[/latex]
4	P(4) = [latex]\frac{6}{50}[/latex]
5	P(5) = [latex]\frac{2}{50}[/latex]

Why is this a discrete probability distribution function (two reasons)?

Solution

Each [latex]P(x)[/latex] is between 0 and 1, inclusive.

The sum of the probabilities is 1, that is: [latex]\frac{4}{50} + \frac{8}{50} + \frac{16}{50} + \frac{14}{50} + \frac{6}{50} + \frac{2}{50} = 1[/latex].

The cumulative distribution function (CDF) of a discrete random variable tells you the probability of being less than or equal to a value, denoted [latex]P(X \leq x)[/latex].
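As a rough illustration, the CDF can be built from a PMF by keeping a running total of the probabilities. The sketch below uses the post-op patient table from this section. The variable names are illustrative, not part of the text.

```python
from fractions import Fraction
from itertools import accumulate

# PMF from the post-op patient table: counts out of 50 patients.
counts = {0: 4, 1: 8, 2: 16, 3: 14, 4: 6, 5: 2}
pmf = {x: Fraction(n, 50) for x, n in counts.items()}

# P(X <= x) is the running total of P(x) over the values in increasing order.
values = sorted(pmf)
cdf = dict(zip(values, accumulate(pmf[x] for x in values)))

for x in values:
    print(f"P(X <= {x}) = {cdf[x]}")
```

The fractions print in lowest terms, so for example [latex]P(X \leq 2) = \frac{28}{50}[/latex] appears as 14/25. The last CDF value is 1 because the probabilities of a valid distribution sum to one.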

A probability distribution function is a pattern. You try to fit a probability problem into a pattern or distribution in order to perform the necessary calculations. These distributions are tools to make solving probability problems easier. Each distribution has its own special characteristics. Learning the characteristics enables you to distinguish among the different distributions.

Example

Suppose Nancy has classes three days a week. She attends classes three days a week 80% of the time, two days 15% of the time, one day 4% of the time, and no days 1% of the time. Suppose one week is randomly selected.

a. Let X = the number of days Nancy [latex]\underline{\hspace{2cm}}[/latex].

Solution

attends classes per week

b. [latex]X[/latex] takes on what values?

Solution

0, 1, 2, and 3

c. Suppose one week is randomly chosen. Construct a probability distribution table (called a PDF table) like the one below. The table should have two columns labeled x and P(x). What does the [latex]P(x)[/latex] column sum to?

Figure 4.4: Blank PDF
x	P(x)
0
1
2
3

Solution

Figure 4.4: Complete
x	P(x)
0	0.01
1	0.04
2	0.15
3	0.80

The [latex]P(x)[/latex] column sums to [latex]0.01 + 0.04 + 0.15 + 0.80 = 1[/latex].
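One way to build and check Nancy's table programmatically is sketched below. It assumes the percentages are entered as decimals. The dictionary layout is just one possible representation.

```python
import math

# Nancy's attendance distribution: x = days attended, P(x) as a decimal.
pmf = {0: 0.01, 1: 0.04, 2: 0.15, 3: 0.80}

# Print the two-column PDF table.
print("x\tP(x)")
for x, p in pmf.items():
    print(f"{x}\t{p:.2f}")

total = sum(pmf.values())
# Floating-point sums can be off by a tiny rounding error,
# so compare with a tolerance instead of using ==.
print(math.isclose(total, 1.0))
```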

Your Turn!

Jeremiah has basketball practice two days a week. Ninety percent of the time, he attends both practices. Eight percent of the time, he attends one practice. Two percent of the time, he does not attend either practice. What is X and what values does it take on?

Section 4.1 Review

The characteristics of a probability distribution function (PDF) for a discrete random variable are as follows:

Each probability is between zero and one, inclusive (inclusive means to include zero and one).
The sum of the probabilities is one.

Section 4.1 Practice

Use the following information to answer the next four exercises. A baker is deciding how many batches of muffins to make to sell in his bakery. He wants to make enough to sell every one and no fewer. Through observation, the baker has established a probability distribution.

x	P(x)
1	0.15
2	0.35
3	0.40
4	0.10

1. Define the random variable [latex]X[/latex].

Solution

Let [latex]X =[/latex] the number of batches that the baker will sell.

2. What is the probability the baker will sell more than one batch? [latex]P(x > 1) =\underline{\hspace{2cm}}[/latex]

Solution

[latex]0.35 + 0.40 + 0.10 = 0.85[/latex]

3. What is the probability the baker will sell exactly one batch? [latex]P(x = 1) = \underline{\hspace{2cm}}[/latex]

Solution

0.15

4. On average, how many batches should the baker make?

Solution

[latex]1(0.15) + 2(0.35) + 3(0.40) + 4(0.10) = 0.15 + 0.70 + 1.20 + 0.40 = 2.45[/latex]
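The baker exercises can be double-checked with a short script. This is a sketch under the assumption that the table is stored as a dictionary. The weighted average computed in exercise 4 is commonly called the expected value, a term this section has not yet introduced, and the name `expected_batches` is illustrative.

```python
import math

# Baker's distribution: x = batches sold, P(x) as a decimal.
pmf = {1: 0.15, 2: 0.35, 3: 0.40, 4: 0.10}

# Exercise 2: add the probabilities of every value greater than one.
p_more_than_one = sum(p for x, p in pmf.items() if x > 1)

# Exercise 3: read the probability straight from the table.
p_exactly_one = pmf[1]

# Exercise 4: weight each value by its probability and add the products.
expected_batches = sum(x * p for x, p in pmf.items())

print(round(p_more_than_one, 2))
print(p_exactly_one)
print(round(expected_batches, 2))
```

The results are rounded before printing because sums of decimal floats can carry small rounding errors.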

Use the following information to answer the next three exercises. Ellen has music practice three days a week. She practices for all of the three days 85% of the time, two days 8% of the time, one day 4% of the time, and no days 3% of the time. One week is selected at random.

1. Define the random variable [latex]X[/latex].

Solution

Let [latex]X =[/latex] the number of days Ellen attends practice per week.

2. Construct a probability distribution table for the data.

Solution

x	P(x)
0	0.03
1	0.04
2	0.08
3	0.85

3. We know that for a discrete probability distribution function to be valid, it must have two characteristics. One is that the sum of the probabilities is one. What is the other characteristic?

Solution

Each probability is between zero and one, inclusive.

Use the following information to answer the next five exercises. Javier volunteers in community events each month. He does not do more than five events in a month. He attends exactly five events 35% of the time, four events 25% of the time, three events 20% of the time, two events 10% of the time, one event 5% of the time, and no events 5% of the time.

1. Define the random variable [latex]X[/latex].

Solution

Let [latex]X =[/latex] the number of events Javier volunteers for each month.

2. What values does [latex]x[/latex] take on?

Solution

0, 1, 2, 3, 4, 5

3. Construct a PDF table.

Solution

x	P(x)
0	0.05
1	0.05
2	0.10
3	0.20
4	0.25
5	0.35

4. Find the probability that Javier volunteers for fewer than three events each month.

Solution

[latex]P(x \lt 3) = 0.05 + 0.05 + 0.10 = 0.20[/latex]

5. Find the probability that Javier volunteers for at least one event each month.

Solution

[latex]P(x > 0) = 1 - P(0) = 1 - 0.05 = 0.95[/latex]
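The event probabilities in exercises 4 and 5 can be checked numerically. The sketch below computes "at least one event" two ways, by adding the probabilities directly and by subtracting P(0) from one, to show that both routes agree. The variable names are illustrative.

```python
import math

# Javier's distribution: x = events per month, P(x) as a decimal.
pmf = {0: 0.05, 1: 0.05, 2: 0.10, 3: 0.20, 4: 0.25, 5: 0.35}

# Exercise 4: fewer than three events means x = 0, 1, or 2.
p_fewer_than_three = sum(p for x, p in pmf.items() if x < 3)

# Exercise 5, direct route: add P(1) through P(5).
p_at_least_one_direct = sum(p for x, p in pmf.items() if x >= 1)

# Exercise 5, complement route: one minus the probability of no events.
p_at_least_one_complement = 1 - pmf[0]

print(round(p_fewer_than_three, 2))
print(round(p_at_least_one_direct, 2))
print(math.isclose(p_at_least_one_direct, p_at_least_one_complement))
```

The complement route needs only one table entry, which is why it is often the quicker calculation for "at least one" questions.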

Suppose that the PDF for the number of years it takes to earn a Bachelor of Science (B.S.) degree is given in the table below.

x	P(x)
3	0.05
4	0.40
5	0.30
6	0.15
7	0.10

In words, define the random variable [latex]X[/latex].
What does it mean that the values zero, one, and two are not included for [latex]x[/latex] in the PDF?

License

Creative Commons Attribution-ShareAlike 4.0 International License
