<- sample(0:3, size = 10000, replace = TRUE, prob = c(1/4, 1/2, 1/8, 1/8))
x mean(x == 1)
[1] 0.5007
Let X be a discrete random variable with probability mass function given by
p(x) = \begin{cases} \frac{1}{4} & x = 0 \\ \frac{1}{2} & x = 1 \\ \frac{1}{8} & x = 2 \\ \frac{1}{8} & x = 3 \\ \end{cases}
To be a valid probability function p(x) must satisfy the following conditions:
The first condition is satisfied by construction. The second condition is satisfied by the following calculation:
\sum_{x} p(x) = \frac{1}{4} + \frac{1}{2} + \frac{1}{8} + \frac{1}{8} = 1
We can use the union rule as:
P(X \geq 2) = P(X = 2) + P(X = 3)
P(X \geq 2) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4}
We can use conditional probability as:
P(X \geq 2 | X \geq 1) = \frac{P(X \geq 2 \cap X \geq 1)}{P(X \geq 1)}
Since X \geq 2 is a subset of X \geq 1, we have that:
P(X \geq 2 \cap X \geq 1) = P(X \geq 2)
And Finally:
P(X \geq 2 | X \geq 1) = \frac{\frac{1}{4}}{\frac{3}{4}} = \frac{1}{3}
We can use the union rule as:
P(X \geq 1 \cup X \geq 2) = P(X \geq 1) + P(X \geq 2) \\ - P(X \geq 1 \cap X \geq 2)
Then, similarly to part (c), we can use the intersection rule as:
P(X \geq 1 \cap X \geq 2) = (X \geq 2)
P(X \geq 1 \cap X \geq 2) = \frac{1}{8} + \frac{1}{8} = \frac{1}{4}
And finally:
P(X \geq 1 \cup X \geq 2) = \frac{3}{4} + \frac{1}{4} - \frac{1}{4} = \frac{3}{4}
Let X be a discrete random variable with probability mass function given by
p(x) = \begin{cases} \frac{1}{4} & x = 0 \\ \frac{1}{2} & x = 1 \\ \frac{1}{8} & x = 2 \\ \frac{1}{8} & x = 3 \\ \end{cases}
sample
to create a sample of size 10000 from X and estimate $P(X = 1) from your sample. Your result should be close to 1/2.table
on your sample from part (a) to estimate the pmf of X from your sample. Your result should be similar to the pmf given in the problem.<- sample(0:3, size = 10000, replace = TRUE, prob = c(1/4, 1/2, 1/8, 1/8))
x mean(x == 1)
[1] 0.5007
<- sample(0:3, size = 10000, replace = TRUE, prob = c(1/4, 1/2, 1/8, 1/8))
x table(x) / 10000
x
0 1 2 3
0.2513 0.4970 0.1206 0.1311
Let X be a discrete random variable with probability mass function given by
p(x) = \begin{cases} \frac{C}{4} & x = 0 \\ \frac{C}{2} & x = 1 \\ C & x = 2 \\ 0 & \text{otherwise} \end{cases}
Find the value of C that makes p(x) is a probability mass function.
\sum_{x} p(x) = \frac{C}{4} + \frac{C}{2} + C = 1
C = \frac{4}{7}
Give an example of a probability mass function p whose associated random variable has mean 0.
By definition, the mean of a random variable is:
\mu = \sum_{x} x p(x)
We know the pmf is positive and the sum of all probabilities is 1, so we can choose any symetric random variable over the real numbers, with equal probabilities for both sides.
p(x) = \begin{cases} \frac{1}{4} & x = -2 \\ \frac{1}{4} & x = -1 \\ \frac{1}{4} & x = 1 \\ \frac{1}{4} & x = 2 \\ 0 & \text{otherwise} \end{cases}
The mean is:
\mu = -2 \frac{1}{4} + -1 \frac{1}{4} + 1 \frac{1}{4} + 2 \frac{1}{4} = 0
Find the mean of the random variable X given in Exercise 3.1.
\mu = \sum_{x} x p(x) = 0 \frac{1}{4} + 1 \frac{1}{2} + 2 \frac{1}{8} + 3 \frac{1}{8} = \frac{9}{8}
Let X be a random variable with pmf given by p(x) = 1/10 for x = 1, \ldots, 10 and p(x) = 0 for all other values of x. Find E[X] and confirm your answer using simulation.
E[x] = \sum_{x} x p(x) = 1 \frac{1}{10} + 2 \frac{1}{10} + \ldots + 10 \frac{1}{10} = 5.5
<- sample(1:10, size = 10000, replace = TRUE, prob = rep(1/10, 10))
x mean(x)
[1] 5.5081
Suppose you roll two ordinary dice. Calculate the expected value of their product.
First we need to find the pmf of the product of two dice:
Product | Outcomes | Probability |
---|---|---|
1 | (1, 1) | 1/36 |
2 | (1, 2), (2, 1) | 2/36 |
3 | (1, 3), (3, 1) | 2/36 |
4 | (1, 4), (4, 1), (2, 2) | 3/36 |
5 | (1, 5), (5, 1) | 2/36 |
6 | (1, 6), (6, 1), (2, 3), (3, 2) | 4/36 |
8 | (2, 4), (4, 2) | 2/36 |
9 | (3, 3) | 1/36 |
10 | (2, 5), (5, 2) | 2/36 |
12 | (3, 4), (4, 3), (6, 2), (2, 6) | 4/36 |
15 | (3, 5), (5, 3) | 2/36 |
16 | (4, 4) | 1/36 |
18 | (6, 3), (3, 6) | 2/36 |
20 | (5, 4), (4, 5) | 2/36 |
24 | (6, 4), (4, 6) | 2/36 |
25 | (5, 5) | 1/36 |
30 | (6, 5), (5, 6) | 2/36 |
36 | (6, 6) | 1/36 |
And then we can calculate the expected value:
E[x] = 1 \frac{1}{36} + 2 \frac{2}{36} + \ldots + 36 \frac{1}{36} = 12.25
We can also use simulation to confirm our answer:
<- sample(1:6, size = 10000, replace = TRUE)
dice_1 <- sample(1:6, size = 10000, replace = TRUE)
dice_2 mean(dice_1 * dice_2)
[1] 12.2766
Suppose that a hat contains slips of papers containing the numbers 1, 2, and 3. Two slips of paper are drawn without replacement. Calculate the expected value of the product of the numbers on the slips of paper.
We need to find the pmf:
Product | Outcomes | Probability |
---|---|---|
2 | (2, 1), (1, 2) | 2/6 |
3 | (3, 1), (1, 3) | 2/6 |
6 | (3, 2), (2, 3) | 2/6 |
And then we can calculate the expected value:
E[x] = 2 \frac{2}{6} + 3 \frac{2}{6} + 6 \frac{2}{6} = \frac{22}{6}
We can also use simulation to confirm our answer:
<- replicate(100000, {
event <- sample(1:3, size = 2, replace = FALSE)
hat prod(hat)
})mean(event)
[1] 3.66855
Pick an integer from 0 to 999 with all possible numbers equally likely. What is the expected number of digits in your number?
We need to find the pmf:
Number of digits | Outcomes | Probability |
---|---|---|
1 | 0-9 | 10/1000 |
2 | 10-99 | 90/1000 |
3 | 100-999 | 900/1000 |
And then we can calculate the expected value:
E[x] = 1 \frac{10}{1000} + 2 \frac{90}{1000} + 3 \frac{900}{1000} = 2.89
We can also use simulation to confirm our answer:
<- sample(0:999, size = 100000, replace = TRUE)
number mean(nchar(as.character(number)))
[1] 2.88844
In the summer of 2020, the U.S. was considering pooled testing of COVID-19. This problem explores the math behind pooled testing. Since the availability of tests is limited, the testing center proposes the following pooled testing technique:
Suppose in a certain population, 5% of the people being tested for COVID-19 actually have COVID-19. Let X be the total number of tests that are run in order to test two randomly selected people.
To find the pmf of X, we need to determine the probability of each possible value of X:
First, consider the case where neither of the two individuals selected has COVID-19. The probability that the first individual selected does not have COVID-19 is 0.95, and the probability that the second individual selected does not have COVID-19, given that the first individual does not have COVID-19, is also 0.95. Therefore, the probability that neither individual has COVID-19 is:
P = 0.95 \cdot 0.95 = 0.9025
In this case, only one test is needed, since the combined sample will test negative and both individuals can be assumed negative. So, in this case, X = 1 with probability 0.9025.
Now, consider the case where one or both of the individuals selected has COVID-19. The probability that at least one has COVID-19 is the complement of the probability that neither has COVID-19: P = 1 - 0.9025 = 0.0975
In this case, the combined sample will test positive, and both individuals will need to be retested. Since each individual will need to be tested once more, a total of three tests will be needed. So, in this case, X = 3 with probability 0.0975.
X | Probability |
---|---|
1 | 0.9025 |
3 | 0.0975 |
E[x] = 1 \cdot 0.9025 + 3 \cdot 0.0975 = 1.195
To find the pmf of Y, we need to determine the probability of each possible value of Y:
P = 0.95 \cdot 0.95 \cdot 0.95 = 0.857375
In this case, only one test is needed, since the combined sample will test negative and all three individuals can be assumed negative. So, in this case, Y = 1 with probability 0.857375.
Now consider the case where one or more of the individuals selected has COVID-19. The probability that at least one individual has COVID-19 is:
P = 1 - 0.857375 = 0.142625
In this case, the combined sample will test positive, and all three individuals will need to be retested. Since each individual will need to be tested once more, a total of four tests will be needed. So, in this case, Y = 4 with probability 0.142625.
Y | Probability |
---|---|
1 | 0.857375 |
4 | 0.142625 |
And the expected value of Y is:
E[y] = 1 \cdot 0.857375 + 4 \cdot 0.142625 = 1.442
Since the expected value of X is 1.195 and the expected value of Y is 1.442, we would recommend the technique that minimizes the expected number of tests, which is the technique that tests two samples at a time.
A roulette wheel has 38 slots and a ball that rolls until it falls into one of the slots, all of which are equally likely. Eighteen slots are black numbers, eighteen are red numbers, and two are green zeros. If you bet on “red” and the ball lands in a red slot, the casino pays you your bet; otherwise, the casino wins your bet.
We need to find the probability of winning, which is the probability that the ball lands in a red slot. There are 18 red slots, so the probability of winning is:
P = \frac{18}{38} = 0.4736
The pmf of the random variable X is:
X | Probability |
---|---|
-1 | 0.5263 |
1 | 0.4736 |
And the expected value of X is:
E[x] = -1 \cdot 0.5263 + 1 \cdot 0.4736 = -0.0527
Under this betting plan, there are three possible outcomes:
The probability of each of these outcomes can be calculated as follows:
P = \frac{18}{38} \cdot \frac{18}{38} = 0.2243
The pmf of the random variable X is:
X | Probability |
---|---|
-2 | 0.2493 |
-1 | 0.5263 |
3 | 0.2243 |
And the expected value of X is:
E[x] = -2 \cdot 0.2493 - 1 \cdot 0.5263 + 3 \cdot 0.2243 = -0.352
One (questionable) roulette strategy is called bet doubling: You bet $1 on red, and if you win, you pocket the $1. If you lose, you double your bet so you are now betting $2 on red, but have lost $1. If you win, you win $2 for a $1 profit, which you pocket. If you lose again, you double your bet to $4 (having already lost $3). Again, if you win, you have $1 profit, and if you lose, you double your bet again. This guarantees you will win $1, unless you run out of money to keep doubling your bet.
The maximum number of bets you can lose in a row without losing your bankroll is 7.
The proability of losing all 7 bets is (\frac{20}{38})^7. Thus the probability of winning at least once is:
P = 1 - (\frac{20}{38})^7 = 0.9881
Based on the previous part, we can built the pmf of the random variable X:
X | Probability |
---|---|
-127 | 0.0119 |
1 | 0.9881 |
And the expected value of X is:
E[x] = -127 \cdot 0.0119 + 1 \cdot 0.9881 = - 0.5232
we can simulate this in R as follows:
<- replicate(10000 , {
event <- 127
bankroll <- 0
num_rounds <- 1
nth_bet while (bankroll > 0) {
<- num_rounds + 1
num_rounds <- sample(x = c(0, 1), size = 1, prob = c(20/38, 18/38))
bet_result if (bet_result == 0) {
<- bankroll - 2^(nth_bet - 1)
bankroll <- nth_bet + 1
nth_bet else {
} <- bankroll + 2^(nth_bet - 1)
bankroll <- 1
nth_bet
}
}
num_rounds
})
mean(event)
[1] 991.8872
Flip a fair coin 10 times and let X be the proportion of times that a head is followed by another head. Discard the sequence of ten tosses if you don’t obtain a head in the first nine tosses. What is the expected value of X? (Note: this is not asking you to estimate the conditional probability of getting heads given that you obtained heads).
Let X_{j} be the indicator random variable of HH appearing at position j for j = 1, 2, \dots, 9. These events aren’t quite independent from each other, so the binomial random variable does not accurante describe the distribution of X. However, we can still use the Binomial random variable to calculate the expected value of X, because the expectation is linear.
Getting two heads in a row has probability equal to p = 1/4. So, we look at HH is our success outcome and the rest \{HT, TH, TT \} consist of the failures outcomes. Since we are discarding the sequence of ten tosses if we don’t obtain a head in the first nine tosses, we have n = 9 and p = 1/4. Thus, the expected value of X is:
E[X] = n \cdot p = 9 \cdot \frac{1}{4} = 2.25
To play the Missouri lottery Pick 3, you choose three digits 0-9 in order. Later, the lottery selects three digits at random, and you win if your choices match the lottery values in some way. Here are some possible bets you can play:
Consider the value of your bet to be your expected winnings per dollar bet. What value do each of these bets have?
$1 Straight wins if you correctly guess all three digits, in order. It pays $600.
The total amount of money you can win is $(600 - 1). The total amount of money you can lose is $1. The proability of winning is \frac{1}{n} \cdot \frac{1}{n} \cdot \frac{1}{n} where n=10 and the probability of losing is the complement. Thus, the expected value of this bet is:
E[X] = (600 - 1) \cdot \frac{1}{1000} - 1 \cdot \left(1 - \frac{1}{1000}\right) = -0.4
We can also simulate this in R as follows:
<- replicate(100000 , {
event <- sample(x = 0:999, size = 1, replace = TRUE)
my_bet <- sample(x = 0:999, size = 1, replace = TRUE)
lottery if(lottery == my_bet) 599 else -1
})
mean(event)
[1] -0.37
$1 Front Pair wins if you correctly guess the first two digits. It pays $60.
Quite similar to the previous part, the total amount of money you can win is $(60 - 1). The total amount of money you can lose is $1. The proability of winning is \frac{1}{n} \cdot \frac{1}{n} where n=10 and the probability of losing is the complement. Thus, the expected value of this bet is:
E[X] = (60 - 1) \cdot \frac{1}{100} - 1 \cdot \left(1 - \frac{1}{100}\right) = -0.4
We can also simulate this in R as follows:
<- replicate(100000 , {
event <- sample(x = 0:99, size = 1, replace = TRUE)
my_bet <- sample(x = 0:999, size = 1, replace = TRUE)
lottery if(lottery %/% 10 == my_bet) 59 else -1
})
mean(event)
[1] -0.3958
$1 Back Pair wins if you correctly guess the last two digits. It pays $60.
It’s basically the same. So, let’s just calculate the expected value:
E[X] = (60 - 1) \cdot \frac{1}{100} - 1 \left(1 - \frac{1}{100}\right) = -0.4
And just for the record, here is the simulation:
<- replicate(100000 , {
event <- sample(x = 0:99, size = 1, replace = TRUE)
my_bet <- sample(x = 0:999, size = 1, replace = TRUE)
lottery if(lottery %% 100 == my_bet) 59 else -1
})
mean(event)
[1] -0.4174
$6 6-Way Combo wins if the three digits are different and you guess all three in any order. It pays $600.
With a 3-digit number (All digits different) we have 6 (3!) ways to guess all three digits in any order. So it means that the probability of winning is \frac{6}{1000}. Thus the expected value is:
E[X] = (600 - 6) \cdot \frac{6}{1000} - 6 \cdot \left(1 - \frac{6}{1000}\right) = -2.4
We can simulate this in R as follow:
<- replicate(100000, {
event <- sample(x = 0:9, size = 3, replace = FALSE)
my_bet <- sample(x = 0:9, size = 3, replace = TRUE)
lottery if(all(my_bet %in% lottery)) 594 else -6
})
mean(event)
[1] -2.202
$3 3-Way Combo wins if two of the three digits are the same, and you guess all three in any order. It pays $600.
With a 3-digit number where 2 digits are the same we have 3 ways to guess all three digits in any order. So it means that the probability of winning is \frac{3}{1000}. Thus the expected value is:
E[X] = (600 - 3) \cdot \frac{3}{1000} - 3 \cdot \left(1 - \frac{3}{1000}\right) = -1.2
We can simulate this in R as follow:
<- replicate(100000, {
event <- sample(x = 0:9, size = 2, replace = FALSE)
my_bet <- c(my_bet, my_bet[1])
my_bet <- sample(x = 0:9, size = 3, replace = TRUE)
lottery if(identical(sort(my_bet), sort(lottery))) 597 else -3
})
mean(event)
[1] -1.248
$1 1-Off lets you win if some (*of your digits are off by 1 (9 and 0 are considered to be one off from each other). If you get the number exactly correct, you win $300. If you have one digit off by 1, you win $29. If you have two digits off by 1, you win $4, and if all three of your digits are off by 1 you win $9.
This case is a combination of the previous cases. So, let’s calculate the expected value:
E[X] = (300 - 1) \cdot \frac{1}{1000} + (29 - 1) \cdot \frac{6}{1000} +
(4 - 1) \cdot \frac{6}{1000} + (9 - 1) \cdot \frac{2}{1000} - 1 \cdot \left(1 - \frac{15}{1000}\right) = -0.484
We can simulate this in R as follow:
<- replicate(100000, {
event <- sample(x = 0:9, size = 3, replace = TRUE)
my_bet <- sample(x = 0:9, size = 3, replace = TRUE)
lottery
<- abs(my_bet - lottery)
diff
if(identical(my_bet, lottery)) 299
else if(sum(diff) == 1) 28
else if(all(diff <= 1) && sum(diff) == 2) 3
else if(all(diff <= 1) && sum(diff) == 3) 8
else -1
})
mean(event)
[1] -0.38679
Let k be a positive integer and let X be a random variable with pmf given by p(x) = 1/k for x=1,...,k and p(x) = 0 for all other values of x. Find E[X]
We can use the definition of the expected value:
E[X] = \sum_{x=1}^k x \cdot \frac{1}{k} = \frac{1}{k} \sum_{x=1}^k x
E[X] = \frac{1}{k} \cdot \frac{k(k+1)}{2} = \frac{k+1}{2}
Suppose you take a 20-question multiple choice test, where each question has four choices. You guess randomly on each question.
This is a Binomial Random Variable with n=20 and p=0.25. So, the expected value is:
E[X] = np = 20 \cdot 0.25 = 5
We can simulate this in R as follows:
<- replicate(100000, {
event <- sample(x = 0:3, size = 20, replace = TRUE)
my_answers <- sample(x = 0:3, size = 20, replace = TRUE)
correct_answers sum(my_answers == correct_answers)
})
mean(event)
[1] 4.99481
But also we can use the built in function rbinom
:
mean(rbinom(100000, size = 20, prob = 0.25))
[1] 4.99401
The probability of getting 10 or more questions correct is:
P(X \geq 10) = 1 - P(X < 10) = 1 - \sum_{x=0}^{9} \binom{20}{x} \cdot 0.25^x \cdot 0.75^{20-x}
The calcs are quite tedious for this one, so lets go stright to the code:
1 - sum(dbinom(0:9, size = 20, prob = 0.25))
[1] 0.01386442