# Monty Hall by simulation in R

(Almost) every introductory course in probability introduces conditional probability using the famous Monty Hall problem. In a nutshell, the problem is one of deciding on a best strategy in a simple game. In the game, the contestant is asked to select one of three doors. Behind one of the doors is a great prize (free attendance at an R workshop, let’s say), and there is a bum prize behind each of the other two doors. The bum prize is usually depicted as a goat, but I don’t think that would be such a bad prize, so let’s say that behind two of the doors is a bunch of poorly collected data for you to analyse. Once the contestant makes her first selection, the host opens one of the other two doors to reveal one of the bum prizes.

At this point, the contestant is given the choice to either stay with her original selection, or switch to the other remaining unopened door. What should she do?

If you’re reading this blog, you no doubt gleefully shouted “Switch! Switch! Daddy needs a ticket to that R workshop!”. And you also, no doubt, can prove this to be the best strategy using the logic of probability. You reason that the contestant’s chance of selecting the winning door from the outset was 1/3, giving her a 2/3 probability of being wrong. Once one door with a bum prize has been opened, the contestant is choosing between two doors. The host never opens the winning door, so the reveal does nothing to improve the odds of her first pick. Knowing that there was only a 1/3 chance that the original selection was correct, there is now a 2/3 chance that the alternate door is the winner.
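That argument can also be checked by brute-force enumeration rather than by simulation. The following is a minimal sketch, assuming the host always opens a bum-prize door other than the contestant's pick; the names `cases` and `switch_wins` are illustrative only.

```r
doors <- 1:3

# Every equally likely (prize, guess) pair: 3 x 3 = 9 cases.
cases <- expand.grid(prize = doors, guess = doors)

# Staying wins exactly when the first guess was already right.
p_stay <- mean(cases$prize == cases$guess)

switch_wins <- mapply(function(prize, guess) {
  # Doors the host may open: neither the guess nor the prize.
  openable <- setdiff(doors, c(prize, guess))
  # For each door the host could open, the switcher takes the one door left.
  final <- sapply(openable, function(reveal) setdiff(doors, c(guess, reveal)))
  # Within a case the switcher wins for every possible reveal or for none,
  # so this mean is either 0 or 1.
  mean(final == prize)
}, cases$prize, cases$guess)

p_switch <- mean(switch_wins)

cat("P(win | stay)   =", p_stay, "\nP(win | switch) =", p_switch, "\n")
```

Staying wins in 3 of the 9 cases and switching in the other 6, which is the 1/3 versus 2/3 split described above.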

As I have mentioned in previous posts, I have found that getting students to think through logical reasoning problems is much easier when you appeal to their desire to see the result in action. To that end, I whipped up this little script to simulate repeatedly playing the Monty Hall game.

```
#####################################################
# Simulation of the Monty Hall Problem
# Demonstrates that switching gives you better odds
# than staying with your initial guess
#
# Corey Chivers, 2012
#####################################################

rm(list=ls())
monty<-function(strat='stay',N=1000,print_games=TRUE)
{
doors<-1:3 #initialize the doors behind one of which is a good prize
win<-0 #to keep track of number of wins

for(i in 1:N)
{
prize<-floor(runif(1,1,4)) #randomize which door has the good prize
guess<-floor(runif(1,1,4)) #guess a door at random

## Reveal one of the doors you didn't pick which has a bum prize
if(prize!=guess)
reveal<-doors[-c(prize,guess)]
else
reveal<-sample(doors[-c(prize,guess)],1)

## Stay with your initial guess or switch
if(strat=='switch')
select<-doors[-c(reveal,guess)]
if(strat=='stay')
select<-guess
if(strat=='random')
select<-sample(doors[-reveal],1)

if(select==prize)
{
win<-win+1
outcome<-'Winner!'
}else
outcome<-'Loser!'

if(print_games)
cat(paste('Guess: ',guess,
'\nRevealed: ',reveal,
'\nSelection: ',select,
'\nPrize door: ',prize,
'\n',outcome,'\n\n',sep=''))
}
cat(paste('Using the ',strat,' strategy, your win percentage was ',win/N*100,'%\n',sep='')) #Print the win percentage of your strategy
}
```

Students can then use the function to simulate N games under the strategies ‘stay’, ‘switch’, and ‘random’. Invoke the function using:

```
monty(strat="stay")
monty(strat="switch")
monty(strat="random")

```

Hey, look at that – it is better to switch! Feel free to use this in your own class, and let me know if you use it or adapt it in an interesting way!

Edit: Thanks to Ken for pointing out that saying that ‘switching is always better’ is not quite right (you will lose 1/3 of the time, after all), but rather that switching is a rational best strategy, given your state of knowledge.
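For anyone who wants the three win rates side by side without the per-game printout, here is a vectorised sketch. It is not part of the original script: `monty_rates` is a hypothetical helper, and it relies on the shortcut that, because the host never opens the prize door, a switch wins exactly when the first guess was wrong. The ‘random’ strategy is modelled as a fair coin flip between staying and switching.

```r
# Hypothetical helper (not part of the original script): plays N games
# at once and returns the win rate of each strategy.
monty_rates <- function(N = 10000) {
  prize <- sample(1:3, N, replace = TRUE)  # door hiding the good prize
  guess <- sample(1:3, N, replace = TRUE)  # contestant's first pick

  # Staying wins when the first pick was right.
  stay_win <- guess == prize
  # The host never opens the prize door, so the door offered in a switch
  # holds the prize whenever the first pick was wrong.
  switch_win <- guess != prize
  # 'random': a fair coin decides between staying and switching.
  flip <- sample(c(TRUE, FALSE), N, replace = TRUE)
  random_win <- ifelse(flip, stay_win, switch_win)

  c(stay = mean(stay_win), switch = mean(switch_win), random = mean(random_win))
}

set.seed(2012)  # arbitrary seed, only for reproducibility
round(monty_rates(), 3)
```

The returned proportions should land near 1/3, 2/3 and 1/2, with the usual simulation noise.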

## 16 thoughts on “Monty Hall by simulation in R”

1. This problem always gave me heartburn until I finally sat down and thought about it from first principles:

1. The prize has an equal probability = 1/3 of being in A, B, or C
2. You pick A = 1/3; therefore, B+C = 2/3
3. Monty reveals that B is the gag prize; therefore B = 0 and since 0 + 2/3 = 2/3, C must have the 2/3 probability.
4. Contrary to the intuitively obvious conviction that since the prize is either behind A or C there must be a 50-50 probability between the two, a binomial outcome does NOT require even odds; otherwise your lottery ticket, which is either a winner or a loser, would produce losses, on average, half the time, and wins the other half. We know that lotteries don’t work this way.
5. Conclusion: a 2/3 chance to win is twice as good as a 1/3 chance and you should switch if you are motivated by increasing your chance of gain; on the other hand, if it would wound you deeply to have switched and lost (2/3 is good, but not a dead cert), you should stick with A because you will regret having held the winning ticket and gambled it away more than having had a chance to trade a losing ticket in on a winning one. (This is because, after the fact, you emotionally feel that you had something which has been taken away if you switch and are wrong; whereas if you don’t switch and lose, you don’t feel that you had it yet.)

2. I’ve always struggled with the problem, even after running someone else’s code. In these situations the best option, at least for me, is writing the simulation and running it (in my case using Python). Never underestimate the benefits of ‘doing’ while learning.

3. I have to admit that before running your script, despite knowing the correct solution, I never really believed it.
In other words, in a situation like this I would never have switched, for the following reason: if your initial guess is random, your chance of picking the correct door is 1/3. How could that 1/3 be changed by anything that happens afterwards? (The other door that is still closed also started with a probability of 1/3 and, I thought, should keep it.)

So I thought of this problem as more of a mathematical oddity with no real impact on the empirical world. After running your script this view has changed dramatically, and I would definitely switch now. Thanks a lot for showing once again the wonderful beauty of simulations.

4. Somebody posted an R simulation piece on rosettacode.org some time ago, in early 2010, I believe. I added some code to generalize it to N doors, while the game host (Monty) still reveals only one door. The goal is to show empirically that it always pays off to make a new choice afterwards among the closed doors other than the one initially chosen, regardless of the total number of doors > 2.

Here’s that piece of R simulation code (somewhat updated)

MH_sim <- function(doors=4, sim.trials=20000) {

# number of doors (default 4)
# number of simulation trials (default 20000)

if (doors<3) stop("this game needs 3 or more doors")

chooser <- function(x) { i 1) sample(i,1) else i }

p100 3) “(i.e. more than 3)”, ” ( #sim.trials =”, sim.trials, “)”,
“\nSimulation of % winning probability by first and second choices, like this:\n”,
“After Monty reveals one door, a new choice is made either among all closed\n”,
“doors or the closed doors excluding the first selected door”,
if (doors==3) “(here only one)”, “\n”);
print(c(…) * 100, digits=3) }

prize_door <- sample(1:doors, sim.trials, replace=TRUE)
first_choice <- sample(1:doors, sim.trials, replace=TRUE)

host_opens <- apply(cbind(prize_door, first_choice), 1, chooser)

new_choice_all <- apply(cbind(host_opens), 1, chooser)
new_choice_ex_1st <- apply(cbind(host_opens, first_choice), 1, chooser)

p100("By 1st choice standing" = (Pr.1st_win <- mean(first_choice == prize_door)),
"New choice all doors" = (Pr.2nd_all_win <- mean(new_choice_all == prize_door)),
"New choice ex 1st" = (Pr.2nd_ex_1st_win MH_sim()

Monty Hall game for 4 doors (i.e. more than 3) ( #sim.trials = 20000 )
Simulation of % winning probability by first and second choices, like this:
After Monty reveals one door, a new choice is made either among all closed
doors or the closed doors excluding the first selected door
By 1st choice standing New choice all doors New choice ex 1st Exclusion gain
24.8 33.4 37.6 12.6

5. Oops, that didn’t go down well with WP formatting. You’d better look up
the script piece on rosettacode.org/wiki/Monty_Hall_problem, at the end
of the R section.

6. I love the Monty Hall problem, but it took me *years* to wrap my head around it. It wasn’t until I thought of it from the perspective of Monty Hall that I got it.

I’ll have to try this out in R later today. Thanks for the post!

7. Tom2 says:

Forgive me for throwing out a question that has undoubtedly been answered a hundred times over, but wouldn’t the question be more compelling this way?

There are 5 doors with 4 goats and 1 prize. You pick 2, Monty keeps 3 and then reveals goats behind 2 of the 3 doors. You then have to choose between keeping your 2 doors or switching for his 1 door.

This would create the illusion that you have a 2/3 chance if you stay (rather than 50/50 in the original problem), where actually you have a 2/5 chance if you stay and a 3/5 chance if you switch.

8. Jim Monty says:

Imagine 100 doors. You pick one. Monty Hall then opens 98 doors to reveal booby prizes behind each of them. There are just two doors left. Monty Hall offers you the opportunity to switch to the other remaining door. What are the odds you picked the right door when you had 100 to choose from? Pretty slim. By eliminating 98 doors, Monty Hall has practically told you outright that the grand prize is behind the other remaining door. He’s given you vital new information you didn’t have before.

This is exactly what’s going on when there are only three doors.

9. Pingback: Simudidactic | bayesianbiologist

10. Nick says:

Seems like a mistake in the following – “Knowing that there was a 1/3 chance that original selection was wrong, there is now a 2/3 chance that the alternate door is the winner” – This should be – “Knowing that there was a 1/3 chance that original selection was right, there is now a 2/3 chance that the alternate door is the winner”

• Right you are! Good catch. Should be fixed now.

11. I love this problem, and the script works perfectly in R, but I have some trouble understanding the code itself. Could someone please explain why it uses so many functions, and what each of them is for? Maybe a link would help. I also tried the Help provided by R, but it was a bit too difficult to understand. Maybe someone here has a better explanation. I would be really thankful!!
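The door-count variants raised in comments 4 and 8 above can be tried out with one small function. This is only a sketch under stated assumptions, not the Rosetta Code script: `monty_general` is a hypothetical name, the host opens `k` bum-prize doors and never the contestant's door, and a switcher picks at random among the other doors that are still closed.

```r
# Hypothetical generalisation: n doors, the host opens k bum-prize doors
# (never the contestant's door), and a switcher picks at random among the
# other doors that are still closed.
monty_general <- function(n = 3, k = 1, N = 10000) {
  stopifnot(n >= 3, k >= 1, k <= n - 2)
  wins <- c(stay = 0, switch = 0)
  for (i in seq_len(N)) {
    prize <- sample.int(n, 1)
    guess <- sample.int(n, 1)
    # Doors the host may open: neither the guess nor the prize.
    openable <- setdiff(seq_len(n), c(prize, guess))
    # Index into 'openable' instead of calling sample(openable, k):
    # sample() treats a single number x as the range 1:x.
    opened <- openable[sample.int(length(openable), k)]
    # Doors a switcher can move to.
    closed <- setdiff(seq_len(n), c(guess, opened))
    new_pick <- closed[sample.int(length(closed), 1)]
    wins["stay"] <- wins["stay"] + (guess == prize)
    wins["switch"] <- wins["switch"] + (new_pick == prize)
  }
  wins / N
}

set.seed(1)  # arbitrary seed, only for reproducibility
monty_general(n = 4, k = 1)     # comment 4's setup
monty_general(n = 100, k = 98)  # comment 8's setup
```

Under these assumptions the theoretical win rates are 1/n for staying and (n-1)/n × 1/(n-1-k) for switching: 1/4 versus 3/8 with four doors and one reveal, and 1/100 versus 99/100 with 100 doors and 98 reveals.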