What’s Warren Buffett’s $1 Billion Basketball Bet Worth?

A friend of mine just alerted me to a story on NPR describing a prize on offer from Warren Buffett and Quicken Loans. The prize is a billion dollars (1B USD) for correctly predicting all 63 games in the men’s Division I college basketball tournament this March. The facebook page announcing the contest puts the odds at 1:9,223,372,036,854,775,808, which they note “may vary depending upon the knowledge and skill of entrant”.

Being curious, I thought I’d see what the assumptions were that went into that number. It would make sense to start with the assumption that you don’t know a lick about college basketball and you just guess using a coin flip for every match-up. In this scenario you’re pretty bad, but you are no worse than random. If we take this assumption, we can calculate the odds as 1/(0.5)^63.  To get precision down to a whole integer I pulled out trusty bc for the heavy lifting:

$ echo "scale=50;  1/(0.5^63)" | bc
9223372036854775808.000000

Well, that was easy. So if you were to just guess randomly, your odds of winning the big prize would be those published on the contest page. We can easily calculate the expected value of entering the contest as P(win)*prize, or 9,223,372,036ths of a dollar (that’s 9 nano dollars, if you’re paying attention). You’ve literally already spent that (and then some) in opportunity cost sunk into the time you are spending thinking about this contest and reading this post (but read on, ’cause it’s fun!).

But of course, you’re cleverer than that. You know everything about college basketball – or more likely if you are reading this blog – you have a kickass predictive model that is going to up your game and get your hands into the pocket of the Oracle from Omaha.

What level of predictiveness would you need to make this bet worth while? Let’s have a look at the expected value as a function of our individual game probability of being correct.

buffet1

And if you think that you’re really good, we can look at the 0.75 to 0.85 range:

buffet2

So it’s starting to look enticing, you might even be willing to take off work for a while if you thought you could get your model up to a consistent 85% correct game predictions, giving you an expected return of ~$35,000. A recent paper found that even after observing the first 40 scoring events, the outcome of NBA games is only predictable at 80%. In order to be eligible to win, you’ve obviously got to submit your picks before the playoff games begin, but even at this herculean level of accuracy, the expected value of an entry in the contest plummets down to $785.

Those are the odds for an individual entrant, but what are the chances that Buffet and co will have to pay out? That, of course, depends on the number of entrants. Lets assume that the skill of all entrants is the same, though they all have unique models which make different predictions. In this case we can get the probability of at least one of them hitting it big. It will be the complement of no one winning. We already know the odds for a single entrant with a given level of accuracy, so we can just take the probability that each one doesn’t win, then take 1 minus that value.

buffet3

Just as we saw that the expected value is very sensitive to the predictive accuracy of the participant, so too is the probability that the prize will be awarded at all. If 1 million super talented sporting sages with 80%  game-level accuracy enter the contest, there will only be a slightly greater than 50% chance of anyone actually winning. If we substitute in a more reasonable (but let’s face it, still wildly high) figure for participants’ accuracy of 70%, the chance becomes only 1 in 5739  (0.017%) that the top prize will even be awarded even with a 1 million strong entrant pool.

tl;dr You’re not going to win, but you’re still going to play.

If you want to reproduce the numbers and plots in this post, check out this gist.

Simudidactic

auto·di·dact n.
A self-taught person.
From Greek autodidaktosself-taught : auto-auto- + didaktostaught;

+

sim·u·late v.
To create a representation or model of (a physical system or particular situation, for example).
From Latin simulre, simult-, from similislike;

=
(If you can get past the mixing of Latin and Greek roots)

sim·u·di·dactic adj.
To learn by creating a representation or model of a physical system or particular situation. Particularly, using in silico computation to understand complex systems and phenomena.

———————————————————————

This concept has been floating around in my head for a little while. I’ve written before on how I believe that simulation can be used to improve one’s understanding of just about anything, but have never had a nice shorthand for this process.

Simudidactic inquiry is the process of understanding aspects of the world by abstracting them into a computational model, then conducting experiments in this model world by changing the underlying properties and parameters. In this way, one can ask questions like:

  1. What type of observations might we make if x were true?
  2. If my model of the process is accurate, can I recapture the underlying parameters given the type of observations I can make in the real world? How often will I be wrong?
  3. Will I be able to distinguish between competing models given the observations I can make in the real world?

In addition to being able to ask these types of questions, the simudidact solidifies their understanding of the model by actually building it.

So go on, get simudidactic and learn via simulation!

simudidactic

Calculating AUC the hard way

The Area Under the Receiver Operator Curve is a commonly used metric of model performance in machine learning and many other binary classification/prediction problems. The idea is to generate a threshold independent measure of how well a model is able to distinguish between two possible outcomes. Threshold independent here just means that for any model which makes continuous predictions about binary outcomes, the conversion of the continuous predictions to binary requires making the choice of an arbitrary threshold above which will be a prediction of 1, below which will be 0.

AUC gets around this threshold problem by integrating across all possible thresholds. Typically, it is calculated by plotting the rate of false positives against false negatives across the range of possible thresholds (this is the Receiver Operator Curve)  and then integrating (calculating the area under the curve). The result is typically something like this:

auc

I’ve implemented this algorithm in an R script (https://gist.github.com/cjbayesian/6921118) which I use quite frequently. Whenever I am tasked with explaining the meaning of the AUC value however, I will usually just say that you want it to be 1 and that 0.5 is no better than random. This usually suffices, but if my interlocutor is of the particularly curious sort they will tend to want more. At which point I will offer the interpretation that the AUC gives you the probability that a randomly selected positive case (1) will be ranked higher in your predictions than a randomly selected negative case (0).

Which got me thinking – if this is true, why bother with all this false positive, false negative, ROC business in the first place? Why not just use Monte Carlo to estimate this probability directly?

So, of course, I did just that and by golly it works.

source("http://polaris.biol.mcgill.ca/AUC.R")
bs<-function(p)
{
 U<-runif(length(p),0,1)
 outcomes<-U<p
 return(outcomes)
}

# Simulate some binary outcomes #
n <- 100
x <- runif(n,-3,3)
p <- 1/(1+exp(-x))
y <- bs(p)

# Using my overly verbose code at https://gist.github.com/cjbayesian/6921118
AUC(d=y,pred=p,res=500,plot=TRUE)

## The hard way (but with fewer lines of code) ##
N <- 10000000
r_pos_p <- sample(p[y==1],N,replace=TRUE)
r_neg_p <- sample(p[y==0],N,replace=TRUE)

# Monte Carlo probability of a randomly drawn 1 having a higher score than
# a randomly drawn 0 (AUC by definition):

rAUC <- mean(r_pos_p > r_neg_p)
print(rAUC)

By randomly sampling positive and negative cases to see how often the positives have larger predicted probability than the negatives, the AUC can be calculated without the ROC or thresholds or anything. Now, before you object that this is necessarily an approximation, I’ll stop you right there – it is.  And it is more computationally expensive too. The real value for me in this method is for my understanding of the meaning of AUC. I hope that it has helped yours too!

Time-series forecasting: Bike Accidents

About a year ago I posted this video visualization of all the reported accidents involving bicycles in Montreal between 2006 and 2010. In the process I also calculated and plotted the accident rate using a monthly moving average. The results followed a pattern that was for the most part to be expected. The rate shoots up in the spring, and declines to only a handful during the winter months.

It’s now 2013 and unfortunately our data ends in 2010. However, the pattern does seem to be quite regular (that is, exhibits annual periodicity) so I decided to have a go at forecasting the time series for the missing years. I used a seasonal decomposition of time series by LOESS to accomplish this.

You can see the code on github but here are the results. First, I looked at the four components of the decomposition:

decomp_collisions

Indeed the seasonal component is quite regular and does contain the intriguing dip in the middle of the summer that I mentioned in the first post.

seasonal_collisions

 

This figure shows just the seasonal deviation from the average rates. The peaks seem to be early July and again in late September. Before doing any seasonal aggregation I thought that the mid-summer dip may correspond with the mid-August construction holiday, however it looks now like it is a broader summer-long reprieve. It could be a population wide vacation effect.

Finally, I used an exponential smoothing model to project the accident rates into the 2011-2013 seasons.

forecast_collisions

It would be great to get the data from these years to validate the forecast, but for now lets just hope that we’re not pushing up against those upper confidence bounds.

From Whale Calls to Dark Matter: Competitive Data Science with R and Python

Back in June I gave a fun talk at Montreal Python on some of my dabbling in the competitive data science scene. The good people at Savior-fair Linux recorded the talk and have edited it all together into a pretty slick video. If you can spare twenty-minutes or so, have a look.

If you want the slides, head on over to my speakerdeck page.

whaledarkmattercover

Uncertainty matters

In a post I wrote earlier this year, I noted a sentiment expressed in The Economist about understanding and embracing uncertainty.

…recent reforms to the IPCC’s procedures will do little to change its tendency to focus on the areas where there is greater consensus, avoiding the uncertainties which, though unpalatable for scientists, are important to policy. (link)

Which I felt was contrary to the way we, as scientists, speak among ourselves about policy makers. Specifically, that it is they who fear and misunderstand the implications of uncertainty.

This is the same perception which has led to the launch today by the group Sense About Science of a publication titled Making Sense of Uncertainty: Why uncertainty is part of science.

Launching a guide to Making Sense of Uncertainty at the World Conference of Science Journalists today, researchers working in some of the most significant, cutting edge fields say that if policy makers and the public are discouraged by the existence of uncertainty, we miss out on important discussions about the development of new drugs, taking action to mitigate the impact of natural hazards, how to respond to the changing climate and to pandemic threats.

Interrogated with the question ‘But are you certain?’, they say, they have ended up sounding defensive or as though their results are not meaningful. Instead we need to embrace uncertainty, especially when trying to understand more about complex systems, and ask about operational knowledge: ‘What do we need to know to make a decision? And do we know it?’

The report seems to be in line with arguments I have made about uncertainty and decision making as they pertain to ecological research, management, and policy.

Among the contributors to the report is someone who I consider to be among the best when it comes to understanding and communicating uncertainty, David Spiegelhalter. While I haven’t made my way all the way through it yet, it looks like this report will be an informative read for both scientists and policy makers (oh ya, and journalists — can’t forget about them).

Who knows, we might be able to stop the finger pointing and work together in mutual understanding of the importance of uncertainty.

What is probabilistic truth? Part 2 – Everything is conditional

Read Part 1

When making a statement of the form “1/2 is the correct probability that this coin will land tails”, there are a few things which are left unsaid, but which are typically implied.

The statement is one about the probability of an unknown event occurring, and it would seem reasonable to write this statement using probability notation as P(toss=tails) = 0.5. And indeed many people would express it this way. However, what is missing is the state of knowledge under which this statement has been made. For instance, is the coin yet to be flipped, or is it currently rolling in a circle on the table, leaning in toward its final resting position? Perhaps the flipping device can consistently throw a coin such that it rotates exactly 5 times in the air before landing flat on the table, or we know which side is up at the start of the flip. In these latter cases, the statement of probability would be made under considerably more knowledge than the first, and would not tend to be 0.5 in these cases. An observer placing a probability of P(toss=tails) = 0.99 at the moment when the coin is circling in on its resting position, leaning heavily toward a tails up configuration, could be said to have the correct probability also. For fairness, lets say that the first observer also makes her probability statement at the same moment, but from another room where she cannot see what has happened.

How can P(toss=tails) = 0.5, and P(toss=tails) = 0.99 be simultaneously correct?

The answer is conditioning. Each of the statements were made conditional on the observer’s state of knowledge. More completely, the two statements can be rewritten as:

P(toss=tails | knowledge of observer 1) = 0.5 , and

P(toss=tails | knowledge of observer 2) = 0.99

In practice, however, we often leave out the conditional part of the notation unless it is germane to the problem at hand. However, there is no such thing as unconditional probability. In fact, Harvard professor Joe Blitzstein calls conditioning the Soul of Statistics.

In the next post in this series, we’ll start looking at how to assess the correctness of a (conditional) probability statement after having observed an outcome.

Here's a bunch of random walks -- just 'cause its neat.

Here’s a bunch of random walks — just ’cause its neat.