What is probabilistic truth?

I am currently working on a validation metric for binary prediction models. That is, models which make predictions about outcomes that can take on either of two possible states (eg Dead/not dead, heads/tails, cat in picture/no cat in picture, etc.) The most commonly used metric for this class of models is AUC, which assesses the relative error rates (false positive, false negative) across the whole range of possible decision thresholds. The result is a curve that looks something like this:

auc

Where the area under the curve (the curve itself is the Receiver Operator Curve (ROC)) is some value between 0 and 1. The higher this value, the better your model is said to perform. The problem with this metric, as many authors have pointed out, is that a model can perform very well in terms of AUC, but be completely miscalibrated in terms of the actual probabilities placed on each outcome.

A model which distinguishes perfectly between positive and negative cases (AUC=1) by placing a probability of 0.01 on positive cases and 0.001 on negative cases may be very far off in terms of the actual probability of a positive case. For instance, positive cases may actually occur with probability 0.6 and negative cases with 0.2. In most real situations, our models will predict a whole range of different probabilities with a unique prediction for each data point, but the general idea remains. If your goal is simply to distinguish between cases, you may not care whether the probabilities are not correct. However, if your model is purporting to quantify risk then you very much want to know if you are placing the probabilistically true predictions on cases that are yet to be observed.

Which begs the question: What is probabilistic truth? 

This questions appears, at least at first, to be rather simple. A frequentist definition would say that the probability is correct, or true, if the predicted probability is equal to the long run outcomes.  Think of a dice rolled over and over counting the number of times a one is rolled. We would compare this frequency to our predicted probability of rolling a one (1/6 for a fair six-sided die) and would say that our predicted probability was true if this frequency matched 1/6.

But what about situations where we can’t re-run an experiment over and over again? How then would we evaluate the probabilistic truth of our predictions?

I’ll be working through this problem in a series of posts in the coming weeks. Stay tuned!

Open Data Exchange 2013, April 6. Montreal

UPDATE: The day was great! There are many people doing really amazing things with open data and it was amazing to meet them. Here are my slides from the panel talk.
Next Saturday, I’ll be sitting on a panel discussing future avenues for open data at ODX13.
From the odx13 site:

Odx13 is a mini-conference to discuss the successes and challenges of extracting value from Open Data for civic engagement, international aid transparency, scientific research, and more!

Program

Morning Session – Open Data Stories; Panel Discussions

9:00 AM    Introduction and Welcome

9:15 AM    Winning with Open Data – Panel 1

10:10 AM    Les données ouvertes en action - Panel 2 (en français)

  • Guillaume Ducharme, gestionnaire dans le réseau de la santé et membre du collectif Démocratie Ouverte
  • Sébastien Pierre, fondateur, FFunction & Montréal Ouvert
  • Josée Plamondon, co-conceptrice, ContratsNet
  • Jean-Noé Landry (l’animateur de discussion), fondateur, Montréal Ouvert et Québec Ouvert

11:05 AM    Future Avenues for Open Data – Panel 3

12:00 PM    Lunch will be provided

Afternoon Session – Digging into Data; Workshop and Lightning Talks

1:00 PM    Data Dive Intro – Exploratory Data Analysis with Trudat

1:30 PM    Data Dive

We will dive into interesting Open Data sets with experts on hand to guide us through the weeds, including data on

  • International Aid
  • Government contracts
  • Biodiversity
  • and more…

3:00 PM   Lightning Talks

4:00 PM    Present data insights

4:45 PM    Closing remarks

Montreal R User group meetup at Wajam

This Thursday (Jan 24th), 5:30pm, the good folks at Wajam are hosting a meetup of the Montreal R User Group.

The event will be at Bolidea at 4115 St Laurent, Montréal, QC. Be sure to RSVP.

From Benjamin Rollert:

This is an opportunity for people interested in R to hang out at our office, eat pizza and drink beer! We’ll also show some of the cool stuff we’ve done with R as part of live applications for our business intelligence.

Hope to see you there!

http://photos1.meetupstatic.com/photos/event/d/c/e/8/highres_101276552.jpeg

Simulating weak gravitational lensing

In the search for dark matter, I have been having mixed success. It seems that locating DM in single halo skies is a fairly straightforward problem. However, when there are more than one halo, things get quite a bit trickier.

As I have advocated many times before, including here and here, simulation can provide deep insights into many (if not all) problems. I never trust my own understanding of a complicated problem until I have simulated from a hypothesized model of that problem. Once I have a simulation in place, I can then test out all kinds of hypotheses about the system by manipulating the component parts. I think of this process as a kind of computer-assisted set of thought experiments.

So, when I was hitting a wall with the dark matter challenge, I of course turned to simulation for insights. Normally this would have been my very first step, however in this case my level of understanding of the physics involved was insufficient when I started out. After having done a bit of reading on the topic, I built a model which implements a weak lensing regime on an otherwise random background of galaxies. The model assumes an Einasto profile of dark matter mass density, with parameters A and α determining the strength of the tangential shearing caused by foreground dark matter.

A=0.2, alpha=0.5

I can then increase the strength of the lens by either increasing the mass of the dark matter, or by varying the parameters of the Einasto profile.

A=0.059, alpha=0.5

A=0.03, alpha=0.5

You can check out this visualization over a range of A values.

I can also see how two halos interact in terms of the induced tangential ellipticity profile by simulating two halos and then moving them closer to one another.

You can see the effect here. You get the idea – I can also try out any combination of configurations, shapes, and strengths of interacting halos. I can then analyse the characteristics of the resulting observable factors (in this case, galaxy location and ellipticities) in order to build better a predictive model.

Unfortunately, since this is a competition with cold hard cash on the line, I am not releasing the source for this simulation at this time. I will, however, open source the whole thing when the competition ends.