# How likely is the NSA PRISM program to catch a terrorist?

Recent revelations about PRISM, the NSA’s massive program of surveillance of civilian communications, have caused quite a stir. And rightfully so, as it appears that the agency has been granted warrantless direct access to just about any form of digital communication engaged in by American citizens, and that their access to such data has been growing significantly over the past few years.

Some may argue that there is a necessary trade-off between civil liberties and public safety, and that others should just quit their whining. Let’s take a look at this proposition (not the whining part). Specifically, let’s ask: how much benefit, in terms of thwarted would-be attacks, does this level of surveillance confer?

Let’s start by recognizing that terrorism is extremely rare. So the probability that an individual under surveillance (and now everyone is under surveillance) is a terrorist is also extremely low. Let’s also assume that the neck-beards at the NSA are fairly clever, if exceptionally creepy. We assume that they have devised an algorithm that can detect ‘terrorist communications’ (as opposed to, for instance, pizza orders) with 99% accuracy. That is, it flags a genuine bad guy 99 times out of 100:

P(+ |  bad guy) = 0.99

A job well done, and Murica lives to fight another day. Well, not quite. What we really want to know is: what is the probability that they’ve found a bad guy, given that they’ve gotten a hit on their screen? Or,

P(bad guy | +) = ?

Which is quite a different question altogether. To figure this out, we need a bit more information. Recall that bad guys (specifically terrorists) are extremely rare, say on the order of one in a million (this is a wild overestimate, with the true rate being much lower, of course – but let’s not let that stop us). So,

P(bad guy) = 1/1,000,000

Further, let’s say that the spooks have a pretty good algorithm that only comes up falsely positive (i.e. when the person under surveillance is a good guy) one in one hundred times.

P(+ |  good guy) = 0.01

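And now we have all that we need. Apply a little special Bayes sauce:

P(bad guy | +) = P(+ | bad guy) P(bad guy) / [ P(+ | bad guy) P(bad guy) + P(+ | good guy) P(good guy) ]

where P(good guy) = 1 - P(bad guy),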

and we get:

P(bad guy | +) = 1/10,102

That is, for every positive (the NSA calls these ‘reports’) there is only a 1 in 10,102 chance (using our rough assumptions) that they’ve found a real bad guy.
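For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same Bayes calculation. The function name `posterior` and its parameter names are made up for illustration; the three input numbers are the rough assumptions used above, not measured values.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(bad guy | +) from Bayes' theorem.

    prior               -- P(bad guy), assumed 1 in a million here
    sensitivity         -- P(+ | bad guy), assumed 0.99 here
    false_positive_rate -- P(+ | good guy), assumed 0.01 here
    """
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


p = posterior(prior=1e-6, sensitivity=0.99, false_positive_rate=0.01)
odds_against = 1.0 / p
print(f"P(bad guy | +) = {p:.3e}, i.e. about 1 in {odds_against:,.0f}")
```

Under these assumed inputs the script reproduces the 1 in 10,102 figure. Changing any of the three arguments shows how strongly the answer depends on them.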

UPDATE: While former NSA analyst turned whistleblower William Binney thinks this is a plausible estimate, the point here is not that this is the ‘correct probability’ involved (remember that we based our calculations on very rough assumptions). The takeaway message is simply that whenever the rate of an event of interest is extremely low, even a very accurate test will fail very often.

UPDATE 2: The Wall Street Journal’s Numbers Guy has written a piece on this in which several statisticians and security experts respond.

UPDATE 3: If you can read German, a reader wrote in to point out that Der Spiegel’s technology section picked up the story.

Big brother is always watching, but he’s still got a needle in a haystack problem.

The television series doesn’t have this problem. On the show, they’re all bad guys.

## 57 thoughts on “How likely is the NSA PRISM program to catch a terrorist?”

1. I’m confused about how it works… but its cost (\$20m) doesn’t suggest it is a system for surveilling all calls and “detecting” terrorists in real time. It sounds more like a system for monitoring already identified suspects and friends (plus friends of friends???) via direct access but bypassing any judicial oversight.

2. Andrew says:

I find this whole article immensely wrong. Every probability you’ve plucked out of thin air could easily be at least ten times larger or smaller so even conservatively your eventual P(bad guy | +) could be anywhere from 1/10 to 1/10^7. The problem is you simply don’t know any of your inputs; hand-waving some numbers, screaming ‘MATHS’ and then announcing an answer correct to 5 significant figures is just ludicrous.

• Tom says:

Exactly. Without expressing an opinion on the program that this blog comments on, I find this type of piece an embarrassing misuse of statistical inference. Step 1: Make up data. Step 2: Analyze data using some statistical technique. Step 3: Draw conclusions from analysis with no measure of error. Science owes the public better than this.

• I was doing the calculation using slightly different inputs, and as you said the results vary wildly. As you so rightly point out, the above calculation is more than anything an example of using math to falsely demonstrate an accuracy where none exists. Where are the error bars?
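A rough way to see the spread being discussed is to rerun the Bayes calculation while scaling the two shakiest inputs up and down by a factor of ten. The sketch below does that in Python. The grid of values is an assumption chosen purely for illustration (ten times smaller and ten times larger than the article's figures), and the sensitivity is held at the article's 0.99.

```python
from itertools import product


def posterior(prior, sensitivity, false_positive_rate):
    """P(bad guy | +) from Bayes' theorem."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Illustrative grid only: the article's values, scaled by 1/10, 1 and 10.
priors = [1e-7, 1e-6, 1e-5]
false_positive_rates = [0.001, 0.01, 0.1]
sensitivity = 0.99  # held fixed at the article's assumption

results = {
    (prior, fpr): posterior(prior, sensitivity, fpr)
    for prior, fpr in product(priors, false_positive_rates)
}

# Print from least to most favourable combination.
for (prior, fpr), p in sorted(results.items(), key=lambda item: item[1]):
    print(f"P(bad guy)={prior:.0e}  P(+|good guy)={fpr:<5}  1 in {1 / p:,.0f}")

lowest, highest = min(results.values()), max(results.values())
```

On this grid the answer runs from roughly 1 in a million to roughly 1 in a hundred, about four orders of magnitude, which is one way of drawing the missing error bars.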

• Clayton says:

Let me make a different calculation. Bayes gives:

P(bad guy | +) = P(+ | bad guy) P(bad guy) / P(+)

But P(+) = P(+ | bad guy) P(bad guy) + P(- | good guy) P(good guy)

So you end up with:

P(bad guy | +) = 1 / (1 + P(- | good guy)/P(+ | bad guy) P(good guy) / P(bad guy))

If P(bad guy) is small, as we expect it to be, then P(bad guy | + ) can be very well approximated by:

P(bad guy | + ) = P(bad guy) / R (+ corrections of the order of P(bad guy) squared)

where R = P(+ | bad guy)/P(- | good guy)

If the method is good enough, it will have an R ~ 1, that is, it will be more or less as good at flagging bad guys as positives as it is at flagging good guys as negatives. This means that the probability of a positive result turning up to be a bad guy is of the same order of magnitude of the probability of pointing at random and finding a bad guy.

• Andrew says:

Clayton: there are several problems with what you have written.

Firstly, your formula for P(+) is incorrect; it should be:

P(+) = P(+ | bad guy) P(bad guy) + P(+ | good guy) P(good guy)

Secondly (correcting for your plus/minus mistake), your formula for R is inverted; it should be:

R = P(+ | good guy)/P(+ | bad guy)

I won’t write out the algebra here but it should be obvious that your R cannot be correct – substituting your R into P(bad guy | + ) would give:

P(bad guy | + ) = P(bad guy) / (P(+ | bad guy)/P(- | good guy)) = P(bad guy) * P(- | good guy) / P(+ | bad guy)

i.e. that P(bad guy | +) is proportional to P(- | good guy) which can’t be correct as it would mean that reducing the rate of false positives would *reduce* the effectiveness of the program.

So actually what we have is:

P(bad guy | + ) = P(bad guy) * P(+ | bad guy) / P(+ | good guy)

If we make the assumption that P(+ | bad guy) is high (I’m not saying it is, but that’s what the original article assumes) then we can estimate this as:

P(bad guy | + ) = P(bad guy) / P(+ | good guy)

i.e. the more you can reduce your false positives the higher P(bad guy | +) will be.

In combination then, your statement that “the probability of a positive result turning up to be a bad guy is of the same order of magnitude of the probability of pointing at random and finding a bad guy” is demonstrably false – it is hugely dependent upon your rate of false positive and as such can be orders of magnitude different from finding a bad guy at random.
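As a quick numerical check of the small-prior approximation P(bad guy | +) ≈ P(bad guy) * P(+ | bad guy) / P(+ | good guy), the following Python sketch compares it with the full Bayes formula. The function names are invented for this example, and the inputs are the article's assumed values with the false positive rate varied by a factor of ten either way.

```python
def exact(prior, sensitivity, false_positive_rate):
    """Full Bayes formula for P(bad guy | +)."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


def approx(prior, sensitivity, false_positive_rate):
    """Small-prior approximation: prior * sensitivity / false positive rate."""
    return prior * sensitivity / false_positive_rate


prior, sensitivity = 1e-6, 0.99  # the article's assumed values
rows = []
for fpr in (0.1, 0.01, 0.001):
    e = exact(prior, sensitivity, fpr)
    a = approx(prior, sensitivity, fpr)
    rows.append((fpr, e, a))
    print(f"P(+|good guy)={fpr:<5}  exact={e:.4e}  approx={a:.4e}  "
          f"gain over random={e / prior:,.0f}x")
```

With these inputs the approximation stays within a fraction of a percent of the exact value, and each tenfold cut in the false positive rate raises P(bad guy | +) almost tenfold, so the result is far from the "pointing at random" rate whenever false positives are rare.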

• Linda M says:

You’re missing the point. Replace the chosen probability with any other one and see how the false positive rate changes… it doesn’t, not meaningfully. Any test will be bound by the low number of ‘bad guys’ and the very high population numbers being monitored.

You need tests with astonishing accuracy—exactly what you don’t have, when you’re still investigating—in order to make total surveillance worth it.

3. Are you sure that’s how it is being used? I doubt the data they have could be used for predictive mining of the sort you describe.

It sounds more like they are just searching for connections to already identified suspects, locations, times. Graph analysis – friends, friends of friends, cliques etc.. A big big data problem but not insoluble.

4. Nick says:

Hi, there! Suppose P(bad guy) = 1/1,000 as we can see in Iraq and Afghanistan? Not bad but very upset with blind killing drones in the blue sky. In this case P(bad guy | +) = 0.09!

5. Neat, but this does somewhat assume that the NSA has nothing _but_ mobile phone data to work with. Let’s pretend that they have other data sources for a moment; and that they’re quite good statisticians into the bargain.

6. donald says:

I was reading an article about the Cleveland Browns on draft day by Chuck Klosterman and this quote came up: “I’ve never witnessed this level of institutional paranoia within a universe so devoid of actual secrets. I don’t even know what they don’t want me to know.” It has sounded like just about every time I’ve heard a good assessment of what the NSA and CIA know, especially about more socially diffuse events like terrorism.

7. donald says:

One other thing: if people are interested in a deeper history of all of this, read up on ECHELON, the father to a lot of this.

8. — That is, for every positive (the NSA calls these ‘reports’) there is only a 1 in 10,102 chance (using our rough assumptions) that they’ve found a real bad guy.

Chill out folks. The point of the exercise is to show, using Bayes, how expensive it will be to be “secure”. Until NSA (or any of the spook agencies) release the actual numbers, it is all guesswork. OTOH, we can “estimate” the effort needed to achieve each of our required levels of “security”. And, concurrently, how willing we are to permit collateral damage. Kind of like Salem in the late 17th century. Can never be too careful when it comes to witches.

9. Doug says:

All of you “mathematicians” complaining of the accuracy of these calculations are totally missing the point and allowing yourselves to be distracted from the real issue. This article simply points out that the very concept of this program is faulty and an inexcusable intrusion of privacy and a laugh in the face of our constitution. This is not about math but about the preservation of the very things our country was founded on. Stop criticizing his calculations and focus your ire on those responsible for this crime against all Americans.

• Exactly! The probability of success has nothing to do with anything other than effectiveness at some cost. To collect information on the BS Terrorism “Threat” is just an excuse by the control freaks that run government to creep further into your privacy, bit by bit. To hear them talk about it, there is a terrorist around every corner. Control Freaks are paranoid that they will lose control of anything down to their tooth brush. They worry about controlling their toilet paper as they wipe! That is why they want to control everyone else. A bunch of fraidy cats who think they are important. That’s a good definition of a politician.

• Andrew says:

Doug – as one of the “mathematicians” you reference I must apologise; you are of course right that questioning whether statistics have been applied correctly on a blog about statistics was wrong of me, and that rather than trying to gain an understanding of whether the PRISM program could be effective I should instead just be uncritically accepting what the media and the internet tells me.

The problem with uncritically accepting the above calculation is that it will be quoted and repeated around the media; do a Google search for “prism 10102” and there are already a few references to it:

“A biologist who specializes in statistics has calculated that when the NSA’s alleged broad dragnet of Facebook, Google and other sites turns up a potential terrorist, there’s only a 1 in 10,102 chance that he or she is an actual terrorist.”

That particular article concedes that the result is based on rough assumptions, but how long will it be until this is requoted enough that it just becomes a “fact” that the media will parrot without context. The media is bad enough already, we shouldn’t be encouraging them by giving them the opportunity to say “a scientist has calculated there’s only a one in ten thousand chance of PRISM working” when in fact that number is pulled out of thin air.

I believe that PRISM *is* a bad thing, that it’s probably illegal and that it will almost certainly be ineffective. None of that means that I shouldn’t take the time to assess and understand whether what I am being told is accurate or not.

• Citzen says:

Well said. Do you think however despite the number in the article being wrong, the associated probability is almost certain to be lower then 1 in 10102?

• Citzen says:

*than

• Andrew says:

Citizen: ‘Do you think however despite the number in the article being wrong, the associated probability is almost certain to be lower then 1 in 10102?’

It’s difficult to say. We can remove some of the uncertainty though by effectively taking PRISM out of the equation. Let’s suppose that PRISM is completely useless and just flags everyone with a 50% probability. i.e. we have P(+ | bad guy) = 0.5 and P(+ | good guy) = 0.5. So the Bayes equation above collapses to P(bad guy | +) = P(bad guy) so it would just come down to how many ‘bad guys’ there are out there.

So how many bad guys are there? A little Googling gives some numbers: Figure 7 of http://tcths.sanford.duke.edu/documents/Kurzman_Muslim-American_Terrorism_final2013.pdf suggests that the number of *Muslim* terrorism suspects and perpetrators is on the order of 1 in 100,000. Now, of all the ‘terrorists’ out there in the US, what proportion are Muslims? There is a table here http://www.fbi.gov/stats-services/publications/terrorism-2002-2005/terror02_05#terror_05sum that summarises terrorist attacks from 1980-2005; the site that I found it through calculates that (unsurprisingly) less than 10% of the attacks were by Muslims. So our number of potential terrorist suspects might be on the order of ten times bigger, i.e. 1 in 10,000.

Even if the NSA didn’t have PRISM then, and just picked 10,000 names out of the phone book we’d expect 1 of them to be of interest. So even without PRISM the probability is slightly *higher* than 1 in 10,102.

This obviously all depends on how accurate the data in those articles is, but given that it is derived from government sources it at least suggests that the *government* might put a lower bound on PRISM’s effectiveness at 1 in 10,000 (remember that my analysis assumes that PRISM does nothing useful, if it did do something useful the probability should increase).

I’m not saying that means *we* should have faith in it (the government’s idea of a suspect might be very different to ours), but if the government were trying to estimate the effectiveness of the system it might be a number they would come up with.
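The "useless test" argument is easy to confirm numerically. The Python sketch below plugs P(+ | bad guy) = P(+ | good guy) = 0.5 into the Bayes formula, using the 1 in 10,000 base rate estimated above; that base rate is the comment's rough estimate, not an established figure, and the function name is invented for the example.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(bad guy | +) from Bayes' theorem."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


base_rate = 1 / 10_000  # rough estimate from the sources cited above

# A test that flags everyone with probability 0.5 carries no information,
# so the posterior should simply equal the base rate.
useless = posterior(base_rate, sensitivity=0.5, false_positive_rate=0.5)
print(f"useless test: 1 in {1 / useless:,.0f}")

# The original article's headline figure, for comparison.
article = posterior(1e-6, sensitivity=0.99, false_positive_rate=0.01)
print(f"article:      1 in {1 / article:,.0f}")
```

With these inputs the coin-flip test gives 1 in 10,000, marginally better than the article's 1 in 10,102, which is the comparison being made here.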

10. Richard Johnson says:

If DHS were to round up people at random and interrogate them at length, and assuming that about one in ten thousand people is a terrorist, and assuming they were then 100% correct in identifying and ferreting out terrorists using these techniques, then how many random people would need to be scooped up to be 95% certain they caught at least one terrorist?

Just curious. I’m not a statistician, but I’ll bet it’s a lot more than ten thousand people.

The real issue is not that big brother watches, but that big brother also has long arms.

• Andrew says:

Your question is equivalent to asking 'at what point would the probability of all the people questioned being non-terrorists fall below 5%'. So we want the smallest value of n such that (1-1/10000)^n < 0.05. Taking logs of both sides gives:

n * log(0.9999) < log(0.05)

The smallest (integral) n which satisfies this is n = 29,956.

If the frequency of terrorists is 1 in x, where x is large, then it will generally be the case that to be 95% confident of finding a terrorist you'd need to question approximately 3 * x people.
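The same calculation can be sketched in a few lines of Python. The helper name `people_needed` is hypothetical, and the sketch assumes, as the question does, that every person questioned is an independent random draw and that identification is perfect.

```python
import math


def people_needed(rate, confidence=0.95):
    """Smallest n such that 1 - (1 - rate)**n >= confidence.

    rate       -- assumed fraction of terrorists in the population
    confidence -- desired probability of catching at least one
    """
    # log1p(-rate) is log(1 - rate), computed accurately for tiny rates.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-rate))


n = people_needed(1 / 10_000)
print(f"1 in 10,000 -> question {n:,} people")

# The 'roughly 3 * x' rule of thumb comes from -log(0.05) being close to 3.
for x in (1_000, 100_000, 1_000_000):
    print(f"1 in {x:,} -> {people_needed(1 / x) / x:.3f} * x people")
```

Because both logarithms are negative, dividing by log(1 - rate) flips the inequality, which is why the answer is rounded up rather than down.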

11. Eric Eldred says:

For any “probe” (as they call it) of the big data, there are, as you say, false positives. But also false negatives. In order to determine the real numbers of positives and negatives, both true and false, it is necessary to determine, independently somehow, the actual number of persons with the condition in the population, and the total population under study. Under Bayes’ rule you make a guess, but you can refine that guess depending on what happens. Then you study the test under real conditions to determine just how effective it is, so the number of false positives and negatives is acceptable, and the cost is competitive. None of this has been revealed by the NSA, the White House, Congress, or anybody.

This can only be a guess, since the number of potential terrorists is completely uncertain (terrorist as labeled by whom?), and the idea is to prevent behavior rather than to pin a crime on somebody after the fact. Once a suicide bomber acts, it doesn’t seem that this sort of test would be significant. Since the terrorists are constantly changing tactics in response to the counter-terrorists, analysis of previous big data is inappropriate or of little use for discovering innovative patterns.

Any such test would have to be biased by the tendency by users (or the ones selling the test to the government snoopers) to inflate the risk and minimize the side effects of the test. The result would be to inflate the results of just a few snooping acts caught as actual terrorism and ignore all the false positives and false negatives. As Bill Keller wrote in The New York Times, Snowden did not reveal any actual abuse. (Keller ignored the chilling effect to the First Amendment as well as the Fourth.)

In that sense it is very like the risk of greenhouse warming, or screening for breast or prostate cancer: the numbers are constantly misused and lead to tragic results.

If averting terrorist acts by monitoring all communications were evaluated by Cass Sunstein’s cost-effectiveness metrics, as he used them while at the White House, it would not pass. The NSA budget is about \$10 billion. Monitoring did not prevent the three most recent deaths in Boston. What is the argument that conventional police work cannot do the job in the most cost-effective way? Why can’t the communications companies store data and reveal it only under a particular warrant, not a general one?

Recently some TSA agents in Boston revealed that the expensive profiling operation to search potential terrorists on suspicious behavior grounds was a complete failure, and that their bosses had instructed them to go after illegal immigrants or drug smugglers instead, as then they could show results. One can expect mission-creep here in monitoring communications, as for example Eliot Spitzer’s money transactions, that never resulted in any charges of illegality.

Once you start storing all communication and then digging through it for dirt you will find dirt. Whether this is worth it or not should depend on rational decisions based on cost-effectiveness. Promoting fear and denying transparency of data can prevent not only terrorism but a responsible, rational democracy.

12. 16941694 says: