expLetter

BRML Problems

Bayesian Reasoning & Machine Learning

May 10, 2021

I’ve been picking up and working through some of the exercises in Bayesian Reasoning and Machine Learning — a book I’ve been finding extremely readable and enjoyable to work through.

The official solutions are restricted to instructors, which makes it a little hard to confirm whether I'm going down the right path. Searching Google for solutions turns up some hits, but I'm a little skeptical of the correctness of some of them. I'll try to capture some of my solutions in this post for the next person working through the book by themselves.

(The book is available online here.)

Exercise 1.5

Exercise 1.5 (Adapted from David S. understandinguncertainty.org). A secret government agency has developed a scanner which determines whether a person is a terrorist. The scanner is fairly reliable; 95% of all scanned terrorists are identified as terrorists, and 95% of all upstanding citizens are identified as such. An informant tells the agency that exactly one passenger of 100 aboard an aeroplane in which you are seated is a terrorist. The police haul off the plane the first person for which the scanner tests positive. What is the probability that this person is a terrorist?

This was a tricky one: there’s a description of a solution from the original source of the problem at http://understandinguncertainty.org/dishonesty, except that this variation has an additional twist — it’s not any person who triggers the scanner, but the first person to trigger the scanner. You can see a Twitter thread where we flail around wondering about a solution here and my final solution here.

A notebook plotting out the derived solution is here. The derivation is based on Bayes' theorem. For a general solution, define:

  • a = probability that a terrorist will set off the scanner (.95 in the problem)

  • b = probability that a citizen doesn’t set off the scanner (.95 again)

  • k = number of passengers in the plane

Then, express the problem in terms of x, the first person to set off the scanner, and marginalise over all possible positions of x from 1 to k in the scanning order (assuming the terrorist is equally likely to be at any position):

P(x is a Terrorist | x is the first person to set off the scanner) = P(x is the first person to set off the scanner | x is a terrorist) * P(x is a terrorist) / P(x is the first person to set off the scanner)

Here,

  • P(x is a terrorist), for a terrorist at position i, is 1/k; summed over all positions from 1 -> k this is 1. (Our prior is that exactly 1 of the k passengers is a terrorist.)

  • P(x is the first person to set off the scanner | x is a terrorist) * P(x is a terrorist), summed over all positions

    • = Sum i from 1 -> k over P(the terrorist is at position i) * P(no one from 1 -> i - 1 triggered the scanner | the terrorist is at position i) * P(the terrorist triggers the scanner)

    • = Sum i from 1 -> k over (1 / k) * b^(i - 1) * a

    • = (a / k) * (1 - b^k) / (1 - b) (using the sum of a geometric series)

  • P(x is the first person to set off the scanner) over all possible positions of x

    • = P(anyone sets off the scanner)

    • = 1 - P(no one sets off the scanner)

    • = 1 - (1 - a) * b^(k - 1)

Putting it all together, the answer I get is

P(x is a terrorist | x is the first person to set off the scanner)

  • = (a / k) * (1 - b^k) / (1 - b) / (1 - (1 - a) * b^(k - 1))

  • = 0.1889 plugging in a = .95, b = .95, k = 100
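
As a sanity check on the algebra, here is a minimal Python sketch (not the linked notebook). It assumes the terrorist is equally likely to be at any of the k positions in the scanning order, so the closed form carries a 1/k prior and a geometric sum over all k positions. The function names are my own, and the brute-force check enumerates every possible scanner outcome, so it is only practical for small k.

```python
from itertools import product


def p_first_alarm_is_terrorist(a, b, k):
    """Closed form for P(first alarm is the terrorist | someone sets off the scanner).

    a: P(scanner goes off | terrorist), b: P(scanner stays quiet | citizen), with b < 1.
    """
    # Sum over positions i of (1/k) * b**(i-1) * a, written as a geometric series.
    numerator = (a / k) * (1 - b**k) / (1 - b)
    # 1 - P(nobody sets off the scanner).
    denominator = 1 - (1 - a) * b ** (k - 1)
    return numerator / denominator


def p_by_enumeration(a, b, k):
    """Same quantity by enumerating every terrorist position and scanner outcome."""
    p_hit = 0.0  # probability that the first alarm is the terrorist
    p_any = 0.0  # probability that at least one alarm goes off
    for t in range(k):  # terrorist position, prior 1/k
        for alarms in product([False, True], repeat=k):
            p = 1.0 / k
            for i, alarm in enumerate(alarms):
                p_alarm = a if i == t else 1 - b
                p *= p_alarm if alarm else 1 - p_alarm
            if any(alarms):
                p_any += p
                if alarms.index(True) == t:
                    p_hit += p
    return p_hit / p_any


if __name__ == "__main__":
    print(round(p_first_alarm_is_terrorist(0.95, 0.95, 100), 4))  # 0.1889
    # The two approaches should agree for a small plane.
    print(p_first_alarm_is_terrorist(0.95, 0.95, 6), p_by_enumeration(0.95, 0.95, 6))
```

With k = 1 the closed form returns 1, as it should: the only passenger has to be the terrorist. The closed form divides by 1 - b, so a perfect b = 1 would need separate handling.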

I enjoyed exploring this problem a lot; some of my primary takeaways include:

  • Nothing quite beats a simulation to get answers that I can trust; my intuition is pretty faulty in general.

  • Even with a 95% chance of flagging a scanned terrorist, the first person hauled off is the terrorist less than 20% of the time. Improving the odds of catching the right person doesn't help much.

  • Increasing the number of people in the aeroplane dramatically reduces the odds of successfully catching a terrorist.

  • Reducing the chance that the scanner goes off for a normal person gives the most return on investment, simply because k >> 1. For large k, the derived solution is roughly a / (k * (1 - b)), i.e. inversely proportional to the probability of the scanner triggering for a normal person.

  • This reinforces the fact that, even with very high-quality detection, most detections of a very rare event are likely to be false positives. Reducing the odds of a false positive seems to have the greatest possible impact.
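
The takeaways above can be poked at with a short simulation. This is a sketch under my own assumptions rather than the code from the notebook: passengers are scanned in order, the terrorist sits at a uniformly random position, and flights where nobody sets off the scanner are discarded. The function names, trial count and seed are arbitrary choices for illustration.

```python
import random


def closed_form(a, b, k):
    """P(first alarm is the terrorist | someone sets off the scanner), with b < 1."""
    numerator = (a / k) * (1 - b**k) / (1 - b)
    denominator = 1 - (1 - a) * b ** (k - 1)
    return numerator / denominator


def simulate(a, b, k, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0    # flights where the first alarm was the terrorist
    alarms = 0  # flights where anyone set off the scanner
    for _ in range(trials):
        terrorist = rng.randrange(k)  # uniform position in the scanning order
        for i in range(k):
            p_alarm = a if i == terrorist else 1 - b
            if rng.random() < p_alarm:
                alarms += 1
                if i == terrorist:
                    hits += 1
                break  # the police stop at the first positive test
    return hits / alarms


if __name__ == "__main__":
    print("simulated  :", simulate(0.95, 0.95, 100))
    print("closed form:", closed_form(0.95, 0.95, 100))
    # Sensitivity: change one parameter at a time from the baseline.
    for label, (a, b, k) in {
        "baseline": (0.95, 0.95, 100),
        "better a": (0.99, 0.95, 100),
        "better b": (0.95, 0.99, 100),
        "bigger k": (0.95, 0.95, 200),
    }.items():
        print(f"{label}: a={a}, b={b}, k={k} -> {closed_form(a, b, k):.4f}")
```

Under these assumptions, raising a from .95 to .99 moves the answer by less than one percentage point, while raising b to .99 lifts it to roughly 0.6, and doubling k to 200 roughly halves it.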

© 2022 Kunal