Introduction to probabilities, conditional probabilities, Bayes' theorem, by variablepitch345


Probabilities in IR
Introduction to probabilities
Conditional probabilities
Bayes’ theorem
Bayes’ theorem in IR

Playing “Hangman”: _ _ _ _ _ _ _
Which letters should you try first?

The TV Show / Olympic game
What is the chance of a winning hand?

Probability – the likelihood that something will happen
When all outcomes are equally likely: probability of a favorable outcome = number of favorable outcomes / number of possible outcomes

Range: [0, 1]

More examples
Deck of cards
P(spade) = 13/52 = ¼
P(king of spades) = 1/52

One fair coin
P(H) = P(T) = ½

One fair coin is thrown 3 times. What are the chances of throwing exactly 2 heads?
The sample space: Ω = { HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }
Favorable outcomes: A = { HHT, HTH, THH }
P(A) = |A| / |Ω| = 3/8
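
The counting above can be reproduced by listing the sample space explicitly. This is a minimal sketch, assuming equally likely outcomes; the names `omega` and `favorable` are illustrative choices, not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for three tosses of a fair coin: all 8 sequences of H/T.
omega = ["".join(seq) for seq in product("HT", repeat=3)]

# Favorable outcomes: sequences containing exactly two heads.
favorable = [o for o in omega if o.count("H") == 2]

# Classical probability: favorable outcomes over possible outcomes.
p_two_heads = Fraction(len(favorable), len(omega))

print(omega)
print(favorable)
print(p_two_heads)  # 3/8
```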

Conditional probability
Captures partial knowledge about the outcome of an experiment
Prior probability – before getting the additional knowledge
Posterior probability – after getting the additional knowledge

Ex: tossing a coin 3 times
Bet after seeing the first coin:
If it was H, the outcomes are ΩH = {HHH, HHT, HTH, HTT}, so P(2 heads | H) = 2/4 = ½
If it was T, the outcomes are ΩT = {THH, THT, TTH, TTT}, so P(2 heads | T) = ¼
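
Seeing the first coin amounts to shrinking the sample space and counting again. The sketch below illustrates that idea, assuming equally likely outcomes; `conditional_probability` and `two_heads` are hypothetical helper names introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

omega = ["".join(seq) for seq in product("HT", repeat=3)]

def conditional_probability(event, given, space):
    """P(event | given), counted inside the restricted sample space.

    Assumes equally likely outcomes and at least one outcome satisfying `given`.
    """
    restricted = [o for o in space if given(o)]
    hits = [o for o in restricted if event(o)]
    return Fraction(len(hits), len(restricted))

def two_heads(outcome):
    return outcome.count("H") == 2

p_given_h = conditional_probability(two_heads, lambda o: o[0] == "H", omega)
p_given_t = conditional_probability(two_heads, lambda o: o[0] == "T", omega)

print(p_given_h, p_given_t)  # 1/2 1/4
```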

Conditional probability
P(A ∩ B) = P(B) * P(A | B)

P(A | B) = P(A ∩ B) / P(B)
Ex: Anselm sees that there is someone in room 303. Should he check if Gheorghe is in to say hello?
He’s sleepy and can’t remember the day of the week.
He knows the schedule for 303.

He notices Cinthia entering the room. Do the chances of finding Gheorghe increase?
He knows that, apart from Gheorghe’s class, Cinthia takes another class that meets in 301.
Mon Tue Wed Thu Fri Gheorghe Cinthia * * * *

Bayes’ theorem
P(A|B) = P(A ∩ B) / P(B)
P(B|A) = P(B ∩ A) / P(A)
but P(A ∩ B) = P(B ∩ A), so

P(A|B) P(B) = P(B|A) P(A)

P(A|B) = P(B|A) P(A) / P(B)
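
As a sanity check, the theorem can be tried on the three-coin experiment, with A = "the first toss is heads" and B = "exactly two heads". This choice of events and the helper `prob` are assumptions made for illustration; the sketch only compares the Bayes route with direct counting.

```python
from fractions import Fraction
from itertools import product

omega = ["".join(seq) for seq in product("HT", repeat=3)]

def prob(event, space=omega):
    """Probability of an event over equally likely outcomes."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

def a(outcome):  # A: the first toss is heads
    return outcome[0] == "H"

def b(outcome):  # B: exactly two heads in three tosses
    return outcome.count("H") == 2

p_a = prob(a)                                   # 1/2
p_b = prob(b)                                   # 3/8
p_a_and_b = prob(lambda o: a(o) and b(o))       # 1/4
p_b_given_a = p_a_and_b / p_a                   # 1/2

# P(A|B) through Bayes' theorem, and directly from the definition.
via_bayes = p_b_given_a * p_a / p_b
direct = p_a_and_b / p_b

print(via_bayes, direct)  # 2/3 2/3
```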


Predicting whether a patient is ill based on the result of a test
The test (possible outcomes: +, –) is not perfect:
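It returns a correct positive in 98% of the cases where the patient is ill: P(+ | ill) = 0.98
It returns a correct negative in 97% of the cases where the patient is healthy: P(– | healthy) = 0.97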

Prior knowledge: 0.8% incidence of the illness, so P(ill) = 0.008
Question: given a positive result for a patient, what are the chances that he/she is ill?
Or: what should the diagnosis be?

Example - solution
P(ill | +) = P(+ | ill) P(ill) / P(+)
= 0.98 * 0.008 / P(+) = 0.00784 / P(+)

P(healthy | +) = P(+ | healthy) P(healthy) / P(+)
= 0.03 * 0.992 / P(+) = 0.02976 / P(+)
where P(+ | healthy) = 1 – 0.97 = 0.03 and P(healthy) = 1 – 0.008 = 0.992

P(healthy | +) > P(ill | +), by a factor of about 3.8 (0.02976 / 0.00784)
Interpretation: even after a positive test, the patient is more likely to be healthy than ill.

Note. P(+) is the same in both expressions, so it is not necessary to estimate it to draw a conclusion: comparing the two numerators is enough.
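
If the actual posterior values are wanted, P(+) can be recovered by summing the two numerators, since a patient is either ill or healthy. The sketch below does this with the figures of the example; the variable names are illustrative.

```python
# Figures from the example: incidence, correct-positive and correct-negative rates.
p_ill = 0.008
p_pos_given_ill = 0.98
p_neg_given_healthy = 0.97

p_healthy = 1 - p_ill
p_pos_given_healthy = 1 - p_neg_given_healthy

# Numerators of Bayes' theorem for the two hypotheses.
num_ill = p_pos_given_ill * p_ill                # 0.00784
num_healthy = p_pos_given_healthy * p_healthy    # 0.02976

# P(+) by the law of total probability: the patient is either ill or healthy.
p_pos = num_ill + num_healthy

p_ill_given_pos = num_ill / p_pos
p_healthy_given_pos = num_healthy / p_pos

print(round(p_ill_given_pos, 3), round(p_healthy_given_pos, 3))  # 0.209 0.791
```

So a positive result raises the chance of illness from 0.8% to roughly 21%, which is still well below one half.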

Bayes’ theorem in IR
Usual use of the theorem:
P(Model | Event) = P(Event | Model) * P(Model) / P(Event)

P(RelDoc | queryTerm) = P(queryTerm | RelDoc) * P(RelDoc) / P(queryTerm)
P(queryTerm | RelDoc) and P(queryTerm) can be estimated from the statistics of the collection.
P(RelDoc) may be estimated by sampling, or it can be viewed as a constant that can be ignored when ranking a list of documents.
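
To make the estimation concrete, here is a minimal sketch on a made-up collection. Everything in it is hypothetical: the six documents, their terms, the relevance judgments and the function name `p_rel_given_term`. It also assumes relevance labels are available for every document, which in practice they are not (hence the sampling mentioned above).

```python
from fractions import Fraction

# Hypothetical toy collection: (terms in the document, judged relevant?)
docs = [
    ({"bayes", "theorem", "proof"}, True),
    ({"bayes", "retrieval"}, True),
    ({"coin", "toss"}, False),
    ({"bayes", "biography"}, False),
    ({"card", "deck"}, False),
    ({"retrieval", "index"}, True),
]

def p_rel_given_term(term, docs):
    """Estimate P(RelDoc | queryTerm) with Bayes' theorem from document counts.

    Assumes the term occurs in at least one document and that at least one
    document is relevant; otherwise a division by zero occurs.
    """
    n = len(docs)
    relevant = [terms for terms, is_rel in docs if is_rel]
    p_rel = Fraction(len(relevant), n)
    p_term = Fraction(sum(1 for terms, _ in docs if term in terms), n)
    p_term_given_rel = Fraction(
        sum(1 for terms in relevant if term in terms), len(relevant)
    )
    return p_term_given_rel * p_rel / p_term

print(p_rel_given_term("bayes", docs))  # 2/3
```

With these made-up counts, "bayes" occurs in three documents, two of which are relevant, so the estimate agrees with direct counting.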

Extra credit
I’m playing a game: there are 3 doors and I have to guess behind which one a treasure hides.
The rule: I point to a door as my first tentative choice. The host of the game then “helps” me by opening one of the other doors and showing me that there’s nothing behind it. I am allowed to stick with my initial choice or change my mind.
Question: Should I change my mind?
(i.e. Would my chances of success increase if I changed my mind?)
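
One way to check an answer is to enumerate every case exactly. The sketch below does so under stated assumptions: the treasure and the first choice are each uniformly random, the host knows where the treasure is and always opens an empty door other than the chosen one, and when two such doors exist the host picks either with equal chance. The function name `win_probability` is illustrative.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)

def win_probability(switch):
    """Exact chance of finding the treasure for the 'stay' or 'switch' strategy."""
    total = Fraction(0)
    cases = list(product(DOORS, DOORS))  # (treasure, first pick), equally likely
    for treasure, pick in cases:
        # Doors the host may open: not the player's pick and not the treasure.
        openable = [d for d in DOORS if d != pick and d != treasure]
        for opened in openable:
            # Assumption: the host chooses uniformly among the openable doors.
            weight = Fraction(1, len(cases) * len(openable))
            if switch:
                # Move to the only door that is neither picked nor opened.
                final = next(d for d in DOORS if d != pick and d != opened)
            else:
                final = pick
            if final == treasure:
                total += weight
    return total

print("stay:  ", win_probability(switch=False))
print("switch:", win_probability(switch=True))
```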
