Uncertainty in AI

Outline
Introduction
Basic Probability Theory
Probabilistic Reasoning
Why should we use probability theory?
  Dutch Book Theorem

Sources of Uncertainty
Information is partial.
Information is not fully reliable.
The representation language is inherently imprecise.
Information comes from multiple sources, and the sources conflict.
Information is approximate.
Non-absolute cause-effect relationships exist.

Basic Probability
Probability theory enables us to make rational decisions.
Which mode of transportation is safer: car or plane?
What is the probability of an accident?

Basic Probability Theory
An experiment has a set of potential outcomes, e.g., throwing a die.
The sample space of an experiment is the set of all possible outcomes, e.g., {1, 2, 3, 4, 5, 6}.
An event is a subset of the sample space, e.g.:
  {2}
  {3, 6}
  even = {2, 4, 6}
  odd = {1, 3, 5}

Probability as Relative Frequency
An event has a probability. Consider a long sequence of experiments. If we count the number of times a particular event occurs in that sequence and compare it to the total number of experiments, we obtain a ratio. This ratio is one way of estimating the probability of the event.
P(E) = (# of times E occurred) / (total # of trials)

Example
100 attempts are made to swim a length in 30 seconds. The swimmer succeeds on 20 occasions, so the estimated probability that the swimmer can complete the length in 30 seconds is 20/100 = 0.2. The probability of failure is 1 - 0.2 = 0.8.
The experiments, the sample space, and the events must be defined clearly for a probability to be meaningful.
What is the probability of an accident?

Theoretical Probability
Principle of Indifference: alternatives are always to be judged equiprobable if we have no reason to expect or prefer one over the other.
Each outcome in the sample space is assigned equal probability.
Example: throwing a die
P({1}) = P({2}) = ... = P({6}) = 1/6

Law of Large Numbers
As the number of experiments increases, the relative frequency of an event more closely approximates the theoretical probability of the event, provided the theoretical assumptions hold.

Buffon's Needle for Computing π
Draw parallel lines 1 inch apart on a plane.
Throw a 1-inch needle on the plane.
P(needle crossing a line) = 2/π
Hence π ≈ 2 x (number of throws) / (number of crossings)

Large Number Reveals Untruth in Assumptions
Results of 1,000,000 throws of a die:

Number     1     2     3     4     5     6
Fraction   .155  .159  .164  .169  .174  .179

Axioms of Probability Theory
Suppose P(.) is a probability function. Then
1. for any event E, 0 ≤ P(E) ≤ 1;
2. P(S) = 1, where S is the sample space;
3. for any two mutually exclusive events E1 and E2, P(E1 ∪ E2) = P(E1) + P(E2).
Any function that satisfies the above three axioms is a probability function.

Joint Probability
Let A, B be two events. The joint probability of both A and B being true is denoted by P(A, B).
Example:
P(spade) is the probability of the top card being a spade.
P(king) is the probability of the top card being a king.
P(spade, king) is the probability of the top card being both a spade and a king, i.e., the king of spades.
Is P(king, spade) = P(spade, king)?

Properties of Probability
1. P(¬E) = 1 - P(E).
2. If E1 and E2 are logically equivalent, then P(E1) = P(E2).
   E1: Not all philosophers are more than six feet tall.
   E2: Some philosopher is not more than six feet tall.
   Then P(E1) = P(E2).
3. P(E1, E2) ≤ P(E1).

Conditional Probability
The probability of an event may change once we know that another event has occurred.
The probability of A given B is denoted by P(A|B).
Example
P(W = space): the probability that a randomly selected word from an English text is 'space'.
P(W = space | W' = outer): the probability of 'space' if the previous word is 'outer'.
Example
A: the top card of a deck of poker cards is the king of spades.
P(A) = 1/52
However, if we know
B: the top card is a king,
then the probability of A given that B is true is P(A|B) = 1/4.

How to Compute P(A|B)?
P(A|B) = N(A and B) / N(B) = [N(A and B) / N] / [N(B) / N] = P(A, B) / P(B)
where N is the total number of outcomes.
P(brown|cow) = N(brown-cows) / N(cows) = P(brown-cow) / P(cow)

Business Students
Of 100 students completing a course, 20 were business majors. Ten students received As in the course, and three of these were business majors. Suppose A is the event that a randomly selected student got an A in the course, and B is the event that a randomly selected student is a business major.
What is the probability of A?
What is the probability of A after knowing B is true?

          got an A   total
B         3          20
not B     7          80

From the counts, P(A) = 10/100 = 0.1 and P(A|B) = N(A and B) / N(B) = 3/20 = 0.15.

Probabilistic Reasoning
Evidence: what we know about a situation.
Hypothesis: what we want to conclude.
Compute P(Hypothesis | Evidence).

Credit Card Authorization
E is the data about the applicant's age, job, education, income, credit history, etc.
H is the hypothesis that the credit card will provide a positive return.
The decision of whether to issue the credit card to the applicant is based on the probability P(H|E).

Medical Diagnosis
E is a set of symptoms, such as coughing, sneezing, headache, ...
H is a disorder, e.g., common cold, SARS, flu.
The diagnosis problem is to find an H (disorder) such that P(H|E) is maximum.

Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations.
Please rank the following statements by their probability, using 1 for the most probable and 8 for the least probable.
a. Linda is a teacher in elementary school.
b. Linda works in a bookstore and takes yoga classes.
c. Linda is active in the feminist movement.
d. Linda is a psychiatric social worker.
e. Linda is a member of the League of Women Voters.
f. Linda is a bank teller.
g. Linda is an insurance salesperson.
h. Linda is a bank teller and is active in the feminist movement.
Note that, since P(E1, E2) ≤ P(E1), statement h can never be more probable than statement f.

Example
A patient takes a lab test and the result comes back positive.
The test has a false negative rate of 2% and a false positive rate of 3%. Furthermore, 0.8% of the entire population have this cancer.
What is the probability of cancer if we know the test result is positive?

Bayes Theorem
If P(E2) > 0, then
P(E1|E2) = P(E2|E1) P(E1) / P(E2)
This can be derived from the definition of conditional probability.

The Three-Card Problem
Three cards are in a hat. One is red on both sides (the red-red card). One is white on both sides (the white-white card). One is red on one side and white on the other (the red-white card). A single card is drawn randomly and tossed into the air.
a. What is the probability that the red-red card was drawn? (RR)
b. What is the probability that the drawn card lands with a white side up? (W-up)
c. What is the probability that the red-red card was not drawn, assuming that the drawn card lands with a red side up? (not-RR|R-up)

Fair Bets
A bet is fair to an individual I if, according to the individual's probability assessment, the bet will break even in the long run.
The following three bets are fair:
Bet (a): Win $4.20 if RR; lose $2.10 otherwise. [since you believe P(RR) = 1/3]
Bet (b): Win $2.00 if W-up; lose $2.00 otherwise. [since you believe P(W-up) = 1/2]
Bet (c): Win $4.00 if R-up and not-RR; lose $4.00 if R-up and RR; neither win nor lose if not-R-up. [since you believe P(not-RR|R-up) = 1/2]

Dutch Book
The bets that you accepted have an interesting property: no matter what card is drawn in the three-card problem, and no matter how it lands, you are guaranteed to lose money. This is called a Dutch Book.
The loss arises because the three beliefs are mutually inconsistent: given P(RR) = 1/3 and P(W-up) = 1/2, the probability axioms imply P(not-RR|R-up) = 1/3, not 1/2.

Verification
There are three possible outcomes:
1. Some card other than red-red is drawn, and it lands with a white side up. That is, W-up and not-RR.
2. Some card other than red-red is drawn, and it lands with a red side up. That is, R-up and not-RR.
3. The red-red card is drawn, and it lands (of course) with a red side up. That is, R-up and RR.

Outcome   Bet (a)   Bet (b)   Bet (c)   Total
1         -$2.10    +$2.00    ±$0.00    -$0.10
2         -$2.10    -$2.00    +$4.00    -$0.10
3         +$4.20    -$2.00    -$4.00    -$1.80

The Dutch Book Theorem
Suppose that an individual I is willing to accept any bet that is fair for I. Then a Dutch book can be made against I if and only if I's assessment of probability violates the Bayesian axiomatization.

Independence: Intuition
Events are independent if one has nothing whatever to do with the others. Therefore, for two independent events, knowing that one has happened does not change the probability of the other happening.
One toss of a coin is independent of another toss (assuming it is a regular coin).
The price of tea in England is independent of the result of a general election in Canada.

Independent or Dependent?
Catching a cold and having a cat allergy.
Miles per gallon and acceleration.
The size of a person's vocabulary and the person's shoe size.

Independence: Definition
Events A and B are independent iff
P(A, B) = P(A) x P(B)
which is equivalent to P(A|B) = P(A) and P(B|A) = P(B) when P(A, B) > 0.
T1: the first toss is a head.
T2: the second toss is a tail.
P(T2|T1) = P(T2)

Conditional Independence
Dependent events can become independent given certain other events.
Example: shoe size and vocabulary size are dependent, but both are explained by age; given age, they become independent.
Two events A, B are conditionally independent given a third event C iff
P(A|B, C) = P(A|C)

Conditional Independence: Definition
Let E1 and E2 be two events. They are conditionally independent given E iff
P(E1|E, E2) = P(E1|E),
that is, given that E is true, the probability of E1 is not changed by knowing E2.
Equivalent formulations:
P(E1, E2|E) = P(E1|E) P(E2|E)
P(E2|E, E1) = P(E2|E)

Example: Play Tennis?
Outlook    Temperature   Humidity   Windy   Class
sunny      hot           high       false   -
sunny      hot           high       true    -
overcast   hot           high       false   +
rain       mild          high       false   +
rain       cool          normal     false   +
rain       cool          normal     true    -
overcast   cool          normal     true    +
sunny      mild          high       false   -
sunny      cool          normal     false   +
rain       mild          normal     false   +
sunny      mild          normal     true    +
overcast   mild          high       true    +
overcast   hot           normal     false   +
rain       mild          high       true    -

Predict playing tennis when <sunny, cool, high, strong> (here 'strong' wind corresponds to Windy = true).
What probability should be used to make the prediction?
How to compute the probability?

Probabilities of Individual Attributes
Given the training set, we can compute the probabilities P(value | class):

Attribute     Value      +      -
Outlook       sunny      2/9    3/5
              overcast   4/9    0
              rain       3/9    2/5
Temperature   hot        2/9    2/5
              mild       4/9    2/5
              cool       3/9    1/5
Humidity      high       3/9    4/5
              normal     6/9    1/5
Windy         true       3/9    3/5
              false      6/9    2/5

P(+) = 9/14
P(-) = 5/14

Naïve Bayes Method
Knowledge base contains:
  a set of hypotheses
  a set of evidences
  the probability of an evidence given a hypothesis
Given:
  a subset of the evidences known to be present in a situation
Find:
  the hypothesis with the highest posterior probability P(H|E1, E2, ..., Ek).
  The probability itself does not matter so much.

Naïve Bayes Method: Assumptions
Hypotheses are exhaustive and mutually exclusive:
  H1 v H2 v ... v Hn
  ¬(Hi ^ Hj) for any i ≠ j
Evidences are conditionally independent given a hypothesis:
  P(E1, E2, ..., Ek|H) = P(E1|H) ... P(Ek|H)
By the definition of conditional probability:
  P(H|E1, E2, ..., Ek) = P(E1, E2, ..., Ek, H) / P(E1, E2, ..., Ek)
                       = P(E1, E2, ..., Ek|H) P(H) / P(E1, E2, ..., Ek)

Naïve Bayes Method
The goal is to find the H that maximizes P(H|E1, E2, ..., Ek).
Since
  P(H|E1, E2, ..., Ek) = P(E1, E2, ..., Ek|H) P(H) / P(E1, E2, ..., Ek)
and P(E1, E2, ..., Ek) is the same for different hypotheses, maximizing P(H|E1, E2, ..., Ek) is equivalent to maximizing
  P(E1, E2, ..., Ek|H) P(H) = P(E1|H) ... P(Ek|H) P(H)

Naïve Bayes Method
Find a hypothesis that maximizes P(E1|H) ... P(Ek|H) P(H).

Example: Play Tennis
P(+|sunny, cool, high, strong) vs. P(-|sunny, cool, high, strong)
P(sunny|+) P(cool|+) P(high|+) P(strong|+) P(+) vs. P(sunny|-) P(cool|-) P(high|-) P(strong|-) P(-)
Reading the factors from the table of individual attribute probabilities:
(2/9)(3/9)(3/9)(3/9)(9/14) ≈ 0.0053 vs. (3/5)(1/5)(4/5)(3/5)(5/14) ≈ 0.0206
so the predicted class is -.

Application: Spam Detection
Spam:
  "Dear sir, We want to transfer to overseas ($ 126,000.000.00 USD) One hundred and Twenty six million United States Dollars) from a Bank in Africa, I want to ask you to quietly look for a reliable and honest person who will be capable and fit to provide either an existing ……"
Ham: legitimate e-mail (named so for lack of a better name).

Hypotheses: {Spam, Ham}
Evidence: a document
The document is treated as a set (or bag) of words.
Knowledge:
  P(Spam): the prior probability of an e-mail message being spam. How to estimate this probability?
  P(w|Spam): the probability that a word is w if we know it is chosen from a spam message. How to estimate this probability?

Limitations of Naïve Bayes
Cannot handle composite hypotheses well.
Suppose H1, ..., Hn are independent of each other.
Consider a composite hypothesis H1 ^ H2.
How to compute the posterior probability P(H1 ^ H2|E1, ..., El)?
Using Bayes' Theorem:
  P(H1 ^ H2|E1, ..., El) = P(E1, ..., El|H1 ^ H2) P(H1 ^ H2) / P(E1, ..., El)
  P(E1, ..., El|H1 ^ H2) = P(E1|H1 ^ H2) ... P(El|H1 ^ H2), assuming the Ej are independent given H1 ^ H2
  P(H1 ^ H2) = P(H1) P(H2), because they are independent
How to compute P(Ej|H1 ^ H2)?

Assuming H1, ..., Hn are independent given E1, ..., El?
  P(H1 ^ H2|E1, ..., El) = P(H1|E1, ..., El) P(H2|E1, ..., El)
But this is a very unreasonable assumption.
  E: earthquake
  A: alarm set off
  B: burglar
E and B are independent. But when A is given, they are (adversely) dependent because they become competitors to explain A:
  P(B|A, E) << P(B|A)
E explains away A.
Need a better representation and a better assumption.

Cannot handle causal chaining.
Example:
A: weather of the year
B: cotton production of the year
C: cotton price of next year
Observed: A influences C.
The influence is not direct (A -> B -> C).
P(C|B, A) = P(C|B): instantiation of B blocks the influence of A on C.

Summary
Basics of Probability Theory
  Experiment, sample space, events
  Axioms and properties
  Joint Probability
  Conditional Probability
Probabilistic Reasoning
  Bayes Theorem
  Dutch Book Theorem
  Independence and Conditional Independence
  Naïve Bayes Method
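
Worked example: the lab-test question
The lab-test question posed before the Bayes Theorem statement (false negative rate 2%, false positive rate 3%, 0.8% of the population affected) can be worked through numerically. The following is a minimal Python sketch and not part of the original slides; the function name `posterior` and its parameter names are illustrative assumptions. It reads "false negative rate" as P(negative | cancer) and "false positive rate" as P(positive | no cancer).

```python
def posterior(prior, false_negative_rate, false_positive_rate):
    """Return P(cancer | positive test) using Bayes' theorem.

    Assumed reading of the rates (not spelled out in the slides):
      false_negative_rate = P(negative | cancer)
      false_positive_rate = P(positive | no cancer)
    """
    # P(positive | cancer) is the complement of the false negative rate.
    p_pos_given_cancer = 1.0 - false_negative_rate
    # Total probability of a positive result:
    # P(pos) = P(pos | cancer) P(cancer) + P(pos | no cancer) P(no cancer)
    p_pos = p_pos_given_cancer * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(cancer | pos) = P(pos | cancer) P(cancer) / P(pos)
    return p_pos_given_cancer * prior / p_pos


p = posterior(prior=0.008, false_negative_rate=0.02, false_positive_rate=0.03)
print(f"P(cancer | positive) = {p:.3f}")
```

Under these assumptions the posterior is roughly 0.21. Even after a positive result cancer remains unlikely, because positives from the much larger healthy population (0.03 x 0.992) outnumber those from the affected population (0.98 x 0.008).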

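Worked example: Naïve Bayes on the Play Tennis data
The Play Tennis prediction can be reproduced by counting directly from the 14 training rows. This is a minimal sketch under stated assumptions, not the author's implementation: the names `DATA`, `QUERY`, `train`, `score`, and `predict` are hypothetical, 'strong' wind is taken to mean Windy = true, and no smoothing is applied, so a value never seen with a class gets probability 0 (as with overcast under the - class).

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Training rows: (outlook, temperature, humidity, windy, class)
DATA = [
    ("sunny", "hot", "high", False, "-"),
    ("sunny", "hot", "high", True, "-"),
    ("overcast", "hot", "high", False, "+"),
    ("rain", "mild", "high", False, "+"),
    ("rain", "cool", "normal", False, "+"),
    ("rain", "cool", "normal", True, "-"),
    ("overcast", "cool", "normal", True, "+"),
    ("sunny", "mild", "high", False, "-"),
    ("sunny", "cool", "normal", False, "+"),
    ("rain", "mild", "normal", False, "+"),
    ("sunny", "mild", "normal", True, "+"),
    ("overcast", "mild", "high", True, "+"),
    ("overcast", "hot", "normal", False, "+"),
    ("rain", "mild", "high", True, "-"),
]


def train(rows):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    # (class label, attribute index) -> Counter of attribute values
    feature_counts = defaultdict(Counter)
    for *features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts


def score(features, label, class_counts, feature_counts):
    """Return P(E1|H) ... P(Ek|H) P(H) as an exact fraction."""
    total = sum(class_counts.values())
    result = Fraction(class_counts[label], total)  # prior P(H)
    for i, value in enumerate(features):
        # Relative-frequency estimate of P(Ei | H); 0 if the value was never seen.
        result *= Fraction(feature_counts[(label, i)][value], class_counts[label])
    return result


def predict(features, class_counts, feature_counts):
    """Return the class with the highest unnormalized posterior score."""
    return max(
        class_counts,
        key=lambda c: score(features, c, class_counts, feature_counts),
    )


class_counts, feature_counts = train(DATA)
QUERY = ("sunny", "cool", "high", True)  # 'strong' wind read as windy = True
plus = score(QUERY, "+", class_counts, feature_counts)
minus = score(QUERY, "-", class_counts, feature_counts)
print("score(+) =", float(plus))
print("score(-) =", float(minus))
print("prediction:", predict(QUERY, class_counts, feature_counts))
```

The scores are not normalized, which matches the remark in the slides that only the ranking of hypotheses matters: the denominator P(E1, ..., Ek) is the same for both classes. Exact fractions are used so the factors can be compared with the table of attribute probabilities.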