Computer Science, CS36110: Intelligent Learning

Time allowed: 2 hours

Calculators are permitted, provided they are silent, self-powered, without

communications facilities, and incapable of holding text or other material that

could be used to give the candidate an unfair advantage. They must be made

available on request for inspection by invigilators, who are authorised to remove

any suspect calculators.

Answer THREE from FOUR questions.

All questions carry equal marks.

1. This question is about the introduction of the module and concept learning.
   a) Do you think that it is possible for a machine to learn? Justify your
      answer.                                                               [8 marks]
   b) State the three elements for the definition of a learning problem and give
      an example of a learning problem to instantiate these elements.             [6]
   c) Suppose you are a doctor treating a patient who occasionally suffers from
      an allergic reaction. Your intuition tells you that the allergy is a direct
      consequence of certain foods the patient eats, places where the patient
      eats, time of day, day of week and the amount spent on food. So you
      gathered some data and the data is summarized in the table below.
       Number     Restaurant    Meal        Day         Cost        Reaction
       1          Sam's         breakfast   Friday      cheap       yes
       2          Hilton        lunch       Friday      expensive   no
       3          Sam's         lunch       Saturday    cheap       yes
       4          Denny's       breakfast   Sunday      cheap       no
       5          Sam's         breakfast   Sunday      expensive   no
      Use the notation [?, ?, ?, ?] for the most general model, where the "?"
      indicates a "don't care" condition (i.e., any attribute value in the
      corresponding position is allowed).
      (i) What is the hypothesis that fits this data using the Find-S algorithm
           described in the lectures? Show a trace of the steps that led you to the
           result. Make sure that all the steps are explained.                   [8]
      (ii) Use the learnt hypothesis to predict the reaction of the new instance:
           [Sam's, lunch, Friday, cheap]. Explain how you reach that conclusion.
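
For reference, the trace asked for in part c) can be sketched in Python. This is a minimal sketch of the textbook Find-S procedure, not necessarily the exact variant given in the lectures. The names `find_s`, `matches` and `NO_VALUE` are illustrative assumptions, and the "no value yet" marker is represented by `None`.

```python
# Find-S sketch for the allergy table (assumed textbook variant).
# NO_VALUE marks an attribute that has not been constrained by any
# positive example yet; "?" is the "don't care" condition.
NO_VALUE = None

# (Restaurant, Meal, Day, Cost) -> Reaction, copied from the table.
DATA = [
    (("Sam's", "breakfast", "Friday", "cheap"), True),
    (("Hilton", "lunch", "Friday", "expensive"), False),
    (("Sam's", "lunch", "Saturday", "cheap"), True),
    (("Denny's", "breakfast", "Sunday", "cheap"), False),
    (("Sam's", "breakfast", "Sunday", "expensive"), False),
]


def find_s(examples):
    """Return the final hypothesis and the hypothesis after each example."""
    hypothesis = [NO_VALUE] * len(examples[0][0])
    trace = [tuple(hypothesis)]
    for instance, is_positive in examples:
        if is_positive:  # negative examples are ignored by Find-S
            for i, value in enumerate(instance):
                if hypothesis[i] is NO_VALUE:
                    hypothesis[i] = value      # first positive example
                elif hypothesis[i] != value:
                    hypothesis[i] = "?"        # minimal generalisation
        trace.append(tuple(hypothesis))
    return tuple(hypothesis), trace


def matches(hypothesis, instance):
    """True if every constraint is '?' or equals the instance value."""
    return all(h == "?" or h == v for h, v in zip(hypothesis, instance))


hypothesis, trace = find_s(DATA)
for step in trace:
    print(step)
print(matches(hypothesis, ("Sam's", "lunch", "Friday", "cheap")))
```

Printing the trace shows how only the two positive examples (1 and 3) change the hypothesis; the final line applies the learnt hypothesis to the new instance from part (ii).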
   d) Explain why it is necessary to introduce inductive bias for learning.      [8]

2. This question is about genetic algorithms and reinforcement learning.

   a) This question is about the simulation of artificial aquatic life introduced in
      (i) What is flocking behavior?                                             [4]
      (ii) The flocking behavior is governed by three rules. Explain what these
            rules are.                                                           [9]
      (iii) List 6 parameters that should be considered for the simulation of the
            articial aquatic life. Explain what these parameters describe.      [6]
      (iv) How do you encode the parameters given in part 2. a) (iii) above?
            What are the corresponding mutation operators for the
            implementation of a genetic algorithm?                               [8]
   b) Compare and discuss the main differences between genetic algorithms and
      reinforcement learning.                                                    [6]

3. a) Attributes can have binary values. Name three other types of value.        [3]
   b) What is class noise?                                                       [2]
   c) (i) You have been given a dataset of 10,000 training examples which is
           guaranteed to have no noise in it. You are asked to learn to predict
           this data, and your prediction method will be tested on a dataset of
           10,000 test examples. How does the lack of noise affect your approach
           to the learning problem?                                              [5]
      (ii) What do you think the accuracy of your predictions in the test
           examples will depend on?                                              [6]
   d) (i) You are given a dataset of 10,000,000 training examples that are
           described using 10 binary attributes and the class of each example is
           either positive or negative. The probability of each attribute value is
           close to 0.5. It is suggested by your boss that you should use 10-fold
           cross validation to estimate prediction error. Comment.               [7]
      (ii) On the first of the 10 cross-validation runs you get a sample error
           of 0.1 on the training data. What are the 95% confidence limits on the
           test data?                                                            [4]
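
The arithmetic behind part d) (ii) can be sketched with the usual normal approximation for a sample error, error ± z * sqrt(error * (1 - error) / n) with z = 1.96 for 95%. Two assumptions are made here that the question does not state: n is taken to be the size of one held-out fold (1,000,000 of the 10,000,000 examples), and the sample error is treated as if it had been measured on examples independent of the learnt hypothesis, which an error measured on training data strictly is not. The function name `confidence_interval` is illustrative.

```python
from math import sqrt


def confidence_interval(sample_error, n, z=1.96):
    """Normal-approximation interval for a sample error measured on n examples."""
    half_width = z * sqrt(sample_error * (1.0 - sample_error) / n)
    return sample_error - half_width, sample_error + half_width


# Assumption: n is one fold of a 10-fold split of 10,000,000 examples.
low, high = confidence_interval(0.1, 1_000_000)
print(low, high)
```

With n this large the interval is very narrow, which is relevant to the comment asked for in part (i).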
   e) Consider a problem of flower identification. Example flowers are described
      using 4 attributes: petal colour (white, red), sepal colour (green, blue,
      red), no of petals (4,5,6,7), no of sepals (4,5,6). Hypotheses are
      represented by conjunctions of constraints on the attributes. An attribute
      constraint may be "?" (any value), an attribute value (e.g.
      no of petals=4), or "@" (no value).
      (i) What is the most general hypothesis?                                   [3]
      (ii) How large is the hypothesis space?                                    [3]
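
The counting in part e) can be checked with a short sketch. It assumes the usual convention for conjunctive hypotheses: each attribute constraint is one of its values, "?" or "@", and every hypothesis containing at least one "@" classifies all instances as negative, so those collapse into a single semantically distinct hypothesis. The variable names are illustrative.

```python
from math import prod

# Number of values per attribute, as listed in the question.
VALUES = {
    "petal colour": 2,   # white, red
    "sepal colour": 3,   # green, blue, red
    "no of petals": 4,   # 4, 5, 6, 7
    "no of sepals": 3,   # 4, 5, 6
}

# Distinct instances: one value per attribute.
instances = prod(VALUES.values())

# Syntactically distinct hypotheses: each value, plus "?" and "@".
syntactic = prod(n + 2 for n in VALUES.values())

# Semantically distinct hypotheses: each value or "?", plus the single
# hypothesis that rejects everything (any hypothesis containing "@").
semantic = 1 + prod(n + 1 for n in VALUES.values())

print(instances, syntactic, semantic)
```

Whether the syntactic or the semantic count is expected depends on how the hypothesis space was defined in the lectures.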

4. a) Consider the following case: there are 2 classes with probabilities
                       (P(class=c1) = 0.4, P(class=c2) = 0.6),
      a colour attribute with probabilities
          (P(colour=red|class=c1) = 0.1, P(colour=red|class=c2) = 0.2),
      and a size attribute with probabilities
           (P(size=large|class=c1) = 0.5, P(size=large|class=c2) = 0.2).
      You are presented with an example where colour=red and size=large.
      (i) Is class c1 or c2 more probable for the classifier?                     [4]
      (ii) What class would be more probable if you did not have the a priori
            class probabilities?                                                   [2]
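
The comparison in part a) can be sketched as follows, assuming the classifier is naive Bayes, so that colour and size are treated as conditionally independent given the class. Each class is scored by prior times the two conditional probabilities; dropping the prior gives the comparison for part (ii). The names `scores` and `most_probable` are illustrative.

```python
# Probabilities copied from the question.
PRIORS = {"c1": 0.4, "c2": 0.6}
P_RED = {"c1": 0.1, "c2": 0.2}      # P(colour=red | class)
P_LARGE = {"c1": 0.5, "c2": 0.2}    # P(size=large | class)


def scores(use_priors=True):
    """Unnormalised score per class for the example (red, large)."""
    result = {}
    for c in PRIORS:
        prior = PRIORS[c] if use_priors else 1.0
        result[c] = prior * P_RED[c] * P_LARGE[c]
    return result


def most_probable(score_by_class):
    """Class with the largest score."""
    return max(score_by_class, key=score_by_class.get)


with_priors = scores(use_priors=True)       # about 0.02 vs 0.024
without_priors = scores(use_priors=False)   # about 0.05 vs 0.04
print(most_probable(with_priors), most_probable(without_priors))
```

The scores are left unnormalised because only their ordering matters for choosing the class; dividing by their sum would give posterior probabilities.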
   b) (i) Describe the naive Bayes learning method.                                [8]
      (ii) Under what circumstances is the naive Bayes classification method the
            maximum a posteriori hypothesis?                                       [2]
      (iii) Briefly describe what the Bayes optimal classifier is.                 [4]
      (iv) Draw the Bayesian belief network that represents the assumptions of
            the naive Bayes classifier for data consisting of three attributes (colour,
            size, age) and a class determined by the attributes.                   [4]
   c) In 1997 the UK Appeal Court stated that the laws of probability are "a
      recipe for confusion, misunderstanding and misjudgement". Instead, juries
      should rely on their "individual common sense and knowledge of the
      world". From your knowledge of probability theory, as well as your
      common sense and knowledge of the world, argue the case for juries using
      the laws of probability and Bayes theorem.                                   [9]
