ECSE 526, Artificial Intelligence – Assignment 2
Due: March 9, 2006
This assignment is to be done individually. You may not consult any human resources other
than your TAs and instructor. Questions are welcome in the public course forums.
Programming guidelines for problem 5 are as written in assignment 1. Your program will be
written in C or C++, but you may export your data to any third-party program to plot your
results. Examples are Matlab, Gnuplot, or a spreadsheet program. Figures may instead be drawn neatly by hand.
Your assignment should be typed rather than handwritten. Undoubtedly the best typesetting program for this purpose is LaTeX. The equation editor in your word processor is also fine. Just ensure that your equations are clearly understandable; otherwise you will lose marks (in other words, you should not resort to plain ASCII formatting). There are five problems. Good luck!
1. [10 points] Weight Update Rules You are trying to learn a function y of two inputs x1
and x2. You believe that the output has some kind of Gaussian dependency on the inputs, so you would like to use the following form for your function:
y = w0 + w3 exp(−w1 x1² − w2 x2²)
Provide a (stochastic) update rule for each of the weights.
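Once you have derived the update rules by hand, one way to sanity-check them is to compare your analytic gradients against a finite-difference approximation. The sketch below is illustrative only (the names `model`, `sq_error`, and `numeric_grad` are arbitrary), and it does not give the update rules themselves:

```python
import math

def model(w, x1, x2):
    # y_hat = w0 + w3 * exp(-w1*x1^2 - w2*x2^2), with w = [w0, w1, w2, w3]
    w0, w1, w2, w3 = w
    return w0 + w3 * math.exp(-w1 * x1**2 - w2 * x2**2)

def sq_error(w, x1, x2, y):
    # squared error on a single training example
    return 0.5 * (y - model(w, x1, x2))**2

def numeric_grad(w, x1, x2, y, eps=1e-6):
    # central-difference approximation of dE/dw_k for each weight
    grads = []
    for k in range(len(w)):
        wp = list(w); wp[k] += eps
        wm = list(w); wm[k] -= eps
        grads.append((sq_error(wp, x1, x2, y) - sq_error(wm, x1, x2, y)) / (2 * eps))
    return grads
```

Each component of `numeric_grad` should agree (to several decimal places) with the corresponding partial derivative in your hand-derived stochastic update rule.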
2. [5 + 7 points] Decision Trees
(a) [5 points] Draw a decision tree that represents the following expression:
y = (x1 ∨ ¬x2 ) ∧ (x2 ∨ x3 ∨ ¬x4 )
Here, y is the binary-valued class label, and x1, x2, x3, x4 are the binary-valued attributes for instance (x, y).
(b) [Extra Credit: 7 points] Consider a decision tree built optimally from arbitrary data for a classification problem with 3 classes. What is the maximum training set error that any data could have? Explain for what data set this error would be achieved.
3. [15 points] Nearest Neighbor Consider the following data set consisting of three real-
valued attributes and a binary-valued class label:
(1, 1, 0, +), (1, 3, 2, +), (3, 3, 1, −), (3, −2, 2, −), (1, 2, 1, −),
(1, −2, 2, −), (2, 1, 1, −), (0, 1, 0, +), (3, 1, 1, +)
(a) How would a 3-nearest-neighbor algorithm classify the example (2, 2, 0)? Justify your answer.
(b) If you wanted to use all of the data points to classify the new example, you could do
so using a distance-weighted nearest neighbor algorithm. Write down an appropriate
distance function, and indicate which class the new example would have.
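If you want to check your hand computations for (a), a few lines of code suffice. This sketch assumes plain Euclidean distance and unweighted majority voting; the function name `knn_classify` is illustrative:

```python
# the data set from the problem statement: (attributes, class label)
data = [
    ((1, 1, 0), '+'), ((1, 3, 2), '+'), ((3, 3, 1), '-'),
    ((3, -2, 2), '-'), ((1, 2, 1), '-'), ((1, -2, 2), '-'),
    ((2, 1, 1), '-'), ((0, 1, 0), '+'), ((3, 1, 1), '+'),
]

def knn_classify(query, data, k=3):
    # squared Euclidean distance is enough for ranking neighbors
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    neighbors = sorted(data, key=lambda p: d2(p[0], query))[:k]
    labels = [lab for _, lab in neighbors]
    # majority vote among the k nearest labels
    return max(set(labels), key=labels.count)
```

For part (b) you would replace the top-k vote with a vote over all points, each weighted by your chosen (inverse-distance) weighting function.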
4. [15 points] Penalizing large weights (see also Mitchell, exercise 4.10 and section 4.8.1).
Consider the alternative error function
J(w) = (y − hw(x))² + γ Σi wi²
The eﬀect of the added term is to penalize large weights.
(a) Derive an update rule for the weights of a sigmoid perceptron relative to this error
function. Show the derivation and final rule.
(b) Consider a similar update function for the multi-layer neural network. Write down
the rule for updating the weight wij of the network relative to this error function.
Hint: you do not need to do another derivation in this case.
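For concreteness, the penalized error itself is easy to compute. The sketch below assumes hw is a single sigmoid unit and that the penalty is the γ Σi wi² weight-decay term of Mitchell §4.8.1; it evaluates J(w) only and deliberately does not give the update rule you are asked to derive:

```python
import math

def penalized_error(w, x, y, gamma=0.01):
    # J(w) = (y - h_w(x))^2 + gamma * sum_i w_i^2
    # where h_w(x) is a sigmoid applied to the weighted sum of inputs
    h = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
    return (y - h) ** 2 + gamma * sum(wi ** 2 for wi in w)
```

Note that differentiating the penalty term contributes a component proportional to each weight, which is why this scheme is often called "weight decay".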
5. [55 points] Sigmoid Neuron You will write a program to implement the gradient descent
training rule for one sigmoid neuron.
(a) [10 points] Consider a single sigmoid neuron trained using a gradient descent update.
Suppose the neuron is to learn the Boolean function y = x1 ∧ (x2 ∨ x3 ). Would you
expect it to be able to learn this function? Why or why not? Sketch the decision
surface of this problem. Draw the decision tree corresponding to this function.
(b) [20 points] Write a program to implement the gradient descent training rule for
one sigmoid neuron. Test your learning algorithm on the Boolean function y =
x1 ∧ (x2 ∨ x3 ). Provide a graph of the errors and of the weights as a function of the
number of epochs of training, or weight updates that you do. Indicate when training
should be stopped.
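As a starting point for (b), here is a minimal sketch of stochastic gradient descent for a single sigmoid neuron on the target function. The learning rate, epoch count, zero initialization, and function names are arbitrary illustrative choices, not requirements; your graded program would also need to record the errors and weights at each epoch for plotting:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# full truth table for y = x1 AND (x2 OR x3)
examples = [((x1, x2, x3), x1 & (x2 | x3))
            for x1 in (0, 1) for x2 in (0, 1) for x3 in (0, 1)]

def train(examples, eta=0.5, epochs=10000):
    w = [0.0, 0.0, 0.0, 0.0]  # bias weight plus three input weights
    for _ in range(epochs):
        for x, y in examples:
            xv = (1,) + x  # prepend the constant bias input
            o = sigmoid(sum(wi * xi for wi, xi in zip(w, xv)))
            # gradient of 0.5*(y - o)^2 through the sigmoid: (y - o)*o*(1 - o)*x_i
            for i in range(len(w)):
                w[i] += eta * (y - o) * o * (1 - o) * xv[i]
    return w
```

Since x1 ∧ (x2 ∨ x3) is linearly separable, the trained neuron should eventually classify all eight examples correctly with a 0.5 threshold; watching how the error curve flattens tells you when to stop.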
(c) [25 points] Apply your sigmoid neuron to the problem of classifying radar signal returns from the ionosphere, using the ionosphere database from the UCI Machine Learning Repository.
The http URL for the complete repository is:
The file ionosphere.names contains information about the database as well as some results using different ML algorithms that you can refer to for comparison with your results. Report your results in the form of graphs of the training error and weights as in (b). Indicate when training should be stopped. Provide an estimate of the true error of your sigmoid neuron.