EECS 391/491: PROBLEM SET #10 FAQ

Overall

This problem set has a few small programs; start early. You can complete Problems 10.1, 10.2, and 10.6 as of Lecture #21. You can probably do 10.3 after doing just a little reading in the book. If you wish to start 10.5 and 10.7 early, I have placed an example problem below.

Problem 10.2

The answers to (c) and (d) are quite inter-related. Depending on how you approach this problem, you might have an answer to (d) before you answer (c). That is fine, of course.

Problem 10.3

For this problem, you are to assume step function activation functions. You can place the threshold number inside your neuron. Using this, for example, the "AND" net in Figure 20.17 of R&N could be drawn simply as follows:
       1    |---------|
 x1 ------->|         |
            |   1.5   |------->
 x2 ------->|         |
       1    |---------|

Problem 10.4

See Figure 20.27 of R&N for an example.

Problem 10.5

Again, you are to assume that the activation function is a step function. This means that in Equation (20.12) of R&N, the g'(in) term is removed. (See footnote 10 on page 742 of R&N.)

Use the convention x0=-1, as we did in lecture and as in R&N. Starting "with initial values of all weights ... zero " means that the weight vector should contain 4 zeros initially.

For the drawing, I am basically looking for a three-dimensional version of the plots in Figure 20.20 (a) and (b) of R&N.

The Perceptron Learning Rule was given in Lecture #22:


W <- W + (desiredOutput - netOutput) X    (EQN 1)

where
* W is the (augmented) weight vector,
* X is the (augmented) instance data vector,
* netOutput = ( sum_{i=0}^n w_i x_i ) > 0 ? 1 : 0

Here is an example we started in lecture:

Learning the NOT function
=========================

x1 desiredOutput
-- -------------
0        1
1        0

Create the augmented data set (with w0 representing the threshold,
and x0 set to be -1 for all data instance vectors)

    x0 x1 desiredOutput
    -- -- -------------
X1  -1 0        1
X2  -1 1        0

Start with an augmented weight vector of all 0's: W = [0 0]

Apply (EQN 1) repeatedly using the data instance vectors X1 and X2

W = [  0  0 ]

netOutput = ( W.X1 > 0 ) ? 1: 0 
          = ( 0*(-1)+0*(0) > 0 ) ? 1: 0 
          = ( 0 > 0 ) ? 1 : 0
          = 0

W = [  0  0 ] + (1 - 0) X1 
  = [  0  0 ] + (1) [ -1  0 ]
  = [  0  0 ] + [ -1  0 ]
  = [ -1  0 ]

netOutput = ( W.X2 > 0 ) ? 1: 0 = ( (-1)*(-1)+0*(1) > 0 ) ? 1: 0 = 1

W = [ -1  0 ] + (0 - 1) X2 
  = [ -1  0 ] + (-1) [ -1  1 ]
  = [ -1  0 ] + [  1 -1 ]
  = [  0 -1 ]

netOutput = ( W.X1 > 0 ) ? 1: 0 = ( (0)*(-1)+(-1)*(0) > 0 ) ? 1: 0 = 0

W = [  0 -1 ] + (1 - 0) [ -1  0 ]
  = [ -1 -1 ] 

Continuing, more briefly

W = [ -1 -1 ] + (0 - 0) [ -1  1 ]

W = [ -1 -1 ] + (1 - 1) [ -1  1 ]

We have gone through all the data instances (completed one "epoch")
with no changes in W, so it has converged.

Problem 10.6

Programming in, say Excel, would be one fine way to solve this problem. But you can use any approach. It is simple in C++ and Matlab, e.g.

Problem 10.7

As in Problem 10.5 above, remember that the perceptron learning rule requires that g'(in) be omitted from the update equation.

The notation concerning ``transpose'' simply means that you should write the weight vector as a row vector (even though you are probably more familiar with writing vectors of numbers as column vectors).


Author: Michael S. Branicky.