1 |---------|
x1 ------->| |
| 1.5 |------->
x2 ------->| |
1 |---------|
Use the convention x0=-1, as we did in lecture and as in R&N. Starting "with initial values of all weights ... zero " means that the weight vector should contain 4 zeros initially.
For the drawing, I am basically looking for a three-dimensional version of the plots in Figure 20.20 (a) and (b) of R&N.The Perceptron Learning Rule was given in Lecture #22:
W <- W + (desiredOutput - netOutput) X (EQN 1)
where
* W is the (augmented) weight vector,
* X is the (augmented) instance data vector,
* netOutput = ( sum_{i=0}^n w_i x_i ) > 0 ? 1 : 0
Here is an example we started in lecture:
Learning the NOT function
=========================
x1 desiredOutput
-- -------------
0 1
1 0
Create the augmented data set (with w0 representing the threshold,
and x0 set to be -1 for all data instance vectors)
x0 x1 desiredOutput
-- -- -------------
X1 -1 0 1
X2 -1 1 0
Start with an augmented weight vector of all 0's: W = [0 0]
Apply (EQN 1) repeatedly using the data instance vectors X1 and X2
W = [ 0 0 ]
netOutput = ( W.X1 > 0 ) ? 1: 0
= ( 0*(-1)+0*(0) > 0 ) ? 1: 0
= ( 0 > 0 ) ? 1 : 0
= 0
W = [ 0 0 ] + (1 - 0) X1
= [ 0 0 ] + (1) [ -1 0 ]
= [ 0 0 ] + [ -1 0 ]
= [ -1 0 ]
netOutput = ( W.X2 > 0 ) ? 1: 0 = ( (-1)*(-1)+0*(1) > 0 ) ? 1: 0 = 1
W = [ -1 0 ] + (0 - 1) X2
= [ -1 0 ] + (-1) [ -1 1 ]
= [ -1 0 ] + [ 1 -1 ]
= [ 0 -1 ]
netOutput = ( W.X1 > 0 ) ? 1: 0 = ( (0)*(-1)+(-1)*(0) > 0 ) ? 1: 0 = 0
W = [ 0 -1 ] + (1 - 0) [ -1 0 ]
= [ -1 -1 ]
Continuing, more briefly
W = [ -1 -1 ] + (0 - 0) [ -1 1 ]
W = [ -1 -1 ] + (1 - 1) [ -1 1 ]
We have gone through all the data instances (completed one "epoch")
with no changes in W, so it has converged.
The notation concerning ``transpose'' simply means that you should write the weight vector as a row vector (even though you are probably more familiar with writing vectors of numbers as column vectors).