P(L) = (54+7+3+4)/100 = 0.6800 P(B) = (54+1+7+27+3+2)/100 = 0.9400 P(G|B) = (54+1+7+27)/94 = 0.9468 P(G|~B) = 0/6 = 0.0000 P(M|B,L) = (54+3)/(54+7+3) = 0.8906 P(M|B,~L) = (1)/(1+27+2) = 0.0333 P(M|~B,L) = 0/4 = 0.0000 P(M|~B,~L) = 0/2 = 0.0000
PlayTennis_
/ | \ \_
/ | \ \_
/ | \ \
Outlook Temp Humidity Wind
P(Yes) = 9/14 P(No) = 5/14 P(Sunny | Yes) = 2/9 P(Rain | Yes) = 3/9 P(Overcast | Yes) = 4/9 P(Sunny | No) = 3/5 P(Rain | No) = 2/5 P(Overcast | No) = 0 P(Hot | Yes) = 2/9 P(Cool | Yes) = 3/9 P(Mild | Yes) = 4/9 P(Hot | No) = 2/5 P(Cool | No) = 1/5 P(Mild | No) = 2/5 [ = 1 - (2/5) - (1/5)] P(High | Yes) = 3/9 P(Normal | Yes) = 6/9 [ = (9-3)/9 ] P(High | No) = 4/5 P(Normal | No) = 1/5 P(Strong | Yes) = 3/9 P(Weak | Yes) = 6/9 P(Strong | No) = 3/5 P(Weak | No) = 2/5Because results in each section above conditioned on the same events add to one, it is also conventional to drop one of each of these.
P(Yes|Sunny,Cool,High,Strong) = k P(Yes) P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)
= k (9/14) (2/9) (3/9) (3/9) (3/9)
= k/(27*7) = k 0.00529
P(No|Sunny,Cool,High,Strong) = k P(No) P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)
= k (5/14) (3/5) (1/5) (4/5) (3/5)
= k 18/(7*125) = k 0.02057
Since the latter is larger, the naive Bayes classifier predicts PlayTennis = No.
P(No|Sunny,Cool,High,Strong) = 0.02057/(0.00529+0.02057) = 0.7954
1
x1 x2 | x1 XOR x2 x1------(1.5)
------------------- \ / \
0 0 | 0 1\ / \-1
0 1 | 1 \/ \
1 0 | 1 /\ (0.5)--->
1 1 | 0 1/ \ /
/ \ /1
x2------(0.5)
1
Many other answers are possible.
x0 x1 x2 x3 desired_output ------------- -------------- -1 1 0 0 1 -1 0 1 1 0 -1 1 1 0 1 -1 1 1 1 0 -1 0 0 1 0 -1 1 0 1 1The weight update equation is
W := W + (desired_output-net_output) X ; net_output = ( W.X > 0)?The algorithm proceeds as follows:
W = [ 0 0 0 0 ] // Initially epoch=1: -------- W = [ 0 0 0 0 ] + (1-0) [ -1 1 0 0 ] = [ -1 1 0 0 ] W = [ -1 1 0 0 ] + (0-1) [ -1 0 1 1 ] = [ 0 1 -1 -1 ] W = [ 0 1 -1 -1 ] + (1-0) [ -1 1 1 0 ] = [ -1 2 0 -1 ] W = [ -1 2 0 -1 ] + (0-1) [ -1 1 1 1 ] = [ 0 1 -1 -2 ] W = [ 0 1 -1 -2 ] + (0-0) [ -1 0 0 1 ] = [ 0 1 -1 -2 ] W = [ 0 1 -1 -2 ] + (1-0) [ -1 1 0 1 ] = [ -1 2 -1 -1 ] epoch = 2: ---------- W = [ -1 2 -1 -1 ] + (1-1) [ -1 1 0 0 ] = [ -1 2 -1 -1 ] W = [ -1 2 -1 -1 ] + (0-0) [ -1 0 1 1 ] = [ -1 2 -1 -1 ] W = [ -1 2 -1 -1 ] + (1-1) [ -1 1 1 0 ] = [ -1 2 -1 -1 ] W = [ -1 2 -1 -1 ] + (0-1) [ -1 1 1 1 ] = [ 0 1 -2 -2 ] W = [ 0 1 -2 -2 ] + (0-0) [ -1 0 0 1 ] = [ 0 1 -2 -2 ] W = [ 0 1 -2 -2 ] + (1-0) [ -1 1 0 1 ] = [ -1 2 -2 -1 ] epoch = 3: ---------- W = [ -1 2 -2 -1 ] + (1-1) [ -1 1 0 0 ] = [ -1 2 -2 -1 ] W = [ -1 2 -2 -1 ] + (0-0) [ -1 0 1 1 ] = [ -1 2 -2 -1 ] W = [ -1 2 -2 -1 ] + (1-1) [ -1 1 1 0 ] = [ -1 2 -2 -1 ] W = [ -1 2 -2 -1 ] + (0-0) [ -1 1 1 1 ] = [ -1 2 -2 -1 ] W = [ -1 2 -2 -1 ] + (0-0) [ -1 0 0 1 ] = [ -1 2 -2 -1 ] W = [ -1 2 -2 -1 ] + (1-1) [ -1 1 0 1 ] = [ -1 2 -2 -1 ] There was no change in the weight vector over the entire third epoch. Thus, the procedure has converged and correclty learned the function.
2 x1 - 2 x2 - x3 = -1
and noting that the following points (and all in between) are solutions:
(0, 0, 1), (1, 1, 1), (0, 1/2, 0), (1/2, 1, 0)
h3data=[0 1 1 1 1 0 1 1 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 0]; h4data=[1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 1 0]; d=h3data; % process h3data % a priori statistics pcherry=[1 .75 .5 .25 0]; ph=[.1 .2 .4 .2 .1]; plime=sum((1-pcherry).*ph); % .* is element-by-element multiplication N=length(d); all=[ph plime]; for i=1:N, Np=sum(d(1:i)); pdgivenh=(pcherry.^Np).*((1-pcherry).^(i-Np)); % .^ is elt-by-elt raise-to-power phgivend=pdgivenh.*ph; phgivend=phgivend/sum(phgivend); % normalize plimegivend=sum((1-pcherry).*phgivend); all=[all; phgivend plimegivend]; end; % list all the results [[-1 d]' all]
The output for h3data was
Cherry=1,
Lime=0 P(h1|d) P(h2|d) P(h3|d) P(h4|d) P(h5|d) P(lime|d)
------------------------------------------------------------------------
-1.0000 0.1000 0.2000 0.4000 0.2000 0.1000 0.5000
0 0 0.1000 0.4000 0.3000 0.2000 0.6500
1.0000 0 0.2143 0.5714 0.2143 0 0.5000
1.0000 0 0.3214 0.5714 0.1071 0 0.4464
1.0000 0 0.4355 0.5161 0.0484 0 0.4032
1.0000 0 0.5473 0.4324 0.0203 0 0.3682
0 0 0.3716 0.5872 0.0413 0 0.4174
1.0000 0 0.4783 0.5039 0.0177 0 0.3848
1.0000 0 0.5832 0.4096 0.0072 0 0.3560
1.0000 0 0.6792 0.3180 0.0028 0 0.3309
0 0 0.5131 0.4805 0.0063 0 0.3733
1.0000 0 0.6141 0.3834 0.0025 0 0.3471
0 0 0.4423 0.5522 0.0055 0 0.3908
0 0 0.2829 0.7066 0.0105 0 0.4319
1.0000 0 0.3735 0.6219 0.0046 0 0.4078
0 0 0.2290 0.7625 0.0085 0 0.4449
0 0 0.1287 0.8570 0.0143 0 0.4714
1.0000 0 0.1826 0.8106 0.0068 0 0.4560
0 0 0.1001 0.8888 0.0111 0 0.4778
0 0 0.0524 0.9302 0.0175 0 0.4913
0 0 0.0267 0.9467 0.0267 0 0.5000
0 0 0.0133 0.9467 0.0400 0 0.5067
1.0000 0 0.0203 0.9595 0.0203 0 0.5000
1.0000 0 0.0304 0.9595 0.0101 0 0.4949
1.0000 0 0.0451 0.9499 0.0050 0 0.4900
0 0 0.0230 0.9693 0.0077 0 0.4962
The output for h4data was
-1.0000 0.1000 0.2000 0.4000 0.2000 0.1000 0.5000
1.0000 0.2000 0.3000 0.4000 0.1000 0 0.3500
0 0 0.2143 0.5714 0.2143 0 0.5000
1.0000 0 0.3214 0.5714 0.1071 0 0.4464
0 0 0.1800 0.6400 0.1800 0 0.5000
0 0 0.0900 0.6400 0.2700 0 0.5450
0 0 0.0413 0.5872 0.3716 0 0.5826
0 0 0.0177 0.5039 0.4783 0 0.6152
0 0 0.0072 0.4096 0.5832 0 0.6440
0 0 0.0028 0.3180 0.6792 0 0.6691
0 0 0.0010 0.2376 0.7613 0 0.6901
1.0000 0 0.0025 0.3834 0.6141 0 0.6529
0 0 0.0010 0.2936 0.7054 0 0.6761
0 0 0.0004 0.2171 0.7825 0 0.6955
0 0 0.0001 0.1561 0.8438 0 0.7109
0 0 0.0000 0.1098 0.8902 0 0.7225
0 0 0.0000 0.0760 0.9240 0 0.7310
0 0 0.0000 0.0520 0.9480 0 0.7370
0 0 0.0000 0.0353 0.9647 0 0.7412
0 0 0.0000 0.0238 0.9762 0 0.7441
1.0000 0 0.0000 0.0465 0.9535 0 0.7384
0 0 0.0000 0.0315 0.9685 0 0.7421
1.0000 0 0.0000 0.0610 0.9390 0 0.7348
0 0 0.0000 0.0415 0.9585 0 0.7396
1.0000 0 0.0000 0.0797 0.9203 0 0.7301
0 0 0.0000 0.0546 0.9454 0 0.7364
In both cases, the true hypothesis eventually dominates and the
prediction of the probability that the next candy is lime or not
closes in on that which would be made by the MAP
hypothesis (0.5 and 0.75, respectively).
W = 0 0 0 0 epoch = 1 W = -1 1 0 0 W = 0 1 -1 -1 W = -1 2 0 -1 W = 0 1 -1 -2 W = 0 1 -1 -2 W = -1 2 -1 -1 epoch =2 W = -1 2 -1 -1 W = -1 2 -1 -1 W = -1 2 -1 -1 W = 0 1 -2 -2 W = 0 1 -2 -2 W = -1 2 -2 -1 epoch = 3 W = -1 2 -2 -1 W = -1 2 -2 -1 W = -1 2 -2 -1 W = -1 2 -2 -1 W = -1 2 -2 -1 W = -1 2 -2 -1This matches with that computed by hand in Problem 10.5(a) above.