A? B?
0/ \1 0/ \1
0 B? OR A? 0
0/ \1 0/ \1
1 0 0 1
A?
0/ \1
B? 1
0/ \1 [OR 5 OTHERS]
0 C?
0/ \1
0 1
A? B?
0/ \1 0/ \1
B? B? OR A? A?
0/\1 0/\1 0/\1 0/\1
0 1 1 0 0 1 1 0
A?
/ \
/ \
0/ \1
/ \
/ \
C? B? [OR 23 OTHERS]
0/ \1 0/ \1
0 D? C? 1
0/ \1 0/ \1
0 1 0 D?
0/ \1
0 1
a2-T: ++-- \F: -+
--------
| Sky? |
--------
Rainy / \ Sunny
No Yes
E(S) = E(3+,2-) = 0.9710
Gain( Sky ) = 0.9710 - (4/5) E(3+,1-) - (1/5) E(0+,1-) = 0.3220
Gain( AirTemp) = 0.9710 - (4/5) E(3+,1-) - (1/5) E(0+,1-) = 0.3220
Gain(Humidity) = 0.9710 - (3/5) E(2+,1-) - (2/5) E(1+,1-) = 0.0200
Gain( Wind ) = 0.9710 - (4/5) E(3+,1-) - (1/5) E(0+,1-) = 0.3220
Gain( Water ) = 0.9710 - (4/5) E(2+,2-) - (1/5) E(1+,0-) = 0.1710
Gain(Forecast) = 0.9710 - (3/5) E(2+,1-) - (2/5) E(1+,1-) = 0.0200
Thus, we split on Sky:
--------
| Sky? |
--------
Rainy / \ Sunny
No -----
| ? |
-----
E(S_Sunny) = E(3+,1-) = 0.8113
Gain( AirTemp) = 0.8113 - (4/4) E(3+,1-) = 0.0000
Gain(Humidity) = 0.8113 - (2/4) E(2+,0-) - (2/4) E(1+,1-) = 0.3113
Gain( Wind ) = 0.8113 - (3/4) E(3+,0-) - (1/4) E(0+,1-) = 0.8113
Gain( Water ) = 0.8113 - (3/4) E(2+,1-) - (1/4) E(1+,0-) = 0.1226
Gain(Forecast) = 0.8113 - (3/4) E(2+,1-) - (1/4) E(1+,0-) = 0.1226
Thus, we split on Wind:
--------
| Sky? |
--------
Rainy / \ Sunny
No --------
| Wind? |
---------
Weak / \ Strong
No Yes
(It is OK if you stopped the computations after examining Wind.)
P(Majority makes error) = e5 + 5 e4(1-e) + 10 e3(1-e)2Note: 5 = (5 choose 4) = (5 choose 1) and 10 = (5 choose 3) = (5 choose 2), depending on whether you wish to think of picking the ones that are wrong or right, respectively.
For e=0.1,
P(Majority makes error) = (0.1)5 + 5 (0.1)4(0.9) + 10 (0.1)3(0.9)2 = 0.00001 + 0.00045 + .0081 = 0.00856
The Lagrange interpolating polynomial is
(x-x2) (x-x1)
P(x) = ------- y1 + ------- y2
(x1-x2) (x2-x1)
(x-3) (x-1)
= ----- 2 + ----- 5
(1-3) (3-1)
= (3/2) x + 1/2
The Lagrange interpolating polynomial is
(x-0)(x-1) (x+1)(x-1) (x+1)(x-0)
P(x) = ------------(.1) + ----------(0) + ----------(.1)
(-1-0)(-1-1) (0+1)(0-1) (1+1)(1-0)
= (1/20)[(x^2 - x) + (x^2 + x)]
= (1/10) x^2
From Equation (1) in the handout for LS linear fitting: [ a ] [ 3 0 ]-1 [ 0.2 ] 1 [ 2 0 ] [ 0.2 ] [ 2/30 ] [ ] = [ ] [ ] = --- [ ] [ ] = [ ] [ b ] [ 0 2 ] [ 0 ] 6 [ 0 3 ] [ 0 ] [ 0 ] Thus, the LS fit line is y = (2/30) = 0.0667
target values: Democrat (D), Republican (R)
124 D, 108 R
E=0.9966
--------------------------
| ATTRIBUTE #5 |
| (Physician Fee Freeze) |
| Gain(#5)=0.8148 |
--------------------------
/ \
0 / \ 1
/ \
/ \
-------------- --------------
| ATT#5=N | | ATT#5=Y |
| 118 D, 1 R | | 6 D, 107 R |
------------- --------------
Predict D Predict R
In the data file,
0 = Democrat (D), 1 = Republican (R)
0 = no (N), 1 = yes (Y)
My code has been suppressed.