EECS 391/491: PROBLEM SET #9 SOLUTIONS
Prof. Michael S. Branicky

Problem 9.1

  1. A and (not B)
              A?                 B?
            0/ \1              0/ \1
            0   B?     OR      A?  0
              0/ \1          0/ \1
              1   0          0   1
    
  2. A or (B and C)
              A?
            0/ \1
            B?  1
          0/ \1       [OR 5 OTHERS]
          0   C?
            0/ \1
            0   1
    
  3. A xor B
              A?                 B?
            0/ \1              0/ \1
           B?   B?     OR     A?   A?
          0/\1 0/\1          0/\1 0/\1
          0  1 1  0          0  1 1  0
    
  4. (A and B) or (C and D)
               A?
              / \
             /   \
           0/     \1
           /       \
          /         \
         C?          B?      [OR 23 OTHERS]
       0/ \1       0/ \1
       0   D?     C?   1
         0/ \1  0/ \1
         0   1  0   D?
                  0/ \1
                  0   1
    
(Many) other answers are possible for each of these, as noted.

Problem 9.2

  1. Entropy(3+,3-) = I(3/6, 3/6) = 1.
  2. Remainder(a2) = (4/6) I(2/4, 2/4) + (2/6) I(1/2, 1/2) = 1.
    Gain(a2) = 1 - Remainder(a2) = 0.
    a2-T: ++--
      \F: -+
    

Problem 9.3

Solutions can use either the information gain or entropy notation.
  1.         --------
            | Sky? |
            --------
      Rainy  /    \ Sunny
            No    Yes
    
  2. E(S) = E(3+,2-) = 0.9710
    
    Gain(   Sky  ) = 0.9710 - (4/5) E(3+,1-) - (1/5) E(0+,1-) = 0.3220
    Gain( AirTemp) = 0.9710 - (4/5) E(3+,1-) - (1/5) E(0+,1-) = 0.3220
    Gain(Humidity) = 0.9710 - (3/5) E(2+,1-) - (2/5) E(1+,1-) = 0.0200
    Gain(  Wind  ) = 0.9710 - (4/5) E(3+,1-) - (1/5) E(0+,1-) = 0.3220
    Gain(  Water ) = 0.9710 - (4/5) E(2+,2-) - (1/5) E(1+,0-) = 0.1710
    Gain(Forecast) = 0.9710 - (3/5) E(2+,1-) - (2/5) E(1+,1-) = 0.0200
    
    Thus, we split on Sky:
    
            --------
            | Sky? |
            --------
      Rainy  /    \ Sunny
            No   -----
                 | ? |
                 -----
    
    E(S_Sunny) = E(3+,1-) = 0.8113
    
    Gain( AirTemp) = 0.8113 - (4/4) E(3+,1-)                  = 0.0000
    Gain(Humidity) = 0.8113 - (2/4) E(2+,0-) - (2/4) E(1+,1-) = 0.3113
    Gain(  Wind  ) = 0.8113 - (3/4) E(3+,0-) - (1/4) E(0+,1-) = 0.8113
    Gain(  Water ) = 0.8113 - (3/4) E(2+,1-) - (1/4) E(1+,0-) = 0.1226
    Gain(Forecast) = 0.8113 - (3/4) E(2+,1-) - (1/4) E(1+,0-) = 0.1226
    
    Thus, we split on Wind:
    
            --------
            | Sky? |
            --------
      Rainy  /    \ Sunny
            No  --------
                | Wind? |
                ---------
             Weak /    \ Strong
                 No    Yes
    
    (It is OK if you stopped the computations after examining Wind.)

Problem 9.4

Below, let e=epsilon.
P(Majority makes error) = e5 + 5 e4(1-e) + 10 e3(1-e)2
Note: 5 = (5 choose 4) = (5 choose 1) and 10 = (5 choose 3) = (5 choose 2), depending on whether you wish to think of picking the ones that are wrong or right, respectively.

For e=0.1,

P(Majority makes error) = (0.1)5 + 5 (0.1)4(0.9) + 10 (0.1)3(0.9)2 = 0.00001 + 0.00045 + .0081 = 0.00856

Problem 9.5

  1. Given: (x1,y1)=(1,2); (x2,y2)=(3,5)
    The Lagrange interpolating polynomial is
    
               (x-x2)      (x-x1)
       P(x) = ------- y1 + ------- y2
              (x1-x2)      (x2-x1)
    
              (x-3)     (x-1)
            = ----- 2 + ----- 5
              (1-3)     (3-1)
    
            = (3/2) x + 1/2
    
  2. Given: (x1,y1)=(-1,.1); (x2,y2)=(0,0); (x3,y3)=(1,.1)
    The Lagrange interpolating polynomial is
    
               (x-0)(x-1)        (x+1)(x-1)       (x+1)(x-0)
       P(x) = ------------(.1) + ----------(0) +  ----------(.1)
              (-1-0)(-1-1)       (0+1)(0-1)       (1+1)(1-0)
    
            = (1/20)[(x^2 - x) + (x^2 + x)]
    
            = (1/10) x^2
    
  3. Given: (x1,y1)=(-1,.1); (x2,y2)=(0,0); (x3,y3)=(1,.1)
    From Equation (1) in the handout for LS linear fitting:
    
       [ a ]   [ 3  0 ]-1 [ 0.2 ]    1  [ 2  0 ] [ 0.2 ]   [ 2/30 ]
       [   ] = [      ]   [     ] = --- [      ] [     ] = [      ]
       [ b ]   [ 0  2 ]   [  0  ]    6  [ 0  3 ] [  0  ]   [  0   ]
    
    Thus, the LS fit line is y = (2/30) = 0.0667
    
Sketches have been suppressed.

Problem 9.6

The split should be done on ATTRIBUTE #5 (Physician Fee Freeze) The decision stump is as follows.
target values: Democrat (D), Republican (R)

               124 D, 108 R
                 E=0.9966
        --------------------------
        |    ATTRIBUTE #5        |
        | (Physician Fee Freeze) |
        |   Gain(#5)=0.8148      |
        --------------------------
            /                \
       0   /                  \   1
          /                    \
         /                      \
  --------------         --------------
  |  ATT#5=N   |         |  ATT#5=Y   |
  | 118 D, 1 R |         | 6 D, 107 R |
  -------------          --------------
     Predict D              Predict R

In the data file,
   0 = Democrat (D), 1 = Republican (R)
   0 = no (N), 1 = yes (Y)
My code has been suppressed.
Author: Michael S. Branicky.