Multivariate decision trees for machine learning



6.3. Results for Neural Network Methods


The method definitions given in Table 6.3.1 apply to all results in the rest of this section.

TABLE 6.3.1 Definition of neural-network based methods

Name          Class Separation   Impurity Measure   Pruning       Linearity
ID-LPS        Selection          Information Gain   Pre-pruning   Linear
ID-LPE        Exchange           Information Gain   Pre-pruning   Linear
ID-MLPE       Exchange           Information Gain   Pre-pruning   Nonlinear
ID-Hybrid-F   Exchange           Information Gain   Pre-pruning   Both with F-test
ID-Hybrid-t   Exchange           Information Gain   Pre-pruning   Both with t-test

6.3.1. Comparison of Class Separation Techniques


The aim of this section is to determine which class separation technique, selection or exchange, performs better. For simplicity, the other variables, such as the impurity measure and the pruning technique, are held fixed. Data sets with only two classes are excluded from the results because no class separation is needed for them. Accuracy results are shown in Table 6.3.1.1 and Figure 6.3.1.1. Node results are shown in Table 6.3.1.2 and Figure 6.3.1.2. Learning time results are shown in Table 6.3.1.3, Figure 6.3.1.4 and Figure 6.3.1.5.

The selection method is not significantly more accurate than the exchange method in any data set, whereas the exchange method is significantly more accurate than the selection method in three data sets. Two of these, Ocrdigits and Pendigits, have 10 classes; the third, Ecoli, has eight classes. We therefore conclude that the more classes a data set has, the better the exchange method performs relative to the selection method, because of the large number of division candidates.
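To give a sense of how quickly the number of division candidates grows with the number of classes, the following sketch counts the ways of dividing c classes into two non-empty groups, which is 2^(c-1) - 1. The function name is ours, and the count only describes the size of the search space; it does not reproduce either the selection or the exchange procedure.

```python
def two_group_divisions(num_classes: int) -> int:
    """Count the ways to divide the classes into two non-empty groups.

    Each class goes to the left or the right group (2**c assignments).
    Swapping left and right gives the same division, and the two
    assignments with an empty group are not valid divisions.
    """
    if num_classes < 2:
        return 0  # nothing to separate with fewer than two classes
    return 2 ** (num_classes - 1) - 1


# Class counts mentioned in the text: 8 (Ecoli) and 10 (Ocrdigits, Pendigits).
for c in (3, 8, 10):
    print(c, two_group_divisions(c))
# prints:
# 3 3
# 8 127
# 10 511
```

With 10 classes there are already 511 possible divisions at the root node, so an exhaustive search over them quickly becomes impractical.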

Comparing node counts, the exchange method produces significantly smaller trees than the selection method in two of the 11 data sets, Pendigits and Glass (which has eight classes), while the selection method is never better.

As explained earlier, the exchange method has a larger time complexity. Accordingly, the selection method is significantly faster to train than the exchange method in all data sets except one. The significance of this difference also increases with the size of the data set and the number of classes.

TABLE 6.3.1.1 Accuracy results for ID-LPS and ID-LPE

Data set name   ID-LPS           ID-LPE           Significance
Car             87.50 ± 3.07     89.48 ± 4.01
Dermatology     69.51 ± 22.01    85.74 ± 7.06
Ecoli           68.51 ± 5.39     82.62 ± 4.06     2>1
Flare           88.17 ± 2.21     88.36 ± 2.37
Glass           55.53 ± 6.16     54.95 ± 7.83
Iris            81.73 ± 14.40    77.60 ± 15.70
Ocrdigits       54.14 ± 6.25     93.87 ± 0.92     2>>>1
Pendigits       67.46 ± 5.44     91.94 ± 4.16     2>>>1
Segment         70.55 ± 6.68     79.76 ± 11.58
Wine            85.06 ± 14.00    87.75 ± 12.62
Zoo             78.01 ± 7.67     79.38 ± 8.10



TABLE 6.3.1.2 Node results for ID-LPS and ID-LPE

Data set name   ID-LPS           ID-LPE           Significance
Car             11.40 ± 6.10     7.40 ± 0.84
Dermatology     7.40 ± 3.10      8.80 ± 1.48
Ecoli           15.00 ± 5.33     10.80 ± 2.90
Flare           2.80 ± 1.99      3.20 ± 2.20
Glass           20.80 ± 3.46     10.20 ± 4.64     1>>2
Iris            5.60 ± 3.13      4.00 ± 1.05
Ocrdigits       45.20 ± 4.76     34.80 ± 4.94
Pendigits       58.40 ± 9.52     30.40 ± 6.40     1>>>2
Segment         28.60 ± 6.31     16.60 ± 6.65
Wine            4.20 ± 1.03      4.40 ± 0.97
Zoo             11.40 ± 2.07     8.80 ± 1.75




TABLE 6.3.1.3 Learning time results for ID-LPS and ID-LPE

Data set name   ID-LPS           ID-LPE           Significance
Car             7917             15216            2>>>1
Dermatology     224              429              2>>1
Ecoli           224              5715             2>>1
Flare           52               94               2>>1
Glass           131              339              2>>>1
Iris            20               30               2>1
Ocrdigits       2764384          8035757          2>>>1
Pendigits       4164246          183403319        2>>>1
Segment         40763            937103           2>>>1
Wine            20               41
Zoo             51               102              2>>1









6.3.2. Comparison of Hybrid Tests in Decision Nodes for Neural Networks


The aim of this section is to determine which statistical test (F-test or t-test) is the better choice for the hybrid trees. For the large data sets Mushroom, Ocrdigits, Pendigits and Segment, training is done with 10 epochs instead of 50 because of the large amount of computation needed to train the networks with the t-test. For example, a single run on the Ocrdigits data set with the t-test takes approximately four days, and 160 such runs are required.

TABLE 6.3.2.1 Accuracy results for hybrid network models

Data set name   ID-Hybrid-F      ID-Hybrid-t      Significance
Breast          96.62 ± 0.55     96.62 ± 0.63
Bupa            63.42 ± 2.57     63.71 ± 3.24
Car             94.51 ± 1.15     92.19 ± 1.37     1>2
Cylinder        71.31 ± 1.74     71.24 ± 1.89
Dermatology     94.54 ± 4.67     85.74 ± 11.97
Ecoli           83.10 ± 4.19     81.43 ± 3.75
Flare           88.11 ± 2.43     87.98 ± 2.28
Glass           55.05 ± 9.72     60.37 ± 6.60
Hepatitis       83.74 ± 3.41     83.48 ± 3.38
Horse           82.66 ± 2.58     82.01 ± 3.28
Iris            92.67 ± 3.28     92.80 ± 3.34
Ionosphere      87.80 ± 2.15     87.35 ± 1.79
Monks           66.39 ± 1.85     66.30 ± 1.77
Mushroom        99.96 ± 0.03     99.95 ± 0.03
Ocrdigits       92.79 ± 2.20     N/A
Pendigits       90.82 ± 9.62     N/A
Segment         81.77 ± 12.97    85.13 ± 6.33
Vote            94.71 ± 1.13     94.80 ± 1.06
Wine            96.07 ± 2.07     95.96 ± 2.32
Zoo             86.93 ± 5.39     86.74 ± 4.15



TABLE 6.3.2.2 Node results for hybrid network models

Data set name   ID-Hybrid-F      ID-Hybrid-t      Significance
Breast          3.00 ± 0.00      3.00 ± 0.00
Bupa            4.40 ± 1.90      4.40 ± 1.65
Car             7.60 ± 0.97      7.20 ± 1.48
Cylinder        8.80 ± 1.75      9.00 ± 2.11
Dermatology     11.20 ± 1.14     11.00 ± 0.00
Ecoli           10.60 ± 2.27     10.80 ± 2.57
Flare           3.00 ± 1.33      2.40 ± 0.97
Glass           11.00 ± 5.50     11.80 ± 2.70
Hepatitis       3.00 ± 0.00      3.00 ± 0.00
Horse           5.60 ± 2.84      5.20 ± 1.75
Iris            5.00 ± 0.00      5.00 ± 0.00
Ionosphere      4.00 ± 1.05      3.80 ± 1.03
Monks           3.00 ± 0.00      3.00 ± 0.00
Mushroom        3.00 ± 0.00      3.00 ± 0.00
Ocrdigits       25.40 ± 3.75     N/A
Pendigits       23.40 ± 5.80     N/A
Segment         14.40 ± 2.84     14.60 ± 3.10
Vote            4.20 ± 1.93      4.40 ± 1.90
Wine            5.00 ± 0.00      5.20 ± 0.63
Zoo             12.40 ± 1.90     12.60 ± 1.26




The results are shown in Table 6.3.2.1 and Figure 6.3.2.1. Node results are shown in Table 6.3.2.2 and Figure 6.3.2.2. Learning time results are shown in Table 6.3.2.3, Figure 6.3.2.4 and Figure 6.3.2.5.

There is no significant difference between the two tests in terms of accuracy or node size; the only exception is accuracy on the Car data set.

The difference lies in learning time. In all data sets, the t-test is slower than the F-test at a confidence level above 99%, because the t-test runs 30-fold cross-validation on the whole training set while the F-test runs only 10-fold cross-validation on half of the training set.
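A rough way to see why the t-test is so much more expensive is to count how many training instances are processed per epoch under each scheme. The sketch below is only a back-of-the-envelope estimate under our own assumptions: each of the 30 folds trains on 29/30 of the data, each of the 10 F-test trainings uses exactly half of the data, and training cost grows linearly with the number of instances. The function names and the training set size are hypothetical.

```python
def training_work(num_trainings: int, train_size: int) -> int:
    """Instances processed per epoch, summed over all networks trained."""
    return num_trainings * train_size


def t_test_work(n: int) -> int:
    # 30-fold cross-validation on the whole training set:
    # 30 trainings, each leaving out one fold of size n // 30.
    return training_work(30, n - n // 30)


def f_test_work(n: int) -> int:
    # 10 trainings, each on half of the training set.
    return training_work(10, n // 2)


n = 600  # hypothetical training set size, divisible by 30 and by 2
ratio = t_test_work(n) / f_test_work(n)
print(ratio)  # prints 5.8
```

Under these assumptions the t-test does a little under six times the work of the F-test at every decision node, regardless of the data set size.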

TABLE 6.3.2.3 Learning time results for hybrid network models

Data set name   ID-Hybrid-F      ID-Hybrid-t      Significance
Breast          301              1754             2>>>1
Bupa            153              9117             2>>>1
Car             911151           6422656          2>>>1
Cylinder        36645            2455346          2>>>1
Dermatology     39925            3428899          2>>>1
Ecoli           21938            1697338          2>>>1
Flare           6726             420134           2>>1
Glass           13728            1275281          2>>>1
Hepatitis       100              661              2>>>1
Horse           32789            2089392          2>>>1
Iris            141              9511             2>>>1
Ionosphere      539              30059            2>>>1
Monks           160              971              2>>>1
Mushroom*       5529414          8546934          2>>>1
Ocrdigits*      147913204        N/A              2>>>1
Pendigits*      99421566         N/A              2>>>1
Segment*        4756453          6217529          2>>1
Vote            5311             37684            2>>>1
Wine            192              12526            2>>>1
Zoo             6610             42280            2>>>1










6.3.3. Comparison of the Network Structures in Decision Nodes


We must also decide which type of neural network to train at the decision nodes of a tree, so we need to find out which type performs best. We compare three types of networks: the linear perceptron, the multilayer perceptron (a nonlinear method) and a hybrid of the two (with F-test). The three networks are compared in terms of accuracy, node size and learning time. Accuracy results are shown in Table 6.3.3.1 and Figure 6.3.3.1. Node results are shown in Table 6.3.3.2 and Figure 6.3.3.2. Learning time results are shown in Table 6.3.3.3 and Figures 6.3.3.3, 6.3.3.4 and 6.3.3.5.

The results of the linear neural network methods can be divided into two groups: data sets with two classes and data sets with more than two classes. If a data set has two classes that are not linearly separable, accuracy can be very low; if they are linearly separable, the results can be very good, as in the Breast data set. Nonlinear network models achieve higher accuracy on such data sets. More generally, the nonlinear model outperforms the linear model in three of the 20 data sets, and the hybrid model outperforms the linear model in two. In four data sets (Ocrdigits, Dermatology, Zoo and Segment), the nonlinear model gives good results but does not always converge, so its results on these data sets have larger variance.

Looking at the node results, the nonlinear model is better than the linear model in two data sets and better than the hybrid model in two data sets. A tree for a data set with c classes must have at least 2c-1 nodes (c leaves and c-1 decision nodes) so that each class can have its own leaf. The nonlinear model comes closest to this optimum number of nodes. In general, the node sizes of the models are ordered as Linear > Hybrid > Nonlinear. In some data sets the hybrid model is worse than the other two in terms of node size, which is mainly due to the deviation of the results. The nonlinear model outperforms the hybrid model in four data sets and the linear model in two data sets. The hybrid and linear models each outperform the other in only one data set.
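The 2c-1 lower bound can be checked with a few lines of code. This is a minimal sketch that assumes a binary tree in which every decision node has exactly two children; the function name is ours.

```python
def min_tree_nodes(num_classes: int) -> int:
    """Smallest binary tree that gives every class its own leaf.

    Such a tree has num_classes leaves and num_classes - 1 decision
    nodes, hence 2 * num_classes - 1 nodes in total.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    leaves = num_classes
    decision_nodes = num_classes - 1
    return leaves + decision_nodes


for c in (2, 3, 10):
    print(c, min_tree_nodes(c))
# prints:
# 2 3
# 3 5
# 10 19
```

A node count of 3.00 for a two-class data set or 5.00 for a three-class data set in the node tables therefore means the tree has reached this minimum.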

In terms of learning time, the linear model performs best, as expected. Comparing times gives the ordering Hybrid > Nonlinear > Linear. In some data sets, however, the linear model takes longer to train than the nonlinear model because of the large number of nodes in the linear tree and the large number of instances in the data set.



TABLE 6.3.3.1 Accuracy results for different network models

Data set name   ID-LPE           ID-MLPE          ID-Hybrid-F      Significance
Breast          96.60 ± 0.61     96.77 ± 0.91     96.62 ± 0.55
Bupa            63.53 ± 2.76     63.24 ± 4.31     63.42 ± 2.57
Car             89.48 ± 4.01     96.86 ± 2.30     94.51 ± 1.15
Cylinder        70.21 ± 4.48     70.35 ± 9.56     71.31 ± 1.74
Dermatology     85.74 ± 7.06     87.81 ± 13.59    94.54 ± 4.67
Ecoli           82.62 ± 4.06     80.12 ± 5.12     83.10 ± 4.19
Flare           88.36 ± 2.37     87.67 ± 2.56     88.11 ± 2.43
Glass           54.95 ± 7.83     58.04 ± 13.30    55.05 ± 9.72
Hepatitis       84.13 ± 2.86     83.74 ± 2.43     83.74 ± 3.41
Horse           82.07 ± 3.48     84.67 ± 2.64     82.66 ± 2.58
Iris            77.60 ± 15.70    92.67 ± 3.57     92.67 ± 3.28     2>1,3>1
Ionosphere      87.80 ± 2.18     87.52 ± 2.11     87.80 ± 2.15
Monks           66.34 ± 1.87     66.99 ± 2.17     66.39 ± 1.85
Mushroom        99.95 ± 0.03     99.99 ± 0.02     99.96 ± 0.03     2>1
Ocrdigits       93.87 ± 0.92     83.90 ± 10.22    92.79 ± 2.20
Pendigits       91.94 ± 4.16     91.35 ± 6.55     90.82 ± 9.62
Segment         79.76 ± 11.58    80.35 ± 12.36    81.77 ± 12.97
Vote            94.71 ± 1.05     95.58 ± 1.72     94.71 ± 1.13
Wine            87.75 ± 12.62    95.96 ± 2.13     96.07 ± 2.07     2>1,3>1
Zoo             79.38 ± 8.10     85.33 ± 11.86    86.93 ± 5.39




TABLE 6.3.3.2 Node results for different network models

Data set name   ID-LPE           ID-MLPE          ID-Hybrid-F      Significance
Breast          3.00 ± 0.00      3.00 ± 0.00      3.00 ± 0.00
Bupa            4.60 ± 1.84      3.80 ± 1.03      4.40 ± 1.90
Car             7.40 ± 0.84      6.60 ± 1.26      7.60 ± 0.97
Cylinder        8.40 ± 1.90      6.40 ± 1.65      8.80 ± 1.75      3>>2
Dermatology     8.80 ± 1.48      9.60 ± 2.32      11.20 ± 1.14     3>1
Ecoli           10.80 ± 2.90     8.80 ± 2.20      10.60 ± 2.27
Flare           3.20 ± 2.20      2.20 ± 1.03      3.00 ± 1.33
Glass           10.20 ± 4.64     7.20 ± 3.71      11.00 ± 5.50
Hepatitis       3.00 ± 0.00      3.00 ± 0.00      3.00 ± 0.00
Horse           5.00 ± 1.63      4.00 ± 1.41      5.60 ± 2.84
Iris            4.00 ± 1.05      5.00 ± 0.00      5.00 ± 0.00      2>>1,3>>1
Ionosphere      3.80 ± 1.03      3.80 ± 1.03      4.00 ± 1.05
Monks           3.00 ± 0.00      3.00 ± 0.00      3.00 ± 0.00
Mushroom        3.00 ± 0.00      3.00 ± 0.00      3.00 ± 0.00
Ocrdigits       34.80 ± 4.94     18.40 ± 3.53     25.40 ± 3.75     1>3>>2
Pendigits       30.40 ± 6.40     17.60 ± 1.35     23.40 ± 5.80     1>>>2,3>>2
Segment         16.60 ± 6.65     11.60 ± 2.12     14.40 ± 2.84
Vote            4.20 ± 1.93      3.00 ± 0.00      4.20 ± 1.93
Wine            4.40 ± 0.97      5.00 ± 0.00      5.00 ± 0.00
Zoo             8.80 ± 1.75      10.60 ± 2.80     12.40 ± 1.90     3>1

TABLE 6.3.3.3 Learning time results for different network models

Data set name   ID-LPE           ID-MLPE          ID-Hybrid-F      Significance
Breast          50               70               301              3>>>1,3>>>2
Bupa            31               31               153              3>>>2>>>1
Car             15216            21618            911151           3>>>2>>>1
Cylinder        192              10215            36645            3>>>2>>>1
Dermatology     429              12219            39925            3>>>2>>>1
Ecoli           5715             5013             21938            3>>>1,3>>>2
Flare           94               187              6726             3>>2>>1
Glass           339              278              13728            3>>>1,3>>>2
Hepatitis       10               30               100              3>>>1,3>>>2
Horse           142              9720             32789            3>2>>>1
Iris            30               30               141              3>>>1,3>>>2
Ionosphere      41               142              539              3>>>2>>>1
Monks           30               30               160              3>>>1,3>>>2
Mushroom        628204           1858270          5529414          3>>>2>>>1
Ocrdigits*      8035757          109931402        147913204        3>>>2>1
Pendigits*      183403319        84731742         99421566         3>>>1>>2
Segment         937103           927126           4756453          3>>>1,3>>>2
Vote            61               132              5311             3>>2>>1
Wine            41               41               192              3>>>1,3>>>2
Zoo             102              184              6610             3>>>2>>>1










