
Multivariate decision trees for machine learning



6.2. Results for Classification and Regression Trees


The definitions given in Table 6.2.1 apply to all of the results that follow.

TABLE 6.2.1 Definition of tree-based methods



Name of the Method   Uni/Multi   Impurity Measure   Pruning       Feature Selection
ID3                  Uni         Information Gain   Pre-pruning   No
CART                 Multi       Information Gain   Pre-pruning   No
FSCART               Multi       Information Gain   Pre-pruning   Yes
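All three methods in Table 6.2.1 use information gain as the impurity measure. As a reminder of what that quantity is, here is a minimal sketch that assumes Shannon entropy in bits and a split that partitions the parent's labels into branches. The function names `entropy` and `information_gain` are illustrative and are not taken from the original implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, branches):
    """Parent entropy minus the size-weighted entropy of the branches.

    `branches` is assumed to be a partition of `labels` produced by a split.
    """
    n = len(labels)
    remainder = sum(len(b) / n * entropy(b) for b in branches)
    return entropy(labels) - remainder


parent = ["a", "a", "b", "b"]
# A split that separates the classes perfectly recovers the full parent entropy.
print(information_gain(parent, [["a", "a"], ["b", "b"]]))
# A split that leaves both branches as mixed as the parent gains nothing.
print(information_gain(parent, [["a", "b"], ["a", "b"]]))
```

The same measure can score a univariate test or a multivariate one, since it only looks at how the labels are distributed across the branches.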

In this section, the multivariate method CART and the univariate method ID3 are compared. Feature selection is also applied to CART (FSCART) to see whether it changes accuracy or node size. The accuracy results are shown in Table 6.2.2 and Figure 6.2.1, the node results in Table 6.2.3 and Figure 6.2.2, and the learning time results in Table 6.2.4, Figure 6.2.3 and Figure 6.2.4.

ID3 is statistically significantly better than CART in six out of 20 data sets. ID3 is better than FSCART in two cases with 95% confidence, and FSCART is better than ID3 in one case with 99% confidence. So we can say that none of the three methods is clearly the best.
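The counts above can be read off the Significance column of Table 6.2.2. The sketch below tallies them, under two assumptions that the text does not state explicitly: the methods are numbered in column order (1 = ID3, 2 = CART, 3 = FSCART), and the number of `>` characters only marks the strength of the result, which is stored but not mapped to a confidence level here. The helper names are illustrative.

```python
import re
from collections import Counter

# Non-empty Significance cells of Table 6.2.2 (accuracy).
ACCURACY_SIGNIFICANCE = {
    "Car": "1>3,2>>3",
    "Cylinder": "1>2",
    "Dermatology": "1>>2",
    "Ecoli": "1>>3",
    "Horse": "1>>2",
    "Ionosphere": "1>3",
    "Mushroom": "1>>2",
    "Pendigits": "3>>>1",
    "Segment": "3>2",
    "Vote": "1>>2",
    "Zoo": "1>>2,1>>3",
}

# Assumed numbering: methods are numbered in table column order.
METHODS = {1: "ID3", 2: "CART", 3: "FSCART"}


def parse_entry(entry):
    """Turn '1>2,1>>3' or a chain like '3>>>2>>1' into (left, right, strength) triples."""
    triples = []
    for chain in entry.split(","):
        # Tokens alternate: method, arrows, method, arrows, method, ...
        tokens = re.findall(r"\d+|>+", chain)
        for i in range(0, len(tokens) - 2, 2):
            triples.append((int(tokens[i]), int(tokens[i + 2]), len(tokens[i + 1])))
    return triples


def count_pairs(column):
    """Count, per ordered pair of methods, the data sets with a significant result."""
    counts = Counter()
    for entry in column.values():
        for left, right, _strength in parse_entry(entry):
            counts[(METHODS[left], METHODS[right])] += 1
    return counts


counts = count_pairs(ACCURACY_SIGNIFICANCE)
print(counts[("ID3", "CART")])  # 6, the "six out of 20" in the text
```

Because `parse_entry` also handles chained entries, the same helper can be applied to the Significance columns of the node and learning time tables.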

Concerning tree size, the multivariate techniques FSCART and CART generally produce smaller trees than ID3, and the difference is significant in six data sets. Note, however, that a CART node is internally more complex than an ID3 node.
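The difference in node complexity can be made concrete with a small sketch. It assumes numeric features and binary splits: a univariate node tests one feature against a threshold, while a multivariate node tests a linear combination of all features. The class names and the `parameter_count` method are illustrative and do not come from the original implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UnivariateNode:
    """Axis-aligned test on a single feature: x[feature] <= threshold."""
    feature: int
    threshold: float

    def goes_left(self, x: Sequence[float]) -> bool:
        return x[self.feature] <= self.threshold

    def parameter_count(self) -> int:
        # One feature index and one threshold, whatever the input dimension.
        return 2


@dataclass
class MultivariateNode:
    """Oblique test on a linear combination: sum(w_i * x_i) <= threshold."""
    weights: Sequence[float]
    threshold: float

    def goes_left(self, x: Sequence[float]) -> bool:
        return sum(w * v for w, v in zip(self.weights, x)) <= self.threshold

    def parameter_count(self) -> int:
        # One weight per feature plus the threshold.
        return len(self.weights) + 1


x = (1.0, 3.0)
uni = UnivariateNode(feature=0, threshold=2.0)
multi = MultivariateNode(weights=(1.0, 1.0), threshold=3.0)
print(uni.goes_left(x), uni.parameter_count())      # True 2
print(multi.goes_left(x), multi.parameter_count())  # False 3
```

A multivariate tree with fewer nodes can therefore still store more parameters than a larger univariate tree, which is why node counts alone understate the cost of CART and FSCART.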

The learning time of the univariate technique ID3 is significantly shorter than that of the multivariate techniques CART and FSCART.

Feature selection generally improves accuracy and reduces tree size, but it increases learning time significantly.

TABLE 6.2.2 Accuracy results for ID3 and CART



Data set name   ID3 (1)       CART (2)      FSCART (3)    Significance
Breast          94.11±1.24    94.85±1.44    94.65±1.23
Bupa            62.26±5.33    61.74±3.38    61.50±2.48
Car             80.97±1.26    83.84±2.03    78.34±7.40    1>3,2>>3
Cylinder        68.50±2.22    59.52±4.05    N/A           1>2
Dermatology     92.84±2.37    80.87±4.56    83.72±7.66    1>>2
Ecoli           78.10±3.57    74.74±3.80    76.90±3.89    1>>3
Flare           85.26±2.03    81.55±3.60    85.75±3.89
Glass           60.65±5.97    53.93±4.20    58.13±6.58
Hepatitis       78.44±3.71    78.96±4.04    78.58±3.72
Horse           87.55±1.98    76.96±3.02    N/A           1>>2
Iris            93.87±2.75    89.33±4.44    90.40±4.48
Ionosphere      87.63±3.15    86.84±4.03    84.78±2.78    1>3
Monks           92.27±10.15   91.20±6.89    82.87±8.09
Mushroom        99.70±0.06    93.45±1.75    N/A           1>>2
Ocrdigits       78.40±1.47    81.35±2.08    N/A
Pendigits       85.73±1.01    87.10±2.91    91.47±0.86    3>>>1
Segment         91.08±1.16    88.07±1.69    92.46±1.71    3>2
Vote            94.94±1.06    90.30±3.17    90.44±3.88    1>>2
Wine            88.65±3.72    87.30±4.40    93.03±3.62
Zoo             92.06±4.80    69.92±9.69    69.33±8.93    1>>2,1>>3

TABLE 6.2.3 Node results for ID3 and CART



Data set name   ID3 (1)       CART (2)      FSCART (3)    Significance
Breast          17.00±2.11    11.60±2.67    10.80±2.39    1>2,1>3
Bupa            53.40±5.48    43.20±3.82    40.60±4.20    1>2,1>>3
Car             25.40±0.70    29.00±3.40    30.00±4.14
Cylinder        54.10±5.90    45.00±4.90    N/A
Dermatology     20.40±2.67    28.00±4.74    20.80±3.46    3>>1
Ecoli           33.80±2.70    34.00±5.01    31.40±3.24
Flare           37.90±4.51    33.80±6.20    25.80±9.48    1>>2
Glass           38.20±5.90    42.40±4.12    38.20±4.34
Hepatitis       19.60±3.78    14.00±3.43    11.60±1.90    1>>3
Horse           55.80±5.92    28.00±5.19    N/A           1>>2
Iris            8.40±1.35     10.20±2.35    8.20±1.40
Ionosphere      19.20±3.05    16.40±3.78    16.00±3.68
Monks           25.40±13.53   17.80±10.16   11.40±2.27    1>2,1>>>3
Mushroom        23.00±0.00    43.00±6.53    N/A           2>>>1
Ocrdigits       74.40±4.01    70.80±3.98    N/A           1>>2
Pendigits       81.80±5.51    77.80±10.08   71.00±5.16
Segment         41.80±3.79    45.20±8.97    36.80±4.57
Vote            18.20±3.16    17.20±5.29    18.20±5.75
Wine            10.40±1.35    9.40±2.27     9.00±1.33
Zoo             15.00±1.89    25.20±4.94    16.40±2.32    2>>1,2>>>3

TABLE 6.2.4 Learning time results for ID3 and CART



Data set name   ID3 (1)       CART (2)      FSCART (3)    Significance
Breast          2±0           107±17        482±122       3>>>2>>>1
Bupa            3±1           252±23        829±119       3>>>2>>>1
Car             5±0           1178±148      13056±1661    3>>>2>>>1
Cylinder        10±2          4589±343      N/A           2>>>1
Dermatology     3±0           858±170       10553±1748    3>>>2>>>1
Ecoli           3±0           221±25        859±76        3>>>2>>>1
Flare           2±0           1032±203      8892±3203     3>>2>>>1
Glass           3±0           320±25        1481±154      3>>>2>>>1
Hepatitis       1±0           209±47        1709±265      3>>>2>>>1
Horse           4±0           3481±1101     N/A           2>>1
Iris            0±0           31±11         69±17         3>2>>1
Ionosphere      39±7          544±94        8664±1662     3>>>2>>>1
Monks           2±1           126±61        273±55        3>>>2>>1
Mushroom        113±33        33613±2942    N/A           2>>>1
Ocrdigits       207±9         9148±713      N/A           2>>>1
Pendigits       476±22        3311±350      24544±1374    3>>>2>>>1
Segment         345±10        1212±170      9173±905      3>>>2>>>1
Vote            1±0           805±167       13058±3920    3>>>2>>>1
Wine            1±0           84±26         566±147       3>>>2>>>1
Zoo             1±0           453±61        2088±375      3>>>2>>>1








