
# Multivariate decision trees for machine learning


## 6.4.Results for LDA

For the rest of these results, the definition given in Table 6.4.1 applies.

TABLE 6.4.1 Definition of LDA-based methods

| Name | Class Separation | Pruning | PCA | PCA percentage |
|---|---|---|---|---|
| ID-LDA | Exchange | Pre-pruning | Always | 90% |
| ID-LDA-R | Exchange | Pre-pruning | If Required | 90% |
| ID-LDA-R99 | Exchange | Pre-pruning | If Required | 99% |

### 6.4.1.Effects of PCA on the Results

Chapter 5 showed that PCA must be used to solve the singular covariance matrix problem. There are, however, data sets where PCA is not needed in some nodes because the covariance matrix is invertible in those nodes. We therefore ran the experiments on those data sets twice: once applying PCA always (ID-LDA) and once applying PCA only when it is required (ID-LDA-R). In this section we compare the two sets of results to find out whether PCA degrades performance because of the 10% loss in variance. The results are shown in Table 6.4.1.1 and Figure 6.4.1.1 for accuracy, in Table 6.4.1.2 and Figure 6.4.1.2 for tree size, and in Table 6.4.1.3 and Figure 6.4.1.3 for learning time. Data sets marked with an asterisk are those in which PCA is never required.
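
The "PCA if required" rule can be pictured as a per-node test of whether the covariance matrix is invertible. The following is a minimal sketch under stated assumptions, not the implementation used in these experiments: the names `is_singular` and `choose_preprocessing`, the tolerance `tol`, and the use of Gaussian elimination as the singularity test are all illustrative choices.

```python
def is_singular(matrix, tol=1e-10):
    """Return True if a square matrix is numerically singular.

    Uses Gaussian elimination with partial pivoting; a pivot whose
    absolute value is below `tol` is treated as zero. The absolute
    tolerance is an assumption and depends on the scale of the data.
    """
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]  # work on a copy
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot_row = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot_row][col]) < tol:
            return True
        a[col], a[pivot_row] = a[pivot_row], a[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return False


def choose_preprocessing(covariance):
    """Hypothetical node-level rule: apply PCA only when it is required."""
    return "pca" if is_singular(covariance) else "none"


# Two perfectly correlated features give a singular covariance matrix.
print(choose_preprocessing([[1.0, 2.0], [2.0, 4.0]]))
# Two uncorrelated features give an invertible one.
print(choose_preprocessing([[2.0, 0.0], [0.0, 3.0]]))
```

Under this reading, ID-LDA would skip the test and always return `"pca"`, while ID-LDA-R would apply the test at every node.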

The accuracy results show that PCA causes a decrease in performance. In three of the five data sets where PCA is never required, accuracy drops significantly when PCA is applied anyway. In the remaining data sets, where PCA is required, accuracy does not change significantly.

TABLE 6.4.1.1 Accuracy results for ID-LDA and ID-LDA-R

| Data set name | ID-LDA | ID-LDA-R | Significance |
|---|---|---|---|
| Breast* | 96.65±0.66 | 95.85±0.72 | |
| Bupa* | 57.28±3.23 | 67.42±2.97 | 2>>>1 |
| Ecoli | 83.10±2.50 | 83.69±3.58 | |
| Glass | 57.85±3.67 | 55.51±4.43 | |
| Iris* | 82.67±5.52 | 97.20±1.47 | 2>>>1 |
| Monks* | 66.34±1.93 | 74.31±2.26 | 2>>1 |
| Wine* | 94.04±3.18 | 96.07±2.66 | |
| Zoo | 80.79±6.97 | 82.56±5.62 | |

TABLE 6.4.1.2 Node results for ID-LDA and ID-LDA-R

| Data set name | ID-LDA | ID-LDA-R | Significance |
|---|---|---|---|
| Breast* | 8.00±1.05 | 7.20±0.63 | |
| Bupa* | 6.40±4.90 | 8.20±1.93 | |
| Ecoli | 20.20±4.34 | 17.60±4.81 | |
| Glass | 20.60±5.64 | 25.60±3.78 | |
| Iris* | 8.00±2.16 | 5.40±0.84 | 1>>>2 |
| Monks* | 3.00±0.00 | 7.20±2.39 | 2>1 |
| Wine* | 7.40±1.84 | 5.40±0.84 | |
| Zoo | 12.60±2.07 | 12.60±2.07 | |

For tree size, ID-LDA-R is significantly better than ID-LDA on one data set (Iris), and ID-LDA is significantly better than ID-LDA-R on one data set (Monks). On Monks, ID-LDA cannot find a further split after the first one, so its tree is smaller. These results also affect learning time: on the Iris data set, where the tree is significantly smaller with ID-LDA-R, learning time is also significantly lower.

When PCA is applied, the number of retained dimensions usually decreases from the root node toward the leaf nodes. For example, on the Ecoli data set we need 14 eigenvectors to represent the data at the root node, but only 5 eigenvectors at a leaf node.

TABLE 6.4.1.3 Learning time results for ID-LDA and ID-LDA-R

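| Data set name | ID-LDA | ID-LDA-R | Significance |
|---|---|---|---|
| Breast* | 3±1 | 2±0 | |
| Bupa* | 1±1 | 1±0 | |
| Ecoli | 6±2 | 6±2 | |
| Glass | 5±1 | 7±1 | |
| Iris* | 1±1 | 0±0 | 1>>2 |
| Monks* | 1±1 | 1±0 | |
| Wine* | 1±0 | 1±0 | |
| Zoo | 2±0 | 2±0 | |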

### 6.4.2.Effects of PCA Percentage on the Results

As Section 6.4.1 shows, LDA performance decreases when PCA is applied because of the 10% loss in variance (proportion of variance retained = 0.90). We also ran experiments with a second level, 99% (proportion of variance retained = 0.99), and compared the two. The results are shown in Table 6.4.2.1 and Figure 6.4.2.1 for accuracy, in Table 6.4.2.2 and Figure 6.4.2.2 for tree size, and in Table 6.4.2.3, Figure 6.4.2.3 and Figure 6.4.2.4 for learning time.
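
To illustrate how the retained proportion of variance determines how many eigenvectors are kept, here is a small sketch. It assumes the usual rule of keeping the smallest number of leading eigenvalues whose cumulative sum reaches the requested proportion of the total. The function name `components_needed` and the eigenvalues are hypothetical and are not taken from the experiments.

```python
def components_needed(eigenvalues, proportion):
    """Smallest k such that the k largest eigenvalues explain at least
    `proportion` of the total variance."""
    if not 0.0 < proportion <= 1.0:
        raise ValueError("proportion must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= proportion:
            return k
    # Rounding can leave the ratio just below 1.0; keep everything then.
    return len(ordered)


# Hypothetical eigenvalues of a covariance matrix (illustrative only).
example = [5.0, 3.0, 1.2, 0.5, 0.25, 0.05]
k90 = components_needed(example, 0.90)
k99 = components_needed(example, 0.99)
print(k90, k99)
```

With these made-up eigenvalues the 0.99 level keeps more eigenvectors than the 0.90 level, which is the trade-off between information loss and dimensionality examined in this section.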

TABLE 6.4.2.1 Accuracy results for ID-LDA-R and ID-LDA-R99

| Data set name | ID-LDA-R | ID-LDA-R99 | Significance |
|---|---|---|---|
| Car | 70.02±1.75 | 92.09±1.07 | 2>>>1 |
| Cylinder | 67.39±2.39 | 69.80±3.01 | |
| Dermatology | 94.75±1.91 | 96.17±1.59 | |
| Ecoli | 83.69±3.58 | 83.75±2.53 | |
| Flare | 88.05±2.39 | 88.17±2.83 | |
| Glass | 55.51±4.43 | 57.29±4.16 | |
| Hepatitis | 83.61±2.12 | 82.06±5.60 | |
| Horse | 72.39±2.62 | 81.09±1.64 | 2>>1 |
| Ironosphere | 86.38±2.68 | 91.11±2.22 | |
| Mushroom | 94.15±0.83 | 98.25±0.57 | 2>>>1 |
| Ocrdigits | 89.19±0.92 | 94.59±0.49 | 2>>>1 |
| Pendigits | 91.99±0.94 | 95.52±0.44 | 2>>>1 |
| Segment | 82.19±2.35 | 90.31±1.20 | 2>>>1 |
| Vote | 90.85±2.35 | 94.85±2.17 | 2>1 |
| Zoo | 82.56±5.62 | 81.41±7.25 | |

TABLE 6.4.2.2 Node results for ID-LDA-R and ID-LDA-R99

| Data set name | ID-LDA-R | ID-LDA-R99 | Significance |
|---|---|---|---|
| Car | 1.00±0.00 | 12.00±2.54 | 2>>>1 |
| Cylinder | 10.80±3.33 | 16.40±8.95 | |
| Dermatology | 17.00±2.11 | 12.80±1.48 | |
| Ecoli | 17.60±4.81 | 20.00±2.71 | |
| Flare | 5.20±3.71 | 5.60±3.13 | |
| Glass | 25.60±3.78 | 26.20±4.44 | |
| Hepatitis | 4.60±3.10 | 8.60±3.10 | |
| Horse | 10.20±3.29 | 16.80±4.05 | |
| Ironosphere | 5.40±1.84 | 11.60±2.67 | |
| Mushroom | 17.40±3.63 | 19.20±4.85 | |
| Ocrdigits | 87.40±8.37 | 59.40±2.07 | 1>>2 |
| Pendigits | 80.80±3.71 | 89.00±6.25 | |
| Segment | 40.40±5.66 | 39.80±11.08 | |
| Vote | 9.20±3.05 | 9.80±2.53 | |
| Zoo | 12.60±2.07 | 11.80±1.93 | |

The accuracy results show a marked increase in accuracy when going from ID-LDA-R to ID-LDA-R99. In seven of the 20 data sets the increase is significant, and it is especially noticeable on the large data sets. (Table 6.4.2.1 omits the five data sets in which PCA is never required.)

On the Car data set, ID-LDA-R cannot find any split, whereas ID-LDA-R99 reaches an accuracy of about 92 percent. Its tree is therefore significantly larger with ID-LDA-R99 than with ID-LDA-R. On Ocrdigits the effect is the opposite: going from ID-LDA-R to ID-LDA-R99, accuracy increases while tree size decreases significantly. These results also affect learning time: ID-LDA-R has a significantly lower learning time on Car because it makes no split.

TABLE 6.4.2.3 Learning time results for ID-LDA-R and ID-LDA-R99

| Data set name | ID-LDA-R | ID-LDA-R99 | Significance |
|---|---|---|---|
| Car | 6±2 | 28±6 | 2>>1 |
| Cylinder | 34±14 | 74±57 | |
| Dermatology | 18±2 | 16±1 | |
| Ecoli | 6±2 | 6±1 | |
| Flare | 3±2 | 4±3 | |
| Glass | 7±1 | 7±1 | |
| Hepatitis | 1±1 | 2±1 | |
| Horse | 38±16 | 94±40 | |
| Ironosphere | 3±1 | 11±4 | |
| Mushroom | 1846±776 | 2533±1533 | |
| Ocrdigits | 3189±270 | 2288±108 | 1>>2 |
| Pendigits | 996±86 | 1211±91 | 2>1 |
| Segment | 154±15 | 148±40 | |
| Vote | 8±4 | 9±3 | |
| Zoo | 2±0 | 2±0 | |

### 6.4.3.Comparison of Different Linear Multivariate Techniques

In this section we compare three linear decision tree construction methods: CART (classification and regression trees), ID-LP (multivariate decision tree with a neural perceptron) and ID-LDA (multivariate decision tree with a linear discriminant, in its ID-LDA-R99 variant). The results are shown in Table 6.4.3.1, Table 6.4.3.2 and Figure 6.4.3.1 for accuracy, in Table 6.4.3.3, Table 6.4.3.4 and Figure 6.4.3.2 for tree size, and in Table 6.4.3.5, Table 6.4.3.6, Figure 6.4.3.3, Figure 6.4.3.4 and Figure 6.4.3.5 for learning time. For simplicity, the exchange method is used for both ID-LP and ID-LDA.

TABLE 6.4.3.1 Accuracy results for linear decision tree methods

| Data set name | CART | ID-LP | ID-LDA | Significance |
|---|---|---|---|---|
| Breast | 94.85±1.44 | 96.60±0.61 | 95.85±0.72 | |
| Bupa | 61.74±3.38 | 63.53±2.76 | 67.42±2.97 | |
| Car | 83.84±2.03 | 89.48±4.01 | 92.09±1.07 | 3>>>1 |
| Cylinder | 59.52±4.05 | 70.21±4.48 | 69.80±3.01 | 2>>1,3>>>1 |
| Dermatology | 80.87±4.56 | 85.74±7.06 | 96.17±1.59 | 3>>1 |
| Ecoli | 74.74±3.80 | 82.62±4.06 | 83.75±2.53 | 2>>1,3>1 |
| Flare | 81.55±3.60 | 88.36±2.37 | 88.17±2.83 | |
| Glass | 53.93±4.20 | 54.95±7.83 | 57.29±4.16 | |
| Hepatitis | 78.96±4.04 | 84.13±2.86 | 82.06±5.60 | 2>>1 |
| Horse | 76.96±3.02 | 82.07±3.48 | 81.09±1.64 | 2>>1 |
| Iris | 89.33±4.44 | 77.60±15.70 | 97.20±1.47 | 3>>1>>2 |
| Ironosphere | 86.84±4.03 | 87.80±2.18 | 91.11±2.22 | |
| Monks | 91.20±6.89 | 66.34±1.87 | 74.31±2.26 | 1>>3>>>2 |
| Mushroom | 93.45±1.75 | 99.95±0.03 | 98.25±0.57 | 2>>3>>1 |
| Ocrdigits | 81.35±2.08 | 93.87±0.92 | 94.59±0.49 | 3>>>1,2>>>1 |
| Pendigits | 87.10±2.91 | 91.94±4.16 | 95.52±0.44 | 3>>1 |
| Segment | 88.07±1.69 | 79.76±11.58 | 90.31±1.20 | |
| Vote | 90.30±3.17 | 94.71±1.05 | 94.85±2.17 | 2>1,3>>1 |
| Wine | 87.30±4.40 | 87.75±12.62 | 96.07±2.66 | 3>1 |
| Zoo | 69.92±9.69 | 79.38±8.10 | 81.41±7.25 | |

TABLE 6.4.3.2 Accuracy comparisons

| Method | CART | ID-LP | ID-LDA |
|---|---|---|---|
| CART | - | 2 | 1 |
| ID-LP | 7 | - | 1 |
| ID-LDA | 10 | 2 | - |
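
The counts in Table 6.4.3.2 can be recomputed from the Significance column of Table 6.4.3.1. The sketch below does so under two assumptions that are inferred from the tables rather than stated in the text: methods are numbered in column order (1 = CART, 2 = ID-LP, 3 = ID-LDA), and a chain such as `1>>3>>>2` is read transitively, so it also implies `1>2`. The number of `>` signs is ignored here. The names `parse_significance` and `tally_wins` are illustrative.

```python
import re
from collections import Counter
from itertools import combinations

# Non-empty Significance entries copied from Table 6.4.3.1.
SIGNIFICANCE = {
    "Car": "3>>>1",
    "Cylinder": "2>>1,3>>>1",
    "Dermatology": "3>>1",
    "Ecoli": "2>>1,3>1",
    "Hepatitis": "2>>1",
    "Horse": "2>>1",
    "Iris": "3>>1>>2",
    "Monks": "1>>3>>>2",
    "Mushroom": "2>>3>>1",
    "Ocrdigits": "3>>>1,2>>>1",
    "Pendigits": "3>>1",
    "Vote": "2>1,3>>1",
    "Wine": "3>1",
}

NAMES = {1: "CART", 2: "ID-LP", 3: "ID-LDA"}  # assumed numbering


def parse_significance(entry):
    """Turn an entry like '3>>1>>2' into {(3, 1), (3, 2), (1, 2)}."""
    pairs = set()
    for chain in entry.split(","):
        chain = chain.strip()
        if not chain:
            continue
        methods = [int(token) for token in re.split(r">+", chain)]
        # Every method beats all methods to its right in the chain.
        pairs.update(combinations(methods, 2))
    return pairs


def tally_wins(significance):
    """Count, per ordered pair, the data sets where the first method wins."""
    wins = Counter()
    for entry in significance.values():
        wins.update(parse_significance(entry))
    return wins


wins = tally_wins(SIGNIFICANCE)
for (better, worse), count in sorted(wins.items()):
    print(f"{NAMES[better]} > {NAMES[worse]}: {count}")
```

Read this way, each cell of the comparison table is the number of data sets on which the row method is significantly better than the column method, and the tally agrees with the six off-diagonal counts in Table 6.4.3.2.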

TABLE 6.4.3.3 Node results for linear decision tree methods

| Data set name | CART | ID-LP | ID-LDA | Significance |
|---|---|---|---|---|
| Breast | 11.60±2.67 | 3.00±0.00 | 7.20±0.63 | 3>>>2,1>>2 |
| Bupa | 43.20±3.82 | 4.60±1.84 | 8.20±1.93 | 1>>>3>>2 |
| Car | 29.00±3.40 | 7.40±0.84 | 12.00±2.54 | 1>>>2,1>>>3 |
| Cylinder | 45.00±4.90 | 8.40±1.90 | 16.40±8.95 | 1>>3,1>>>2 |
| Dermatology | 28.00±4.74 | 8.80±1.48 | 12.80±1.48 | 1>>3>2 |
| Ecoli | 34.00±5.01 | 10.80±2.90 | 20.00±2.71 | 1>3>>>2 |
| Flare | 33.80±6.20 | 3.20±2.20 | 5.60±3.13 | 1>>>3,1>>>2 |
| Glass | 42.40±4.12 | 10.20±4.64 | 26.20±4.44 | 1>>>3>>>2 |
| Hepatitis | 14.00±3.43 | 3.00±0.00 | 8.60±3.10 | 1>>>2 |
| Horse | 28.00±5.19 | 5.00±1.63 | 16.80±4.05 | 1>>>3>>>2 |
| Iris | 10.20±2.35 | 4.00±1.05 | 5.40±0.84 | 1>>3,1>>2 |
| Ironosphere | 16.40±3.78 | 3.80±1.03 | 11.60±2.67 | 1>>2,3>>>2 |
| Monks | 17.80±10.16 | 3.00±0.00 | 7.20±2.39 | 1>>>2 |
| Mushroom | 43.00±6.53 | 3.00±0.00 | 19.20±4.85 | 1>>3>>2 |
| Ocrdigits | 70.80±3.98 | 34.80±4.94 | 59.40±2.07 | 1>3>>>2 |
| Pendigits | 77.80±10.08 | 30.40±6.40 | 89.00±6.25 | 3>>>2,1>>>2 |
| Segment | 45.20±8.97 | 16.60±6.65 | 39.80±11.08 | 1>>2 |
| Vote | 17.20±5.29 | 4.20±1.93 | 9.80±2.53 | 1>>>2 |
| Wine | 9.40±2.27 | 4.40±0.97 | 5.40±0.84 | 1>2 |
| Zoo | 25.20±4.94 | 8.80±1.75 | 11.80±1.93 | 1>>>3,1>>>2 |

TABLE 6.4.3.4 Node comparisons

| Method | CART | ID-LP | ID-LDA |
|---|---|---|---|
| CART | - | 0 | 0 |
| ID-LP | 20 | - | 10 |
| ID-LDA | 12 | 0 | - |

TABLE 6.4.3.5 Learning time results for linear decision tree methods

 Data set name CART ID-LP ID-LDA Significance Breast 10717 50 20 1>>>2>>>3 Bupa 25223 31 10 1>>>2>>3 Car 1178148 15216 286 1>>>2>>>3 Cylinder 4589343 192 7457 1>>>2,1>>>3 Dermatology 858170 429 161 1>>>2>>3 Ecoli 22125 5715 61 1>>>2>>>3 Flare 1032203 94 43 1>>>2,1>>>3 Glass 32025 339 71 1>>>2>>>3 Hepatitis 20947 10 21 1>>>2,1>>>3 Horse 34811101 142 9440 1>>3>>2 Iris 3111 30 00 1>>2>>>3 Ironosphere 54494 41 114 1>>>3>2 Monks 12661 30 10 1>>2>>>3 Mushroom 336132942 628204 25331533 1>>>2,1>>>3 Ocrdigits 9148713 8035757 2288108 2>>>3,1>>>3 Pendigits 3311350 183403319 121191 2>>>1>>>3 Segment 1212170 937103 14840 1>2>>>3, Vote 805167 61 93 1>>>2,1>>>3 Wine 8426 41 10 1>>>2>>3 Zoo 45361 102 20 1>>>2>>>3

TABLE 6.4.3.6 Learning time comparisons

| Method | CART | ID-LP | ID-LDA |
|---|---|---|---|
| CART | - | 1 | 0 |
| ID-LP | 18 | - | 2 |
| ID-LDA | 20 | 13 | - |

If we compare the three linear methods in terms of accuracy, node size and learning time, we see that:

• Accuracy: ID-LP=ID-LDA>CART.

• Node Size: CART>ID-LDA>ID-LP.

• Learning Time: CART>ID-LP>ID-LDA.

In terms of accuracy, CART outperforms ID-LP in those data sets where ID-LP does not always converge. On the Monks data set, CART outperforms ID-LP and ID-LDA quite significantly.
