1) Clustering algorithms:
a) OKM
i) Preprocessing method: SVD, 2 dimensions

Figure 2: BIC Values when applying OKM (SVD reduced to 2 dimensions) on the colon dataset

Figure 3: Comparison of the internal (BIC) and external (Jaccard) criteria of the colon dataset (OKM)
ii) Preprocessing method: SVD, 3 dimensions

Figure 4: BIC Values when applying OKM (SVD reduced to 3 dimensions) on the colon dataset

Figure 5: BIC Values when applying OKM (2-5 clusters) on the colon dataset

Figure 6: Comparison of the internal (BIC) and external (Jaccard) criteria of the colon dataset (OKM)
iii) Preprocessing method: PCA, 2 dimensions[1]

Figure 7: BIC Values when applying OKM (PCA reduced to 2 dimensions) on the colon dataset

Figure 8: Comparison of the internal (BIC) and external (Jaccard) criteria of the colon dataset (OKM)
iv) Preprocessing method: PCA, 3 dimensions

Figure 9: BIC Values when applying OKM (PCA reduced to 3 dimensions) on the colon dataset

Figure 10: Comparison of the internal (BIC) and external (Jaccard) criteria of the colon dataset (OKM)
v) Comparison of preprocessing methods
b) OQC
i) Preprocessing method: SVD, 3 dimensions

Figure 12: Comparison of the internal (BIC) and external (Jaccard) criteria of the colon dataset (OQC)

Figure 13: Comparison of the standard and optimized version of the KM and QC algorithms

Figure 14: Comparison of the various clustering algorithms (results according to Sharan and Shamir, 2003 and Getz et al., 2000)

| Method | Jaccard |
| K-Means (raw data) | 0.345 |
| QC (raw data) | 0.4 |
| K-Means (Preprocessing & BIC) | 0.678 |
| QC (Preprocessing & BIC) | 0.715 |
| CLICK (Sharan & Shamir, 2003) | 0.64 |
| CAST (Sharan & Shamir, 2003; Ben-Dor et al., 1999) | 0.682 |
| CTWC (Getz et al., 2000, and [2]) | 0.508 |
Table 1. Comparison of the clustering performance on the colon dataset
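
The Jaccard values in Table 1 are an external criterion: they compare a clustering against the known sample classes. The report does not spell out the formula, so the following is only a minimal sketch that assumes the common pair-counting definition, J = n11 / (n11 + n10 + n01), for non-overlapping cluster assignments. The function name `pair_jaccard` and the toy labels are hypothetical and are not taken from the colon dataset.

```python
from itertools import combinations


def pair_jaccard(predicted, truth):
    """Pair-counting Jaccard coefficient between two hard clusterings.

    n11: pairs of samples placed together in both clusterings
    n10: pairs together in the true classes only
    n01: pairs together in the predicted clusters only
    """
    n11 = n10 = n01 = 0
    for i, j in combinations(range(len(truth)), 2):
        same_pred = predicted[i] == predicted[j]
        same_true = truth[i] == truth[j]
        if same_pred and same_true:
            n11 += 1
        elif same_true:
            n10 += 1
        elif same_pred:
            n01 += 1
    total = n11 + n10 + n01
    # Convention chosen for this sketch: if no pair is co-clustered in
    # either labelling, the two clusterings agree trivially.
    return n11 / total if total else 1.0


# Hypothetical toy labels (not the colon data): one sample is misassigned.
truth = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 1, 1]
score = pair_jaccard(predicted, truth)
```

A score of 1 means every co-clustered pair agrees with the true classes; pairs that are separated in both labellings do not enter the coefficient at all.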
[1] From the MATLAB documentation: “princomp centers X by subtracting off column means, but does not rescale the columns of X. To perform principal components analysis with standardized variables, that is, based on correlations, use princomp(zscore(X)).”
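
The distinction drawn in note [1] can be illustrated with a small sketch. It is written in plain Python rather than MATLAB and does not call `princomp`; it only shows that the covariance matrix of z-scored variables equals the correlation matrix of the original variables, which is why PCA on standardized data is "based on correlations". The helper names and the toy data are hypothetical, and the data are stored as a list of columns (one list per variable).

```python
from statistics import mean, stdev


def center(columns):
    """Subtract each column's mean, without rescaling."""
    return [[v - mean(col) for v in col] for col in columns]


def zscore(columns):
    """Center each column and divide by its sample standard deviation."""
    return [[(v - mean(col)) / stdev(col) for v in col] for col in columns]


def covariance_matrix(columns):
    """Sample covariance (n - 1 denominator) between every pair of columns."""
    centered = center(columns)
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


# Hypothetical toy data: two variables on very different scales.
X = [[1.0, 2.0, 3.0, 4.0, 5.0],
     [100.0, 300.0, 200.0, 500.0, 400.0]]

# Centering only: the large-scale variable dominates the covariance matrix.
cov_centered = covariance_matrix(center(X))

# Standardizing: unit variances on the diagonal, correlations off it.
cov_standardized = covariance_matrix(zscore(X))
```

With centering alone, the second variable's variance is four orders of magnitude larger than the first's, so it would dominate the leading principal component. After standardization both variables contribute on an equal footing.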