COMPARISON OF THE MODEL-BASED METHOD WITH THE K-MEANS METHOD IN CLUSTER ANALYSIS
Keywords: BIC, EM algorithm, K-means clustering method, model-based clustering method
Abstract
The K-means method is a clustering method that groups objects solely on the basis of a distance measure between the observed objects, without considering statistical aspects. Model-based clustering is a method that rests on a statistical foundation, namely the maximum likelihood criterion. The model has several variants with differing geometric characteristics, obtained by means of Gaussian components. The data are partitioned using the EM (expectation-maximization) algorithm, and the best model is then selected using the Bayesian Information Criterion (BIC). This research aimed to compare the grouping results of model-based clustering and K-means clustering. The results showed that model-based clustering was more effective than K-means at separating overlapping groups.
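The workflow described in the abstract (EM fitting of Gaussian components, BIC-based model choice, comparison with K-means) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the data, software, or model family used in the study: the toy one-dimensional data (one wide and one narrow group that overlap), the helper names `fit_gmm_em`, `kmeans_1d`, `quantile_starts`, and `accuracy`, the fixed iteration counts, and the BIC convention 2 log L - p log n, under which larger values are better.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a univariate normal with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def quantile_starts(data, k):
    """Deterministic starting centers: k evenly spaced sample quantiles."""
    srt = sorted(data)
    return [srt[int((j + 0.5) * len(srt) / k)] for j in range(k)]


def responsibilities(data, weights, means, variances):
    """E-step: posterior probability of each component for each observation."""
    resp = []
    for x in data:
        p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(p)
        resp.append([pj / total for pj in p])
    return resp


def fit_gmm_em(data, k, iters=100):
    """Fit a 1-D Gaussian mixture with k unequal-variance components by EM."""
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    weights, means, variances = [1.0 / k] * k, quantile_starts(data, k), [var_all] * k
    for _ in range(iters):
        resp = responsibilities(data, weights, means, variances)
        for j in range(k):  # M-step: weighted updates of each component
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-3)  # variance floor (assumed value)
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )
    return weights, means, variances, loglik


def kmeans_1d(data, k, iters=100):
    """Lloyd's algorithm in one dimension using squared Euclidean distance."""
    centers = quantile_starts(data, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in data]
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in data]


def accuracy(labels, truth):
    """Two-group agreement rate, invariant to swapping the two labels."""
    agree = sum(a == b for a, b in zip(labels, truth)) / len(truth)
    return max(agree, 1.0 - agree)


# Toy data (assumed): a wide group and a narrow group whose ranges overlap.
random.seed(0)
data = [random.gauss(0.0, 2.0) for _ in range(300)] + [random.gauss(4.0, 0.5) for _ in range(300)]
truth = [0] * 300 + [1] * 300

# BIC for k = 1..3 components; each component has a mean and a variance,
# plus k - 1 free mixing weights, giving 3k - 1 parameters.
fits = {k: fit_gmm_em(data, k) for k in (1, 2, 3)}
bic = {k: 2.0 * fits[k][3] - (3 * k - 1) * math.log(len(data)) for k in fits}

# Hard assignments from the 2-component mixture versus K-means with k = 2.
w2, m2, v2, _ = fits[2]
gmm_labels = [r.index(max(r)) for r in responsibilities(data, w2, m2, v2)]
gmm_acc = accuracy(gmm_labels, truth)
kmeans_acc = accuracy(kmeans_1d(data, 2), truth)

print("BIC by number of components:", bic)
print("mixture accuracy:", gmm_acc, "K-means accuracy:", kmeans_acc)
```

The sketch uses unequal group spreads on purpose: K-means places its boundary midway between the two centers, while the mixture model can let the narrow component claim only a short interval, which is the kind of situation where the two methods are expected to differ.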