Volume: 23, Issue: 4 (2009)
pp. 721-743 DOI: 10.1142/S0218001409007338
Title:
K-MEANS CLUSTERING FOR PROBLEMS WITH PERIODIC ATTRIBUTES
Author(s):
M. VEJMELKA
Address for correspondence: Department of Cybernetics, Czech Technical University, Technicka 2, 166 27 Prague 6, Czech Republic. Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodarenskou vezi 2, 182 07 Prague 8, Czech Republic
P. MUSILEK
Department of Electrical and Computer Engineering, University of Alberta, W2-030 ECERF, Edmonton, Alberta, T6G 2V4, Canada
M. PALUŠ
Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodarenskou vezi 2, 182 07 Prague 8, Czech Republic
E. PELIKÁN
Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodarenskou vezi 2, 182 07 Prague 8, Czech Republic
Abstract:
The K-means algorithm is very popular in the machine learning community due to its inherent simplicity. However, in its basic form, it is not suitable for use in problems which contain periodic attributes, such as oscillator phase, hour of day or directional heading. A commonly used technique of trigonometrically encoding periodic input attributes to artificially generate the required topology introduces a systematic error. In this paper, a metric which induces a conceptually correct topology for periodic attributes is embedded into the K-means algorithm. This requires solving a non-convex minimization problem in the maximization step. Results of numerical experiments comparing the proposed algorithm to K-means with trigonometric encoding on synthetically generated data are reported. The advantage of using the proposed K-means algorithm is also shown on a real example using gas load data to build simple predictive models.
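As a rough illustration of the ideas in the abstract, and not the authors' implementation, the sketch below contrasts a wrap-around distance for a periodic attribute with the Euclidean distance obtained after (cos, sin) encoding. It also finds a one-dimensional periodic centroid by brute-force grid search. The function names, the grid search, and the choice of squared wrap-around distance as the objective are all assumptions made for this example; the paper's actual metric and minimization procedure may differ.

```python
import math

def periodic_distance(a, b, period=2 * math.pi):
    """Shortest distance between two values on a circle of the given period."""
    d = abs(a - b) % period
    return min(d, period - d)

def chord_distance(a, b, period=2 * math.pi):
    """Euclidean distance between the (cos, sin) encodings of a and b."""
    ta = 2 * math.pi * a / period
    tb = 2 * math.pi * b / period
    return math.hypot(math.cos(ta) - math.cos(tb), math.sin(ta) - math.sin(tb))

def periodic_centroid(values, period=2 * math.pi, grid=3600):
    """Grid point minimizing the sum of squared periodic distances.

    Assumption: a plain grid search stands in for whatever solver the
    paper uses; the objective is non-convex, so a naive average can fail.
    """
    candidates = [period * i / grid for i in range(grid)]
    return min(
        candidates,
        key=lambda c: sum(periodic_distance(c, v, period) ** 2 for v in values),
    )

if __name__ == "__main__":
    hours = [23, 1]                                  # hour-of-day samples around midnight
    print(sum(hours) / len(hours))                   # arithmetic mean lands at noon
    print(periodic_centroid(hours, period=24, grid=2400))
    print(periodic_distance(0, 12, 24), chord_distance(0, 12, 24))
```

For hours 23 and 1, the arithmetic mean is 12 while the grid search returns a point at midnight. On the unit circle the chord length equals 2 sin(θ/2) for an arc of angle θ, so the encoded distance grows more slowly than the angular distance as points move apart; this is one way to see how trigonometric encoding can distort distances.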
Keywords:
Clustering algorithms; similarity measures; K-means; periodic attributes