Cluster ensemble selection and consensus clustering: A multi-objective optimization approach


Aktaş D., Lokman B., İNKAYA T., Dejaegere G.

European Journal of Operational Research, vol.314, no.3, pp.1065-1077, 2024 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 314 Issue: 3
  • Publication Date: 2024
  • Doi Number: 10.1016/j.ejor.2023.10.029
  • Journal Name: European Journal of Operational Research
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, International Bibliography of Social Sciences, ABI/INFORM, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Compendex, Computer & Applied Sciences, EconLit, INSPEC, Public Affairs Index, zbMATH, Civil Engineering Abstracts
  • Page Numbers: pp.1065-1077
  • Keywords: Cluster ensembles, Consensus clustering, Ensemble selection, Multiple objective programming
  • Bursa Uludag University Affiliated: Yes

Abstract

Cluster ensembles have emerged as a powerful tool to obtain clusters of data points by combining a library of clustering solutions into a consensus solution. In this paper, we address the cluster ensemble selection problem and design a multi-objective optimization-based solution framework to produce consensus solutions. Given a library of clustering solutions, we first design a preprocessing procedure that measures the agreement of each clustering solution with the other solutions and eliminates the ones that may mislead the process. We then develop a multi-objective optimization algorithm that selects representative clustering solutions from the preprocessed library with respect to size, coverage, and diversity criteria and combines them into a single consensus solution, for which the true number of clusters is assumed to be unknown. We conduct experiments on different benchmark data sets. The results show that, for most data sets, our approach yields more accurate consensus solutions than the full ensemble and existing approaches. We also present an application to the customer segmentation problem, where our approach is used to simultaneously segment customers and find a consensus solution for each segment.
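
The preprocessing idea described in the abstract (score each clustering solution by its agreement with the rest of the library, then drop the poorly agreeing ones) can be illustrated with a minimal sketch. The abstract does not name the agreement measure or the elimination rule, so everything below is an assumption made for illustration: the Rand index as the agreement measure, a mean-agreement score, the fixed `threshold` parameter, and the function names `rand_index`, `mean_agreement`, and `filter_library`. It is not the authors' procedure.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of point pairs that two labelings treat the same way
    (both put the pair in one cluster, or both split it)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def mean_agreement(library):
    """Average Rand index of each solution with every other solution."""
    scores = []
    for k, sol in enumerate(library):
        others = [rand_index(sol, other)
                  for m, other in enumerate(library) if m != k]
        scores.append(sum(others) / len(others))
    return scores


def filter_library(library, threshold):
    """Keep solutions whose mean agreement reaches the threshold.
    The threshold is a hypothetical tuning parameter, not from the paper."""
    scores = mean_agreement(library)
    return [sol for sol, s in zip(library, scores) if s >= threshold]


# Toy library: four labelings of six data points.
library = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels permuted
    [0, 0, 1, 1, 1, 1],  # close to the first two
    [0, 1, 0, 1, 0, 1],  # disagrees with the rest
]
scores = mean_agreement(library)
kept = filter_library(library, threshold=0.5)
```

On this toy library the fourth labeling has the lowest mean agreement and is the only one removed at a threshold of 0.5. A pair-counting measure is used here because it is invariant to label permutation, which matters when the solutions in a library come from different algorithms or runs.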