An Extensive Comparative Study of Multi-view Clustering


Completed Master Thesis

Clustering is an essential data mining tool that groups similar objects into the same classes (clusters). Multi-view clustering (MvC) is an area of unsupervised learning that has received much attention recently. It consists of clustering data from diverse sources or domains, where each object is described by several sets of features (or views). MvC approaches are used in applications such as text and image clustering. The advantage of MvC is that it incorporates both the local invariance within each view and the consistency across views, leading to a consensus clustering over all views. The objective of this Master thesis is to carry out an extensive comparative study of MvC approaches, following a concise experimental methodology that includes:

  • Dataset collection

  • Categorization of MvC approaches: create a comparison table in terms of:
    • Input data structure
    • Algorithm type (matrix decomposition, probabilistic, graph-based, etc.)
    • Number of hyperparameters
    • etc.

  • Experimental study considering:
    • Data type (continuous, binary, categorical, count data)
    • Initialization approaches
    • Hyperparameter selection
    • Convergence criterion
    • Clustering quality measures

  • Results analysis: discuss what distinguishes the algorithms that outperform the state of the art, and outline directions for future research on MvC.
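
To make the "clustering quality" item of the experimental study concrete, the sketch below computes normalized mutual information (NMI), one common external measure for comparing a predicted partition against reference labels. It is only an illustration: the thesis description does not say which measures were used, the function name `normalized_mutual_info` is hypothetical, and the arithmetic-mean normalization is one of several conventions.

```python
import math
from collections import Counter


def normalized_mutual_info(labels_true, labels_pred):
    """NMI between two partitions, normalized by the mean of their entropies.

    Assumption: arithmetic-mean normalization; other variants divide by the
    minimum, maximum, or geometric mean of the two entropies.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        # Shannon entropy (natural log) of the cluster-size distribution.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # Mutual information from the contingency counts.
    mutual_info = 0.0
    for (t, p), c in count_joint.items():
        mutual_info += (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))

    mean_entropy = (entropy(count_true) + entropy(count_pred)) / 2
    if mean_entropy == 0.0:
        # Both partitions consist of a single cluster, so they coincide.
        return 1.0
    return mutual_info / mean_entropy


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    reference = [0, 0, 0, 1, 1, 1]
    predicted = [1, 1, 0, 0, 0, 0]
    print(normalized_mutual_info(reference, predicted))
```

NMI is invariant to how cluster identifiers are named, which matters when comparing algorithms that assign arbitrary labels; a study of this kind would typically report it alongside other measures rather than on its own.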
