





Sally El Hajjar: "Contribution to Graph-based Multi-view Clustering: Algorithms and Applications".

Thesis director: Fadi Dornaika

2022-12-09, 10:30, Ada Lovelace room (Sala Ada Lovelace aretoa).



"In this thesis, we study unsupervised learning, specifically, clustering methods for dividing data into meaningful groups. A major challenge is to find an efficient algorithm with low computational complexity that can handle different types and sizes of data sets.

To this end, we propose two approaches. The first is "Multi-view Clustering via Kernelized Graph and Nonnegative Embedding" (MKGNE), and the second is "Multi-view Clustering via Consensus Graph Learning and Nonnegative Embedding" (MVCGE). Both approaches jointly solve four tasks: they estimate the unified similarity matrix over all views using the kernel trick, the unified spectral projection of the data, the cluster indicator matrix, and the weight of each view, without additional parameters. With these two approaches, no postprocessing step such as k-means clustering is needed.
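As an illustration of the ideas named above (a kernel-based similarity graph per view and view weights learned without extra parameters), here is a minimal NumPy sketch. It is not the MKGNE or MVCGE algorithm: the function names, the median bandwidth heuristic, and the inverse-distance weighting rule are assumptions chosen for brevity, and the joint estimation of the spectral projection and cluster indicator matrix is left out.

```python
import numpy as np


def gaussian_kernel_graph(X, sigma=None):
    """Dense similarity matrix from a Gaussian (RBF) kernel on the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, 0.0)
    if sigma is None:
        # Assumed heuristic: squared bandwidth = median nonzero squared distance.
        sigma2 = np.median(d2[d2 > 0])
    else:
        sigma2 = sigma ** 2
    K = np.exp(-d2 / (2.0 * sigma2))
    np.fill_diagonal(K, 0.0)  # no self-loops
    return K


def fuse_views(graphs, n_iter=20, eps=1e-12):
    """Alternate between a weighted-average graph and parameter-free view weights.

    Each weight is inversely proportional to the Frobenius distance between
    the fused graph and that view's graph (an assumed, commonly used rule).
    """
    weights = np.full(len(graphs), 1.0 / len(graphs))
    for _ in range(n_iter):
        S = sum(w * G for w, G in zip(weights, graphs))
        dists = np.array([np.linalg.norm(S - G) for G in graphs])
        weights = 1.0 / (2.0 * dists + eps)
        weights /= weights.sum()
    S = sum(w * G for w, G in zip(weights, graphs))
    return S, weights
```

With three views where two agree and one is pure noise, this rule gives the noisy view the smallest weight, which is the qualitative behaviour an auto-weighted strategy aims for.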

In a further study, we propose a method named "Multi-view Spectral Clustering via Constrained Nonnegative Embedding" (CNESE). This method addresses a drawback of spectral clustering approaches, which only provide a nonlinear projection of the data and therefore require an additional clustering step. That extra step can degrade the quality of the final clustering due to factors such as the initialization process or outliers. The drawback is overcome by introducing a nonnegative embedding matrix that directly gives the final clustering assignment. In addition, some constraints are added to the target matrix to enhance the clustering performance.
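To make concrete how a nonnegative embedding can give cluster assignments directly, the sketch below factorizes a similarity matrix as S ≈ H Hᵀ with H ≥ 0 and reads each sample's cluster from the largest entry in its row of H. Symmetric nonnegative matrix factorization with a damped multiplicative update is used here only as a simple stand-in; it is not the CNESE objective, and the function names and parameters are hypothetical.

```python
import numpy as np


def symmetric_nmf(S, n_clusters, n_iter=500, seed=0, eps=1e-12):
    """Approximate S by H @ H.T with H >= 0 using damped multiplicative updates."""
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], n_clusters))
    for _ in range(n_iter):
        numer = S @ H
        denom = H @ (H.T @ H) + eps
        # Multiplicative step keeps H nonnegative; the 0.5 damping factor
        # is the usual choice for this update.
        H *= 0.5 + 0.5 * numer / denom
    return H


def labels_from_embedding(H):
    """Cluster label of each sample = index of its largest embedding entry."""
    return np.argmax(H, axis=1)


# Toy similarity matrix with two fully connected blocks of five samples each.
S = np.zeros((10, 10))
S[:5, :5] = 1.0
S[5:, 5:] = 1.0
H = symmetric_nmf(S, n_clusters=2)
labels = labels_from_embedding(H)
```

Because the labels come from a row-wise argmax, no k-means step (and no k-means initialization) is involved, which is the point the paragraph above makes about avoiding postprocessing.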

Building on the above methods, a new method called "Multi-view Spectral Clustering with a self-taught Robust Graph Learning" (MCSRGL) has been developed. Unlike other approaches, this method integrates two main paradigms into a one-step multi-view clustering model. First, we construct an additional graph from the cluster label space, in addition to the graphs associated with the data space. Second, a smoothness constraint is imposed on the cluster-label matrix to make it more consistent with the data views and the label view.
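For readers unfamiliar with graph smoothness terms, a common form is shown below. The abstract does not give the exact term used in MCSRGL, so this is only the standard identity, with assumed notation: S is a symmetric similarity matrix over n samples, D its diagonal degree matrix, L the graph Laplacian, and h_i the i-th row of the cluster-label matrix H.

```latex
\operatorname{tr}\left(H^{\top} L H\right)
  = \frac{1}{2} \sum_{i,j=1}^{n} S_{ij} \,\lVert h_i - h_j \rVert_2^2,
\qquad L = D - S, \quad D_{ii} = \sum_{j=1}^{n} S_{ij}
```

Minimizing this quantity pushes samples that are strongly connected in the graph (large S_ij) toward similar rows of H, that is, toward the same cluster.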

Moreover, we propose two unified frameworks for multi-view clustering in Chapter 9. In these frameworks, we estimate the view-based graphs, the consensus graph, the consensus spectral representation, and the soft clustering assignments. These methods retain the main advantages of the aforementioned methods and integrate the concepts of consensus and unified matrices. By using the unified matrices, we enforce the matrices of different views to be similar, which reduces the problem of noise and inconsistency between views.

Extensive experiments were conducted on several public datasets of different types and sizes, ranging from face image datasets to document datasets, handwritten datasets, and synthetic datasets. We provide several analyses of the proposed algorithms, including ablation studies, hyper-parameter sensitivity analyses, and computational costs. The experimental results show that the algorithms developed in this thesis are relevant and outperform several competing methods.

Keywords: Machine learning, unsupervised learning, multi-view clustering, graph learning, spectral projection, nonnegative embedding, auto-weighted strategy, clustering algorithms, similarity graph, graph construction, soft cluster assignments, cluster label space, consensus matrices, constrained nonnegative embedding, smoothness constraints."