A semi-supervised clustering approach using nonlinear canonical correlation analysis with t-SNE

Hong, X. ORCID: https://orcid.org/0000-0002-6832-2298, Xiao, J. and Wei, H. ORCID: https://orcid.org/0000-0002-9664-5748 (2024) A semi-supervised clustering approach using nonlinear canonical correlation analysis with t-SNE. In: The International Joint Conference on Neural Networks (IJCNN) 2024, 30 Jun - 5 Jul 2024, Yokohama, Japan. (In Press)
It is advisable to refer to the publisher's version if you intend to cite from this work.

Abstract/Summary

Clustering high-dimensional data is challenging because the usual distance measures in high-dimensional space do not reflect how clusters are partitioned. This work assumes that a small number of data examples have known labels and proposes a new semi-supervised clustering approach based on a modified canonical correlation analysis (CCA) and t-SNE. First, t-SNE projects the high-dimensional data onto a 3D embedding. Although the clusters in the t-SNE embedding space may be visually separable, it is still difficult to achieve very good clustering performance there with a conventional unsupervised clustering algorithm. A modified CCA algorithm is therefore introduced that uses radial basis functions (RBFs) in the t-SNE embedding space, centred at some of the labelled points. The proposed algorithm, referred to as RBF-CCA, learns the associated projection matrix by supervised learning on the small labelled data set and then projects the large amount of unlabelled data onto the associated canonical variables. k-means clustering is then applied as the final clustering step. The effectiveness of the proposed algorithm is demonstrated by experiments on several benchmark image data sets.
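The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: synthetic 3D Gaussian blobs stand in for a t-SNE embedding (the t-SNE step itself is omitted), the RBFs are Gaussian with a fixed width of 2.0, one-hot class labels form the second CCA view, a small ridge term regularises the covariance matrices, and k-means uses a deterministic farthest-point initialisation. All function and variable names are hypothetical.

```python
import numpy as np


def rbf_features(points, centres, width):
    """Gaussian RBF activation of every point with respect to every centre."""
    sq_dist = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def inv_sqrt(sym):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(sym)
    return (vecs / np.sqrt(vals)) @ vecs.T


def fit_cca(X, Y, n_components, reg=1e-3):
    """Regularised linear CCA; returns the mean of X and its projection matrix."""
    x_mean = X.mean(axis=0)
    Xc, Yc = X - x_mean, Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])  # ridge term is an assumption
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    Sxx_is = inv_sqrt(Sxx)
    # Left singular vectors of the whitened cross-covariance give the X-side directions
    U, _, _ = np.linalg.svd(Sxx_is @ Sxy @ inv_sqrt(Syy), full_matrices=False)
    return x_mean, Sxx_is @ U[:, :n_components]


def kmeans(Z, k, n_iter=100):
    """Lloyd's algorithm with farthest-point initialisation (illustrative choice)."""
    centroids = [Z[0]]
    for _ in range(k - 1):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(Z[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new = np.array([Z[assign == j].mean(axis=0) if np.any(assign == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign


def purity(cluster_ids, true_labels):
    """Fraction of points that share the majority true label of their cluster."""
    hits = sum(np.bincount(true_labels[cluster_ids == j]).max()
               for j in np.unique(cluster_ids))
    return hits / len(true_labels)


# Synthetic stand-in for a 3D t-SNE embedding: three Gaussian blobs (assumption)
rng = np.random.default_rng(0)
n_classes, per_class, labelled_per_class = 3, 100, 5
blob_centres = np.array([[0.0, 0.0, 0.0], [8.0, 0.0, 0.0], [0.0, 8.0, 0.0]])
embedding = np.vstack([c + rng.normal(size=(per_class, 3)) for c in blob_centres])
true_labels = np.repeat(np.arange(n_classes), per_class)

# Treat the first few points of each class as the small labelled set
labelled_idx = np.concatenate(
    [np.flatnonzero(true_labels == c)[:labelled_per_class] for c in range(n_classes)])

# RBF features centred at the labelled points, computed for all points
phi = rbf_features(embedding, embedding[labelled_idx], width=2.0)

# Learn the projection on the labelled subset only, then apply it to every point
one_hot = np.eye(n_classes)[true_labels[labelled_idx]]
x_mean, W = fit_cca(phi[labelled_idx], one_hot, n_components=n_classes - 1)
canonical = (phi - x_mean) @ W

# Final clustering step in the canonical-variable space
cluster_ids = kmeans(canonical, n_classes)
print("cluster purity:", purity(cluster_ids, true_labels))
```

In this sketch the true labels of the unlabelled points are used only to score the result. With real data, `embedding` would instead come from a t-SNE implementation, and the RBF width and number of labelled centres would need tuning.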