Novel parameter-free and parametric same degree distribution-based dimensionality reduction algorithms for trustworthy data structure preserving

Hajderanj, L. ORCID: https://orcid.org/0009-0007-0445-3049, Chen, D., Dudley, S., Gilloppe, G. and Sivy, B. (2024) Novel parameter-free and parametric same degree distribution-based dimensionality reduction algorithms for trustworthy data structure preserving. Information Sciences, 661. 120030. ISSN 1872-6291

Full text not archived in this repository.

It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.1016/j.ins.2023.120030

Abstract/Summary

As an effective dimensionality reduction method, Same Degree Distribution (SDD) has been shown to preserve data structure better than other dimensionality reduction methods, including Principal Component Analysis (PCA), Multidimensional Scaling (MDS), Isomap, Locally Linear Embedding (LLE), Laplacian Eigenmaps (LE), Uniform Manifold Approximation and Projection (UMAP) and t-distributed Stochastic Neighbour Embedding (t-SNE). In addition, SDD does not require tuning the number of neighbours or the perplexity to scale its structure-capturing performance. Instead, it requires tuning the degree of the degree distribution within a certain interval. Tuning this degree makes SDD less costly than methods that require tuning the number of neighbours or the perplexity. Despite these advantages, SDD is still an expensive method compared with parameter-free methods such as PCA and MDS. A parameter-free SDD is proposed based on standard SDD, with two main differences: 1) it does not require tuning the degree of the degree distribution over the entire range from 1 to 15, but only uses degree 1; and 2) it re-scales the pairwise distances to the range [0, 2] instead of the range [0, 1]. A theoretical analysis is presented to prove the better performance of parameter-free SDD. In addition, the performance of the proposed parameter-free SDD and of standard SDD has been compared experimentally in terms of structure capturing and computational time. This paper also proposes a parametric version of SDD that uses a deep neural network to learn the mapping from samples of the original data to their corresponding embedded representations in a low-dimensional space. Comparative experiments have been undertaken with SDD and other methods such as Isomap, t-SNE and UMAP to demonstrate the effectiveness of the proposed parametric SDD on several popular synthetic and real datasets such as Churn, SEER Breast Cancer, AVletters (LIPS Reading) and MNIST.
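
The abstract does not say how the pairwise distances are re-scaled to [0, 2]. As a rough illustration only, the sketch below assumes Euclidean distances and a simple division by the largest distance; the function name `rescaled_pairwise_distances` and the `upper` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def rescaled_pairwise_distances(X, upper=2.0):
    """Euclidean pairwise distances rescaled to the range [0, upper].

    Assumption: the rescaling is a division by the largest distance.
    The paper's actual procedure may differ.
    """
    X = np.asarray(X, dtype=float)
    # Broadcast to an (n, n, d) array of coordinate differences.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    d_max = D.max()
    if d_max == 0:
        return D  # all points coincide, nothing to rescale
    # The smallest distance is 0 (the diagonal), so this maps onto [0, upper].
    return upper * D / d_max
```

Calling the function with `upper=1.0` would give the [0, 1] range that the abstract attributes to standard SDD, under the same assumptions.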

Item Type: Article
Refereed: Yes
Divisions: Henley Business School > Digitalisation, Marketing and Entrepreneurship
ID Code: 122813
Publisher: Elsevier
