Sensitivity analysis for deep learning: ranking hyper-parameter influence

Taylor, R., Ojha, V. (ORCID: https://orcid.org/0000-0002-9256-1192), Martino, I. and Nicosia, G. (2021) Sensitivity analysis for deep learning: ranking hyper-parameter influence. In: 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2021), 1-3 Nov 2021, online. https://doi.org/10.1109/ICTAI52525.2021.00083

It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item, use the DOI: https://doi.org/10.1109/ICTAI52525.2021.00083

Abstract/Summary

We propose a novel approach to ranking Deep Learning (DL) hyper-parameters through the application of Sensitivity Analysis (SA). DL hyper-parameters play an important role in model accuracy; however, choosing optimal values for each parameter can be time- and resource-intensive. A better understanding of the importance of parameters in relation to data and model architecture would benefit DL research. SA provides a quantitative measure by which hyper-parameters can be ranked in terms of their contribution to model accuracy, allowing comparisons to be made across datasets and architectures. The results showed the importance of optimal architecture, with the activation function being highly influential. The influence of learning rate decay was ranked highest, with model performance being sensitive to this parameter regardless of architecture or dataset. The influence of a model's initial learning rate was found to be low, contrary to the literature. The results also showed that the importance of a parameter is closely linked to model architecture: shallower models were susceptible to hyper-parameters affecting the stochasticity of the learning process, whereas deeper models were sensitive to hyper-parameters affecting the convergence speed. Additionally, the complexity of the dataset can affect the margin of separation between the sensitivity measures of the most and least influential parameters, making the most influential hyper-parameter an ideal candidate for tuning compared with the other parameters.
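As a rough illustration of the kind of SA-based ranking the abstract describes, the sketch below scores four hyper-parameters of a small scikit-learn MLP on the digits dataset using Morris elementary effects from the SALib library. The hyper-parameter set, bounds, dataset, model, and the choice of the Morris method are all illustrative assumptions for this sketch, not the paper's experimental setup.

    import warnings

    import numpy as np
    from SALib.analyze import morris
    from SALib.sample.morris import sample as morris_sample
    from sklearn.datasets import load_digits
    from sklearn.exceptions import ConvergenceWarning
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    warnings.filterwarnings("ignore", category=ConvergenceWarning)

    # Hypothetical hyper-parameter space: the names and bounds are
    # illustrative assumptions, not values taken from the paper.
    problem = {
        "num_vars": 4,
        "names": ["log10_lr", "log10_l2", "momentum", "hidden_units"],
        "bounds": [[-4.0, -1.0], [-6.0, -2.0], [0.5, 0.99], [16.0, 128.0]],
    }

    X_data, y_data = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X_data, y_data, random_state=0)

    def accuracy(theta):
        """Train a small MLP with the sampled hyper-parameters; return test accuracy."""
        clf = MLPClassifier(
            hidden_layer_sizes=(int(round(theta[3])),),
            solver="sgd",
            learning_rate_init=10.0 ** theta[0],
            alpha=10.0 ** theta[1],  # L2 penalty
            momentum=theta[2],
            max_iter=50,
            random_state=0,
        )
        clf.fit(X_tr, y_tr)
        return clf.score(X_te, y_te)

    # Morris trajectories cost N * (num_vars + 1) model trainings (here 40).
    X = morris_sample(problem, N=8, num_levels=4, seed=0)
    Y = np.array([accuracy(row) for row in X])

    # mu* is the mean absolute elementary effect on test accuracy:
    # the larger it is, the more influential the hyper-parameter.
    Si = morris.analyze(problem, X, Y, num_levels=4, seed=0)
    ranking = sorted(zip(problem["names"], Si["mu_star"]), key=lambda t: -t[1])
    for name, mu_star in ranking:
        print(f"{name:>12s}  mu* = {mu_star:.4f}")

Here mu* plays the role of the sensitivity measure by which hyper-parameters are ranked; a variance-based measure such as Sobol total-order indices could be substituted at a higher computational cost.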

Item Type: Conference or Workshop Item (Paper)
Refereed: Yes
Divisions: Interdisciplinary Research Centres (IDRCs) > Centre for the Mathematics of Planet Earth (CMPE)
           Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 100199
