Statistical study design for analyzing multiple gene loci correlation in DNA sequences

Kamoljitprapa, P. ORCID: https://orcid.org/0000-0002-5547-7354, Baksh, F. M. ORCID: https://orcid.org/0000-0003-3107-8815, De Gaetano, A. ORCID: https://orcid.org/0000-0001-7712-056X, Polsen, O. and Leelasilapasart, P. ORCID: https://orcid.org/0000-0002-0198-9944 (2023) Statistical study design for analyzing multiple gene loci correlation in DNA sequences. Mathematics, 11 (23). 4710. ISSN 2227-7390

Text (Open Access) - Published Version
· Available under License Creative Commons Attribution.

It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.3390/math11234710

Abstract/Summary

This study presents a novel statistical and computational approach using nonparametric regression, which capitalizes on correlation structure to deal with the high-dimensional data often found in pharmacogenomics, for instance, in Crohn’s inflammatory bowel disease. The empirical correlation between the test statistics, investigated via simulation, can be used as an estimate of noise. The theoretical distribution of −log10(p-value) is used to support the estimation of the optimal bandwidth for the model, which adequately controls type I error rates while maintaining reasonable power. Two proposed approaches, involving normal and Laplace-LD kernels, were evaluated by conducting a case-control study using real data from a genome-wide association study on Crohn’s disease. The study successfully identified single nucleotide polymorphisms on the NOD2 gene associated with the disease. The proposed method reduces the computational burden by approximately 33% with reasonable power, allowing for a more efficient and accurate analysis of genetic variants influencing drug responses. The study contributes to the advancement of statistical methodology for analyzing complex genetic data and is of practical advantage for the development of personalized medicine.
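As a rough illustration of the idea of smoothing −log10(p-value) along a sequence of loci, the sketch below applies a generic Nadaraya-Watson estimator with a normal (Gaussian) kernel to simulated null p-values. It is not the authors' implementation: the function name `normal_kernel_smooth`, the bandwidth of 25 loci, the number of loci, and the independent uniform p-values are all assumptions made for illustration, and the Laplace-LD kernel is not sketched because its form is not given in this record. The only distributional fact used is standard: if p is uniform on (0, 1) under the null, then −log10(p) is exponentially distributed with mean 1/ln(10) ≈ 0.434.

```python
import math
import random


def normal_kernel_smooth(positions, values, bandwidth):
    """Nadaraya-Watson estimate at each position with a Gaussian kernel.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    smoothed = []
    for x0 in positions:
        # Gaussian weights centred on the target position x0
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in positions]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


random.seed(0)
n_snps = 2000                      # assumed number of loci
positions = list(range(n_snps))    # assumed equally spaced loci

# Simulated null p-values, uniform on (0, 1]; 1 - random() avoids log10(0).
p_values = [1.0 - random.random() for _ in range(n_snps)]
scores = [-math.log10(p) for p in p_values]

# Under the null, -log10(p) is exponential with mean 1 / ln(10).
null_mean = 1.0 / math.log(10)
sample_mean = sum(scores) / n_snps

# Assumed bandwidth; the paper estimates an optimal value instead.
smoothed = normal_kernel_smooth(positions, scores, bandwidth=25.0)

print(f"theoretical null mean: {null_mean:.3f}")
print(f"sample mean of scores: {sample_mean:.3f}")
print(f"largest raw score:     {max(scores):.3f}")
print(f"largest smoothed score:{max(smoothed):.3f}")
```

Because each smoothed value is a weighted average of neighbouring scores, isolated large values from independent noise are damped, whereas a run of correlated signals at adjacent loci would survive smoothing. The bandwidth governs that trade-off, which is why its choice affects both type I error and power.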

Item Type: Article
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Mathematics and Statistics
ID Code: 114194
Uncontrolled Keywords: General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
Publisher: MDPI AG
