Entropy of DNA sequences and leukemia patients mortality
Abstract
Introduction. Deoxyribonucleic acid (DNA) is not a random sequence of combinations of four nucleotides: comprehensive reviews [1, 2] persuasively show long- and short-range correlations in DNA, periodic properties, and a correlation structure of sequences. Information-theoretic methods such as entropy quantify the amount of information contained in sequences. The relationship between entropy and patient survival has been widely studied in several branches of medicine and medical research: cardiology, neurology, surgery, and trauma. It therefore appears necessary to apply the advantages of information-theoretic methods to explore the relationship between the mortality of a given category of patients and the entropy of their DNA sequences.
Aim of the research. The goal of this paper is to provide a reliable formula for accurately calculating entropy for short DNA sequences and to show how entropy analysis can be used to examine the mortality of leukemia patients.
Materials and Methods. We used the University of Barcelona (UB) leukemia patient database (DB), which contains 117 anonymized records, each consisting of the date of diagnosis, the date of death, the leukemia diagnosis, and the patient's DNA sequence. The average time from diagnosis to death was 99 ± 77 months. The formal characteristics of the DNA sequences in the UB leukemia patient DB are: average number of bases N = 496 ± 69; min(N) = 297 bases; max(N) = 745 bases. A generalized form of the Robust Entropy Estimator (EnRE) for short DNA sequences is proposed and its key features are shown. Survival analysis was performed in the statistical package IBM SPSS 27 using Kaplan-Meier survival analysis and Cox regression survival modelling.
Results. The accuracy of the proposed EnRE for calculating entropy was demonstrated for time series of various lengths and for various types of random distributions. In all cases for N = 500, the relative error in calculating the precise value of entropy does not exceed 1 %, while the correlation is no lower than 0.995. To obtain the minimum EnRE standard deviation and coefficient of variation, the alphabet code of the initial DNA sequence was converted into an integer code of bases using an optimization rule that selects a single minimal numerical coding around zero. EnRE was calculated for the leukemia patients, and two comparisons were made: two groups divided by the median EnRE = 1.47, and two groups formed from patients belonging to the 1st (EnRE ≤ 1.448) and 4th (EnRE ≥ 1.490) quartiles. The results of Kaplan-Meier survival analysis and Cox regression survival modelling are statistically significant: p < 0.05 for the median groups and p < 0.005 for the groups formed from the 1st and 4th quartiles. The death hazard for a patient with EnRE below the median is 1.556 times that of a patient with EnRE above the median, and the death hazard for a patient in the 1st entropy quartile (lowest EnRE) is 2.143 times that of a patient in the 4th entropy quartile (highest EnRE).
Conclusions. The transition from wider (median) to smaller (quartile) patient groups with greater EnRE differentiation confirmed the unique significance of the entropy of DNA sequences for leukemia patient mortality. This significance is supported statistically by the increasing hazard and the decreasing average time from diagnosis to death for leukemia patients with lower entropy of DNA sequences.
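The workflow summarized in the abstract (compute an entropy per DNA sequence, split patients at the median, compare survival between the groups) can be illustrated with a minimal Python sketch. It is not the authors' method. The abstract does not give the EnRE formula, so the sketch substitutes the textbook plug-in Shannon entropy of base frequencies (in bits), whose values are not comparable with the EnRE values reported above. The Kaplan-Meier estimator is a hand-written stand-in for the SPSS procedure, and the function names and the four toy records are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def shannon_entropy(seq):
    """Plug-in Shannon entropy (bits) of the base frequencies in seq.

    ASSUMPTION: used here only as a stand-in; this is NOT the EnRE estimator.
    """
    counts = Counter(seq.upper())
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time per patient (e.g. months since diagnosis)
    events -- 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths + censored at t
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


# Hypothetical toy records: (DNA sequence, months of follow-up, event flag).
records = [
    ("ACGTACGTACGT", 120, 1),
    ("AAAAAAAACCGT", 30, 1),
    ("ACGGTTCAACGT", 95, 0),
    ("AAAAAAAAAAGT", 12, 1),
]

# Split patients into two groups at the median entropy.
entropies = [shannon_entropy(seq) for seq, _, _ in records]
cut = median(entropies)
low = [(t, e) for (_, t, e), h in zip(records, entropies) if h <= cut]
high = [(t, e) for (_, t, e), h in zip(records, entropies) if h > cut]

# One survival curve per entropy group.
km_low = kaplan_meier([t for t, _ in low], [e for _, e in low])
km_high = kaplan_meier([t for t, _ in high], [e for _, e in high])
print("low-entropy group:", km_low)
print("high-entropy group:", km_high)
```

With real data, the two curves would be compared with a log-rank test and a Cox regression, as the abstract describes. Both are left out here to keep the sketch short.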
References
Li WT. The study of correlation structures of DNA sequences: a critical review. Comput. Chem. 1997; 21 (4): 257–271. DOI: https://doi.org/10.1016/s0097-8485(97)00022-3
Damasevicius R. Complexity estimation of genetic sequences using information-theoretic and frequency analysis methods. Informatica. 2010; 21 (1): 13–30. DOI: https://doi.org/10.15388/Informatica.2010.270
Rowe GW, Trainor LEH. On the informational content of viral DNA. J. Theoretical Biology. 1983; 101: 151–170. DOI: https://doi.org/10.1016/0022-5193(83)90332-6
Vopson MM, Robson SC. A new method to study genome mutations using the information entropy. Physica A. 2021; 1–9. DOI: https://doi.org/10.1016/j.physa.2021.126383
Sherwin WB. Entropy and Information Approaches to Genetic Diversity and its Expression: Genomic Geography. Entropy. 2010;12:1765-1798. DOI: https://doi.org/10.3390/e12071765
Chanda P, Costa E, Hu J, Sukumar S, Van Hemert J, Walia R. Information Theory in Computational Biology: Where We Stand Today. Entropy. 2020;22:627-637. DOI: https://doi.org/10.3390/e22060627
Villareal RP, Liu BC, Massumi A. Heart rate variability and cardiovascular mortality. Curr Atheroscler Rep. 2002; 4: 120–127. DOI: https://doi.org/10.1007/s11883-002-0035-18
Rodríguez J, Correa C, Ramírez L. Heart dynamics diagnosis based on entropy proportions: Application to 550 dynamics. Revista Mexicana de Cardiología. 2017; 28 (1): 10–20.
Androulakis AFA, Zeppenfeld K, Paiman EHM, Piers SRD, Wijnmaalen AP, Siebelink HJ, Sramko M, Lamb HJ, van der Geest RJ, de Riva M, Tao Q. Entropy as a Novel Measure of Myocardial Tissue Heterogeneity for Prediction of Ventricular Arrhythmias and Mortality in Post-Infarct Patients. JACC Clin Electrophysiol. 2019 Apr;5 (4): 480–489. DOI: https://doi.org/10.1016/j.jacep.2018.12.005. Epub 2019 Feb 27. PMID: 31000102.
Sykora M, Szabo J, Siarnik P, Turcani P, Krebs S, Lang W, Czosnyka M, Smielewski P. Heart rate entropy is associated with mortality after intracereberal hemorrhage. Journal of the Neurological Sciences. 2020; 418: 117033. DOI: https://doi.org/10.1016/j.jns.2020.117033
Matsuda E. Entropy Monitoring in Patients Undergoing General Anesthesia. Am J Nurs. 2017 Mar;117(3):62. DOI: https://doi.org/10.1097/01.NAJ.0000513290.22001.8d
Neal-Sturgess C. The Entropy of Morbidity Trauma and Mortality. Arxiv Cornell University. Med. Physics. 2010; 1–20. DOI: https://doi.org/10.48550/arxiv.1008.3695
Norris PR, Anderson SM, Jenkins JM, Williams AE, Morris JAJr. Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock. 2008 Jul; 30 (1): 17–22. DOI: https://doi.org/10.1097/SHK.0b013e318164e4d0
Papaioannou VE, Chouvarda IG, Maglaveras NK, Baltopoulos GI, Pneumatikos IA. Temperature multiscale entropy analysis: a promising marker for early prediction of mortality in septic patients. Physiol Meas. 2013 Nov;34(11):1449-66. DOI: https://doi.org/10.1088/0967-3334/34/11/1449
Weir BS. Statistical analysis of molecular genetic data. IMA J. of Math. Applied in Medicine and Biology. 1985; 2:1–39.
Shannon CE. A Mathematical Theory of Communication. Bell System Technical Journal. 1948; 27 (3): 379–423. DOI: https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Lazo A, Rathie P. On the entropy of continuous probability distributions. IEEE Transactions on Information Theory. 1978;24(1). DOI: https://doi.org/10.1109/TIT.1978.1055832
Gini C, Ottaviani G. Università di Roma. Memorie Di Metodologia Statistica. Roma: E.V. Veschi; 1955.
Sánchez-Hechavarría ME, et al. Introduction of Application of Gini Coefficient to Heart Rate Variability Spectrum for Mental Stress Evaluation. Arq Bras Cardiol. 2019; [online ahead of print]. DOI: https://doi.org/10.5935/abc.20190185
Firebaugh G. Empirics of World Income Inequality. American Journal of Sociology. 1999; 104 (6): 1597–1630. DOI: https://doi.org/10.1086/210218
Shorrocks AF. The Class of Additively Decomposable Inequality Measures. Econometrica. 1980; 48 (3): 613–625. DOI: https://doi.org/10.2307/1913126
Martynenko A, Raimondi G, Budreiko N. Robust Entropy Estimator for Heart Rate Variability. Klin. Inform. Telemed. 2019; 14 (15): 67–73. DOI: https://doi.org/10.31071/kit2019.15.06
The Journal of V. N. Karazin Kharkiv National University, series Medicine has the following copyright terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.