http://dx.doi.org/10.5808/GI.2020.18.4.e41

An information-theoretical analysis of gene nucleotide sequence structuredness for a selection of aging and cancer-related genes  

Blokh, David (C.D. Technologies Ltd.)
Gitarts, Joseph (Efi Arazi School of Computer Science, Interdisciplinary Center)
Stambler, Ilia (Department of Science, Technology and Society, Bar Ilan University)
Abstract
We provide an algorithm for the construction and analysis of autocorrelation (information) functions of gene nucleotide sequences. As a measure of correlation between discrete random variables, we use normalized mutual information. The information functions are indicative of the degree of structuredness of gene sequences. We construct the information functions for selected gene sequences. We find a significant difference between information functions of genes of different types. We hypothesize that the features of information functions of gene nucleotide sequences are related to phenotypes of these genes.
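The idea of an information function can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which normalization is used, so dividing the mutual information by the smaller of the two marginal entropies is an assumption, and the function names `normalized_mutual_information` and `information_function` are hypothetical. For each lag k, the sketch treats the nucleotide at position i and the nucleotide at position i + k as two discrete random variables and estimates their normalized mutual information from observed frequencies.

```python
from collections import Counter
from math import log2


def entropy(counts, total):
    """Shannon entropy (in bits) of an empirical distribution given as counts."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)


def normalized_mutual_information(xs, ys):
    """Mutual information of two equal-length symbol sequences, scaled to [0, 1].

    Assumption: normalization by min(H(X), H(Y)); the source abstract does
    not specify the normalization the authors use.
    """
    n = len(xs)
    if n == 0:
        return 0.0
    hx = entropy(Counter(xs), n)
    hy = entropy(Counter(ys), n)
    hxy = entropy(Counter(zip(xs, ys)), n)
    mi = max(0.0, hx + hy - hxy)  # clamp tiny negative rounding errors
    denom = min(hx, hy)
    return mi / denom if denom > 0 else 0.0


def information_function(seq, max_lag):
    """Normalized mutual information between positions i and i + k, for k = 1..max_lag."""
    seq = seq.upper()
    return [
        normalized_mutual_information(seq[:-k], seq[k:])
        for k in range(1, max_lag + 1)
    ]


if __name__ == "__main__":
    # A strictly periodic toy sequence is maximally "structured" at every lag.
    for lag, value in enumerate(information_function("ACGT" * 50, 4), start=1):
        print(lag, round(value, 6))
```

Under this sketch, a strictly periodic sequence yields values near 1 at every lag, and a sequence with a single repeated nucleotide yields 0, which matches the reading of the information function as an indicator of structuredness.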
Keywords
gene sequence; gene structuredness; information function; information theory; normalized mutual information