Distribution of Runs and Patterns in Four State Trials

  • Jungtaek Oh (Department of Biomedical Science, School of Medicine, Kyungpook National University, Clinical Omics Center, School of Medicine, Kyungpook National University, The Institute of Industrial Technology, Changwon National University)
  • Received : 2024.01.17
  • Accepted : 2024.03.11
  • Published : 2024.06.30

Abstract

From a mathematical and statistical point of view, a segment of a DNA strand can be viewed as a sequence of four-state (A, C, G, T) trials. Herein, we consider the distributions of runs and patterns related to the run lengths of multi-state sequences, in particular sequences with four states (A, B, C, D). Let X1, X2, ... be a sequence of independent and identically distributed four-state trials taking values in the set 𝒢 = {A, B, C, D}. In this study, we obtain exact formulas for the probability distribution function of the discrete distribution of runs of B's of order k. We also obtain the longest-run and shortest-run statistics and determine the distributions of waiting times and run lengths.
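As a rough numerical illustration of the quantities named in the abstract, the sketch below computes the probability that the longest run of B's in n trials is shorter than k, and from it the waiting-time distribution for the first run of k consecutive B's. It uses a simple dynamic program over the current run length, not the exact formulas derived in the paper. The function names `longest_run_cdf` and `waiting_time_pmf` and the parameter `p_b` (the probability of state B in a single trial) are assumptions made for this example.

```python
from fractions import Fraction


def longest_run_cdf(n, k, p_b):
    """P(longest run of B's in n i.i.d. trials is shorter than k), k >= 1.

    Illustrative dynamic program (an assumption of this sketch, not the
    paper's closed-form result). Only p_b matters here: the states
    A, C and D are lumped together as "not B" with probability 1 - p_b.
    """
    # state[j] = probability that no run of length k has occurred so far
    # and the sequence currently ends in exactly j consecutive B's.
    state = [Fraction(0)] * k
    state[0] = Fraction(1)
    for _ in range(n):
        new = [Fraction(0)] * k
        new[0] = sum(state) * (1 - p_b)   # any non-B symbol resets the run
        for j in range(k - 1):
            new[j + 1] = state[j] * p_b   # a B extends the current run by one
        state = new                       # mass that reaches length k is dropped
    return sum(state)


def waiting_time_pmf(n, k, p_b):
    """P(the first run of k consecutive B's is completed exactly at trial n)."""
    if n < 1:
        return Fraction(0)
    # P(T = n) = P(T > n - 1) - P(T > n), and {T > m} = {longest run in m trials < k}.
    return longest_run_cdf(n - 1, k, p_b) - longest_run_cdf(n, k, p_b)


if __name__ == "__main__":
    p = Fraction(1, 4)  # equiprobable states A, B, C, D (an assumed example)
    print(longest_run_cdf(6, 2, p))
    print([waiting_time_pmf(n, 2, p) for n in range(1, 6)])
```

With four equiprobable states and k = 2, the first run "BB" can be completed at trial 2 only if both trials are B, so `waiting_time_pmf(2, 2, Fraction(1, 4))` equals 1/16. Using `Fraction` keeps the probabilities exact, which makes it easy to compare the output against a closed-form expression.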

Acknowledgement

The author would like to thank Dr. habil. Tommy Rene Jensen, whose comments led to significant improvements in this manuscript. This research was partially supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019R1A6A3A01090663, NRF-2020R1I1A1A01057479) and partially supported under the framework of the international cooperation program managed by the National Research Foundation of Korea (NRF-2022K2A9A1A01098016).
