An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases |
Karim, Md. Rezaul (Department of Computer Engineering, College of Electronics and Information, Kyung Hee University)
Rashid, Md. Mamunur (Department of Computer Engineering, College of Electronics and Information, Kyung Hee University)
Jeong, Byeong-Soo (Department of Computer Engineering, College of Electronics and Information, Kyung Hee University)
Choi, Ho-Jin (Department of Computer Science, Korea Advanced Institute of Science and Technology) |
1 | Chvatal V, Sankoff D. Longest common subsequences of two random sequences. J Appl Probab 1975;12:306-315. |
2 | Hirschberg DS. Algorithms for the longest common subsequence problem. J Assoc Comput Mach 1977;24:664-675. |
3 | Huo H, Stojkovic V. A suffix tree construction algorithm for DNA sequences. In: Proceeding of IEEE International Conference on Bioinformatics and Bioengineering (BIBE'07), 2007 Oct 14-17, Boston, MA, pp. 1178-1182. |
4 | Tata S, Hankins RA, Patel JM. Practical suffix tree construction. In: Proceeding of 30th International Conference on Very Large Data Bases (VLDB'04), 2004 Aug 29-Sep 3, Toronto, pp. 36-47. |
5 | Agrawal R, Srikant R. Fast algorithms for mining association rules. In: Proceeding of 20th International Conference on Very Large Data Bases (VLDB'94), 1994 Sep 12-15, Santiago de Chile, pp. 487-499. |
6 | Srikant R, Agrawal R. Mining sequential patterns: generalizations and performance improvements. In: Proceeding of 5th International Conference on Extending Database Technology (EDBT'96), 1996 Mar 25-29, Avignon, pp. 3-17. |
7 | Pei J, Han J, Mortazavi-Asl B, Chen Q, Dayal U, Hsu MC. PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceeding of IEEE International Conference on Data Engineering (ICDE'01), 2001 Apr 2-6, Heidelberg, pp. 215-224. |
8 | Pan J, Wang P, Wang W, Shi B, Yang G. Efficient algorithms for mining maximal frequent concatenate sequences in biological datasets. In: Proceeding of 5th International Conference on Computer and Information Technology (CIT'05), 2005 Sep 21-23, Shanghai, pp. 98-104. |
9 | Kang TH, Yoo JS, Kim HY. Mining frequent contiguous sequence patterns in biological sequences. In: Proceeding of 7th IEEE International Conference on Bioinformatics and Bioengineering (BIBE'08), 2008 Oct 8-10, Athens, pp. 723-728. |
10 | Zerin SF, Ahmed CF, Tanbeer SK, Jeong BS. A fast indexed-based contiguous sequential pattern mining technique in biological data sequences. In: Proceeding of 2nd International Conference on Emerging Databases (EBD'10), 2010 Aug 30-31, Jeju. |
11 | Appice A, Ceci M, Turi A, Malerba D. A parallel, distributed algorithm for relational frequent pattern discovery from very large data sets. Intell Data Anal 2011;15:69-88. |
12 | Lin MY, Lee SY. Fast discovery of sequential patterns through memory indexing and database partitioning. J Inf Sci Eng 2005;21:109-128. |
13 | Nguyen SN, Orlowska ME. A further study in the data partitioning approach for frequent itemsets mining. In: Proceeding of 17th Australasian Database Conference (ADC'06), 2006 Jan 16-19, Hobart, Tasmania, pp. 31-37. |
14 | Totad SG, Geeta RB, Prasanna CR, Santhosh NK, Reddy PV. Scaling data mining algorithms to large and distributed datasets. Intl J Database Manag Syst 2010;2:26-35. |