A proper folder recommendation technique using frequent itemsets for efficient e-mail classification  

Moon, Jong-Pil (KT Innotz)
Lee, Won-Suk (Dept. of Computer Science, Yonsei University)
Chang, Joong-Hyuk (Dept. of Computer & Information Technology, Daegu University)
E-mail has become an important means of communication and information sharing, and much effort has gone into classifying e-mails efficiently by their contents. E-mails vary widely in length and style, and the words used in them are often irregular. In addition, the criteria for classifying e-mails are subjective. As a result, conventional text classification techniques are difficult to apply effectively to e-mail classification. Commercial e-mail programs classify e-mails with a simple text filtering technique in the e-mail client. Previous studies on automatic e-mail classification have used the probability-based Naive Bayesian technique to improve classification accuracy, and most of them deal with e-mails in English. This paper proposes a personalized recommendation technique for e-mails in Korean that uses frequent pattern mining, a data mining technique. The proposed technique consists of two phases: pre-processing the e-mails in an e-mail folder, and generating a profile for that folder. The generated profile is used to classify an e-mail into the most appropriate folder according to the user's subjective criteria. An e-mail classification system that adopts the proposed technique is also implemented.
E-mail classification; Customized recommendation; Customized classification; Frequent itemsets; Document classification
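
The two phases described in the abstract (building a per-folder profile from frequent itemsets, then recommending a folder for a new e-mail) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual algorithm, so everything here is an assumption made for illustration: the function names, the `min_support` and `max_size` parameters, the brute-force counting (no Apriori-style pruning), and the scoring rule (summed support of matching itemsets). Pre-processing is also assumed to have already happened, so each e-mail is simply a list of words; extracting words from Korean text is not covered.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(docs, min_support=0.5, max_size=2):
    """Return {itemset: support} for word sets that occur in at least
    min_support of the given e-mails. Brute-force counting, for illustration."""
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))  # count each word once per e-mail
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    n = len(docs)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def build_profiles(folders, min_support=0.5, max_size=2):
    """Hypothetical folder profile: the frequent itemsets of the folder's e-mails."""
    return {
        name: frequent_itemsets(docs, min_support, max_size)
        for name, docs in folders.items()
    }


def recommend_folder(profiles, words):
    """Score each folder by the summed support of the profile itemsets that are
    fully contained in the e-mail (an assumed scoring rule), and pick the best."""
    present = set(words)
    scores = {
        name: sum(sup for itemset, sup in profile.items() if itemset <= present)
        for name, profile in profiles.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: each e-mail is an already pre-processed list of words.
folders = {
    "work": [
        ["meeting", "report", "deadline"],
        ["meeting", "budget", "report"],
        ["report", "deadline"],
    ],
    "family": [
        ["dinner", "weekend", "photos"],
        ["weekend", "trip", "photos"],
        ["dinner", "birthday"],
    ],
}

profiles = build_profiles(folders)
folder, scores = recommend_folder(profiles, ["report", "meeting", "tomorrow"])
print(folder)  # work
```

Because each profile is built only from the e-mails a user has already filed in that folder, the recommendation follows that user's own filing habits rather than a fixed, shared set of categories.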