Proceedings of the IEEK Conference (대한전자공학회:학술대회논문집), 2003.11b, pp. 255-258, 2003
Normalized Term Frequency Weighting Method in Automatic Text Categorization
자동 문서분류에서의 정규화 용어빈도 가중치방법
Abstract
This paper defines a Normalized Term Frequency weighting method based on the Box-Cox transformation and applies it to automatic text categorization. The Box-Cox transformation is a statistical transformation that brings data closer to a normal distribution. This paper uses it to propose a new term frequency weighting method. Because the normalized term frequency is computed differently for each term, unlike existing term frequency weighting methods, it is more general than fixed weighting schemes such as the logarithm or the root. The validity of the method was confirmed through experiments on 8,000 newspaper documents divided into four groups, which yielded high categorization accuracy in all cases.
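To make the idea concrete, the following is a minimal sketch of a Box-Cox-based term frequency weight, not the authors' implementation. The Box-Cox transform maps x > 0 to (x^λ - 1)/λ for λ ≠ 0 and to ln x for λ = 0, so the logarithm (λ = 0) and, up to an affine rescaling, the square root (λ = 0.5) are special cases. The abstract does not say how λ is chosen for each term, so the grid search over the standard Box-Cox profile log-likelihood, the +1 shift for zero counts, and the function names `box_cox`, `best_lambda`, and `normalized_tf` are all assumptions made for illustration.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if abs(lam) < 1e-12:
        return math.log(x)  # limiting case lam -> 0
    return (x ** lam - 1.0) / lam


def profile_log_likelihood(values, lam):
    """Standard Box-Cox profile log-likelihood (up to an additive constant)."""
    n = len(values)
    transformed = [box_cox(v, lam) for v in values]
    mean = sum(transformed) / n
    var = sum((t - mean) ** 2 for t in transformed) / n  # MLE variance
    if var <= 0:
        return float("-inf")  # degenerate sample, no information about lam
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, grid=None):
    """Pick lam from a grid by maximizing the profile log-likelihood.

    Assumption: the grid range [-2, 2] in steps of 0.1 is illustrative only.
    """
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))


def normalized_tf(raw_counts):
    """Return (lam, weights) for one term's counts across documents.

    Assumption: counts are shifted by +1 so that zero counts are valid input.
    """
    shifted = [c + 1 for c in raw_counts]
    lam = best_lambda(shifted)
    return lam, [box_cox(v, lam) for v in shifted]


if __name__ == "__main__":
    # Hypothetical counts of a single term in seven documents.
    lam, weights = normalized_tf([0, 1, 1, 2, 3, 10, 50])
    print(lam, [round(w, 3) for w in weights])
```

Because λ is estimated separately from each term's own count distribution, a heavily skewed term and a nearly uniform term receive different transformations, which is the sense in which the weighting is more general than a single fixed log or root.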
Keywords