Proceedings of the Korean Society for Language and Information Conference (한국언어정보학회:학술대회논문집)
- 2002.02a
- Pages 79-91
- 2002
A Deterministic Method for Structural Analysis of Compound Words in Japanese
- Han, Dongli (Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan) ;
- Ito, Takeshi (Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan) ;
- Furugori, Teiji (Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan)
- Published : 2002.02.01
Abstract
Structural analysis of compound words is a necessary and important process in natural language processing. We propose a corpus- and statistics-based method for the structural analysis of compound words in Japanese. We determine the structure of a compound word by using an Internet corpus and calculating the strength of word association among its constituent words. Experiments with compound words of 5, 6, 7, and 8 kanji show that our method works well and that it performs better than the methods of other comparable studies.
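As a rough illustration of the idea described in the abstract, the sketch below builds a binary structure for a compound word by repeatedly merging the adjacent pair of constituents with the strongest association. It is not the authors' algorithm: the abstract does not state the association measure or the corpus statistics used, so the pointwise mutual information score, the `analyze` function, and all counts in the toy example are assumptions made purely for illustration.

```python
import math


def pmi(pair_count, left_count, right_count, total):
    """Pointwise mutual information (an assumed association measure)."""
    if pair_count == 0 or left_count == 0 or right_count == 0:
        return float("-inf")  # no evidence of association
    return math.log2(pair_count * total / (left_count * right_count))


def analyze(words, unigram, bigram, total):
    """Greedily merge the most strongly associated adjacent pair.

    words   : constituent words of the compound, already segmented
    unigram : dict mapping a string to its (hypothetical) corpus frequency
    bigram  : dict mapping (left, right) strings to co-occurrence frequency
    total   : normalizing corpus size
    Returns nested tuples representing the binary structure.
    """
    # Each node is (subtree, surface string of that subtree).
    nodes = [(w, w) for w in words]
    while len(nodes) > 1:
        scores = [
            pmi(
                bigram.get((nodes[i][1], nodes[i + 1][1]), 0),
                unigram.get(nodes[i][1], 0),
                unigram.get(nodes[i + 1][1], 0),
                total,
            )
            for i in range(len(nodes) - 1)
        ]
        # Ties (including all -inf) resolve to the leftmost pair.
        best = max(range(len(scores)), key=scores.__getitem__)
        left, right = nodes[best], nodes[best + 1]
        nodes[best:best + 2] = [((left[0], right[0]), left[1] + right[1])]
    return nodes[0][0]


# Toy counts, invented for this example only (not taken from the paper).
unigram = {"自然": 1000, "言語": 800, "処理": 1200, "自然言語": 300}
bigram = {("自然", "言語"): 300, ("言語", "処理"): 100, ("自然言語", "処理"): 150}

structure = analyze(["自然", "言語", "処理"], unigram, bigram, total=1_000_000)
print(structure)
```

With these made-up counts, 自然 and 言語 are more strongly associated than 言語 and 処理, so the sketch groups them first and yields a left-branching structure. Each step makes a single irrevocable choice, which is one simple way to read "deterministic"; the actual procedure in the paper may differ.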