Ambiguity Resolution in Chinese Word Segmentation

  • Maosong, Sun (Tsinghua University and City University of Hong Kong)
  • T'sou, Benjamin-K. (City University of Hong Kong)
  • Published : 1995.02.01

Abstract

This paper proposes a new method for Chinese word segmentation, named Conditional F&BMM (Forward and Backward Maximal Matching), which incorporates both bigram statistics (i.e., mutual information and difference of t-test between Chinese characters) and linguistic rules for ambiguity resolution. The key characteristics of this model are the use of: (i) statistics that can be derived automatically from any raw corpus, and (ii) a rule base for disambiguation, consistent and of controlled size, that can be built up in a systematic way.
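To make the terms in the abstract concrete, here is a minimal sketch of plain forward and backward maximal matching, plus a pointwise mutual information score for a character pair. It is not the paper's Conditional F&BMM: the function names, the toy lexicon, the `max_len` limit, and the simplified probability estimates are all assumptions made for illustration. The idea shown is only that a span where the two scans disagree is a candidate ambiguity, which statistics and rules could then resolve.

```python
import math


def forward_mm(text, lexicon, max_len=4):
    """Greedy longest match, scanning left to right (hypothetical helper)."""
    words, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            # A single character is always accepted as a fallback.
            if n == 1 or piece in lexicon:
                words.append(piece)
                i += n
                break
    return words


def backward_mm(text, lexicon, max_len=4):
    """Greedy longest match, scanning right to left (hypothetical helper)."""
    words, j = [], len(text)
    while j > 0:
        for n in range(min(max_len, j), 0, -1):
            piece = text[j - n:j]
            if n == 1 or piece in lexicon:
                words.append(piece)
                j -= n
                break
    words.reverse()
    return words


def mutual_information(pair_count, left_count, right_count, total):
    """log2 of P(xy) / (P(x) P(y)); assumes all counts share one total."""
    p_xy = pair_count / total
    return math.log2(p_xy / ((left_count / total) * (right_count / total)))


# Toy lexicon, assumed for illustration only.
lexicon = {"研究", "研究生", "生命", "起源"}
sentence = "研究生命起源"

fmm = forward_mm(sentence, lexicon)   # ['研究生', '命', '起源']
bmm = backward_mm(sentence, lexicon)  # ['研究', '生命', '起源']
ambiguous = fmm != bmm                # True: the two scans disagree
```

In this toy case the scans disagree on whether 生 attaches to the left (研究生) or to the right (生命). A statistic such as the mutual information of the adjacent characters is one kind of evidence that could be used to choose between the two readings.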

Keywords