Abstract
Many isolated word recognition systems produce irrelevant words among their recognition results because they rely only on acoustic information or on a small amount of language information. In this paper, I propose a word similarity measure, based on Levenshtein distance, that is used to select (and remove) less common words from the recognition candidates. Word similarity is obtained from positional accuracy, which reflects frequency information along with character alignment information. This paper also discusses several techniques for improving the selection of disparate words: different loss values, phone accuracy based on confusion information, weighting of candidates by rank order, and partial comparisons. Through experiments, I found that the proposed methods are effective at removing heterogeneous words without loss of performance.