Classification of Form-based Documents by Partitioned Feature Extraction

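  • 정현철 (Department of Electrical and Computer Engineering, Yonsei University)
  • 이종현 (Department of Electrical and Computer Engineering, Yonsei University)
  • 최영우 (Department of Computer Science, Sookmyung Women's University)
  • 김재희 (Department of Electrical and Computer Engineering, Yonsei University)
  • Published: 1999.06.01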

Abstract

Form-based documents are easy to understand and quick to process, and are therefore used more widely than general documents. This paper proposes a method that classifies such documents using a minimal set of features, unlike earlier methods that use all available features. To this end, a document is first partitioned into areas of a fixed shape and size, and features are then extracted from each partitioned area. Because the partitioned areas differ in how significant their features are, the areas can also be ranked by significance. As a result, extracting features from the partitioned document shrinks the search area and thereby reduces the processing time.
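
The partition, extract, and rank steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which features are extracted or how areas are ranked, so the ink-density feature, the max-minus-min ranking score, the nearest-template classifier, and all function names (`partition_features`, `rank_cells`, `classify`) are assumptions made for illustration only.

```python
from typing import Dict, List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def partition_features(image: Image, rows: int, cols: int) -> List[float]:
    """Split the image into rows x cols rectangular cells and return
    one feature per cell (assumed here: fraction of ink pixels)."""
    height, width = len(image), len(image[0])
    features = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            ink = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            area = (y1 - y0) * (x1 - x0)
            features.append(ink / area if area else 0.0)
    return features


def rank_cells(templates: Dict[str, List[float]]) -> List[int]:
    """Order cell indices so that cells whose feature differs most
    across form types come first (assumed significance measure)."""
    n_cells = len(next(iter(templates.values())))

    def spread(i: int) -> float:
        values = [feats[i] for feats in templates.values()]
        return max(values) - min(values)

    return sorted(range(n_cells), key=spread, reverse=True)


def classify(features: List[float],
             templates: Dict[str, List[float]],
             cells: List[int]) -> str:
    """Return the template closest to the input, comparing only the
    selected cells so that the remaining areas are never inspected."""
    def distance(name: str) -> float:
        return sum((features[i] - templates[name][i]) ** 2 for i in cells)

    return min(templates, key=distance)


# Two toy 4x4 "forms": A has ink in the top-left area, B in the bottom-right.
form_a = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
form_b = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
templates = {
    "A": partition_features(form_a, 2, 2),
    "B": partition_features(form_b, 2, 2),
}

# Keep only the single most significant area.
best_cells = rank_cells(templates)[:1]

# An unseen document that resembles form A with one pixel missing.
unknown = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
label = classify(partition_features(unknown, 2, 2), templates, best_cells)
print(best_cells, label)
```

In this toy setup a single area is enough to separate the two forms, which mirrors the abstract's point that classifying on a few significant areas reduces the region that has to be searched.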

Keywords