• Title/Abstract/Keywords: classification of Korean characters

248 results (processing time: 0.031 s)

Classification of Korean Characters and Frequency of Continual Characters

  • 김국;정병용
    • Journal of the Ergonomics Society of Korea / Vol. 21, No. 2 / pp. 1-11 / 2002
  • This study examines the classification of Korean characters (alphabet letters) and their frequency data, both of which are essential to Korean information processing. We define a classification of characters using the concepts of a 'set of 2 parts' and a 'set of 3 parts', and we tabulate the frequencies of all combinations of two consecutive characters. These data can serve as basic input for tasks such as designing computer input devices.
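As a rough illustration of what a frequency table of two consecutive characters can look like, the sketch below counts adjacent letter pairs inside each word. It is not the authors' procedure: the helper name `jamo_bigram_counts`, the use of Unicode NFD normalization to split syllables into letters, and the choice to count pairs only within whitespace-separated words are all assumptions made for this example.

```python
import unicodedata
from collections import Counter


def jamo_bigram_counts(text):
    """Count pairs of consecutive Hangul letters (hypothetical helper).

    Assumption: letters are obtained by NFD normalization, which splits each
    precomposed Hangul syllable into conjoining jamo (U+1100..U+11FF).
    """
    counts = Counter()
    for word in text.split():
        # Keep only conjoining jamo; other characters are dropped.
        jamo = [ch for ch in unicodedata.normalize("NFD", word)
                if 0x1100 <= ord(ch) <= 0x11FF]
        # Pair each letter with the one that follows it in the same word.
        counts.update(zip(jamo, jamo[1:]))
    return counts


if __name__ == "__main__":
    for pair, n in jamo_bigram_counts("한글 한글").most_common():
        print(pair, n)
```

A table built this way on a large corpus could feed, for instance, a keyboard-layout study, though the paper's own character classes ('set of 2 parts', 'set of 3 parts') are not reproduced here.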

HANDWRITTEN HANGUL RECOGNITION MODEL USING MULTI-LABEL CLASSIFICATION

  • HANA CHOI
    • Journal of the Korean Society for Industrial and Applied Mathematics / Vol. 27, No. 2 / pp. 135-145 / 2023
  • As deep learning has developed, a variety of deep learning techniques have been introduced into handwriting recognition and have greatly improved performance. The accuracy of handwritten Hangul recognition has also improved significantly, but prior research has focused on recognizing 520 or 2,350 Hangul characters using the SERI95 or PE92 data. In the past, 2,350 Hangul characters were enough for most expressions, but as globalization progresses and information and communication technology develops, there are many cases where foreign words need to be written in Hangul. In this paper, we propose a model that uses multi-label classification to recognize and combine the initial consonant, medial vowel, and final consonant of a Korean syllable. Trained on PE92, a public dataset of Korean handwritten characters, it achieves a recognition accuracy of 98.38%. Although the model was trained on only 2,350 Hangul characters, it can also recognize characters outside that set.
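The idea of predicting the three parts of a syllable separately can be illustrated with the standard Unicode arithmetic for precomposed Hangul syllables. The sketch below only shows how a syllable maps to three label indices and back; it says nothing about the network in the paper, and the function names `decompose` and `compose` are hypothetical.

```python
# Unicode precomposed Hangul syllables start at U+AC00 and are laid out as
# 19 initial consonants x 21 medial vowels x 28 finals (index 0 = no final).
BASE = 0xAC00
N_INITIAL, N_MEDIAL, N_FINAL = 19, 21, 28


def decompose(syllable):
    """Map one Hangul syllable to (initial, medial, final) label indices."""
    index = ord(syllable) - BASE
    if not 0 <= index < N_INITIAL * N_MEDIAL * N_FINAL:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, N_MEDIAL * N_FINAL)
    medial, final = divmod(rest, N_FINAL)
    return initial, medial, final


def compose(initial, medial, final):
    """Combine three label indices back into one Hangul syllable."""
    return chr(BASE + (initial * N_MEDIAL + medial) * N_FINAL + final)


if __name__ == "__main__":
    labels = decompose("한")
    print(labels, compose(*labels))
```

Because the three indices are independent, a classifier with one output per part can in principle emit any of the 19 x 21 x 28 = 11,172 combinations, including syllables absent from a 2,350-character training set.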

Hangul Character Recognition Using Fuzzy Reasoning: Hangul Character Type Classification by Maximum Run Length Projection

  • 이근수;최형일
    • Korean Journal of Cognitive Science / Vol. 3, No. 2 / pp. 249-270 / 1992
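  • This paper proposes the maximum run length projection (MRLP) method for extracting features from input characters. The proposed method is robust to noise and extracts the required information as accurately and efficiently as possible. Hangul characters are numerous and structurally complex, and many of them closely resemble one another. We therefore apply fuzzy reasoning to the extracted features to improve the type classification rate. In experiments on 917 frequently used printed Hangul characters, the method achieved a classification rate of 98.58%.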
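The abstract does not define MRLP. One plausible reading, assumed here, is that each row (or column) of a binary character image is summarized by the length of its longest unbroken run of stroke pixels instead of by the pixel count used in an ordinary projection. The sketch below implements that reading only; the function name and the 0/1 list-of-rows image format are choices made for this example.

```python
def max_run_length_projection(image, axis="rows"):
    """Longest run of stroke pixels (value 1) per row or per column.

    Assumption: `image` is a list of equal-length rows of 0/1 values.
    This is one interpretation of MRLP, not the paper's exact definition.
    """
    lines = image if axis == "rows" else list(zip(*image))
    profile = []
    for line in lines:
        best = run = 0
        for pixel in line:
            run = run + 1 if pixel else 0  # extend the run or reset it
            best = max(best, run)
        profile.append(best)
    return profile


if __name__ == "__main__":
    glyph = [
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 0],
    ]
    print(max_run_length_projection(glyph, "rows"))
    print(max_run_length_projection(glyph, "cols"))
```

Under this reading, isolated noise pixels add at most 1 to a profile value, whereas a long stroke dominates it, which is consistent with the abstract's claim of noise robustness.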

A Study on the Printed Korean and Chinese Character Recognition

  • 김정우;이세행
    • The Journal of Korean Institute of Communications and Information Sciences / Vol. 17, No. 11 / pp. 1175-1184 / 1992
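  • This paper studies methods for distinguishing and recognizing printed Hangul and Chinese characters, with the aim of recognizing Korean documents that contain Chinese characters. The proposed Hangul/Chinese discrimination method uses the structural features of the vertical and horizontal vowels of Hangul. Hangul characters are classified into six types. For each type, the letters are separated by extracting the vowel first, without a thinning step, and the separated consonants are recognized using a modified crossing-distance feature. Chinese characters are classified over the whole target set using the mean stroke crossing count, and recognized using the stroke crossing count and black pixel ratio features. Hangul/Chinese discrimination achieved a classification rate of 90.5%. For Hangul recognition, a type classification rate of 90.0% was obtained on 2,512 target characters in the Myeongjo typeface, and a recognition rate of 92.2% on 1,278 test characters. For Chinese character recognition, classifying the 4,585 target characters left at most 124 characters in the most densely populated interval, a reduction to about 1/40, and the recognition experiment yielded a rate of 89.2%.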
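Two of the features named above, the stroke crossing count and the black pixel ratio, can be sketched on a binary image as follows. The abstract gives no formulas, so the definitions here are assumptions: a crossing is counted as a background-to-stroke transition along a row, and the ratio is stroke pixels over all pixels. Function names are hypothetical.

```python
def crossing_counts(image):
    """Per row, count background-to-stroke transitions (assumed definition)."""
    counts = []
    for row in image:
        previous, crossings = 0, 0
        for pixel in row:
            if pixel and not previous:
                crossings += 1  # a new stroke starts on this scan line
            previous = pixel
        counts.append(crossings)
    return counts


def black_pixel_ratio(image):
    """Fraction of pixels that belong to strokes (value 1)."""
    total = sum(len(row) for row in image)
    return sum(sum(row) for row in image) / total


if __name__ == "__main__":
    glyph = [
        [1, 0, 1, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0],
    ]
    counts = crossing_counts(glyph)
    print(counts, sum(counts) / len(counts), black_pixel_ratio(glyph))
```

The mean of the per-row counts gives a single number per character, which is one way a large character set could be split into coarse intervals before finer recognition.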

Classification of Characters in Movie by Correlation Analysis of Genre and Linguistic Style

  • You, Eun-Soon;Song, Jae-Won;Park, Seung-Bo
    • Journal of the Korea Society of Computer and Information / Vol. 24, No. 1 / pp. 49-55 / 2019
  • Despite the remarkable development of AI, character dialogue created by AI is unnatural compared with human-written dialogue and cannot properly reveal a character's personality. The purpose of this paper is to classify characters by linguistic style and to investigate how specific linguistic styles relate to personality. We analyzed the dialogue of 92 characters selected from a total of 60 movies in four genres (romantic comedy, action, comedy, and horror/thriller) using Linguistic Inquiry and Word Count (LIWC), a text analysis program. As a result, we confirmed that each genre has its own language style. In particular, emotional tone and analytical thinking are the two most important features for classification: precision and recall are over 78% for romantic comedy and action. For comedy and horror/thriller, however, precision and recall were 66% and 50%, so the two features contributed less to classification than in the romantic comedy and action genres. Characters in romantic comedies, which deal with affection between men and women, show much higher emotional tone than analytical thinking. Characters in the action genre, who need rational judgment to carry out missions, show much higher analytical thinking than emotional tone. In comedy and horror/thriller, we found that there are many kinds of characters and that characters often change their personalities during the story.

A Study on the Recognition System of the Il-Pa Stenographic Character Images using EBP Algorithm

  • Kim, Sang-Keun;Park, Gwi-Tae
    • KIEE International Transaction on Systems and Control / Vol. 12D, No. 1 / pp. 27-32 / 2002
  • In this paper, we study the applicability of neural networks to the recognition of Korean stenographic character images. Classification is the greatest merit neural networks have shown in the various fields where they have been applied so far, and we apply it to stenographic character recognition, a relatively simple classification task. Existing Korean stenographic recognition algorithms have a quantitative problem: despite the simplicity of the structure, many basic characters cannot be classified into a type. They also have a qualitative problem: characters are hard to classify because of the delicacy of the character forms. Although the experiment was limited to the basic characters, the result shows that stenographic characters can be recognized effectively by a neural network system. The system achieved an average recognition rate of 90.86%.

Type Classification of Korean Characters Considering Relative Type Size

  • 김병기
    • Journal of the Korea Society of Computer and Information / Vol. 11, No. 6 / pp. 99-106 / 2006
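  • For recognizing combinational scripts with a large character set, such as Hangul, type classification that reduces the problem space is a great help. Because previous studies defined Hangul types mainly from the composition principles of Hangul, characters with compound vowels were hard to classify accurately, and characters with a final consonant, a relatively large group, were not subdivided finely enough, which made it hard to distribute the problem space. To address these problems, this paper proposes a type classification method that first extracts horizontal vowels, which can be extracted stably using the horizontal projection profile, and then uses the horizontal projection profile and connected components to assign the final consonant of each character that has one to one of five groups. Whereas existing methods have 6 or 15 types with imbalanced sizes, the proposed method has 19 types that allow balanced and stable classification. A classification system was built from seven commercial fonts covering the 1,000 most frequently used Hangul characters, and a type classification experiment on 30,614 characters scanned from monthly magazines confirmed that the proposed method is effective for Hangul characters with diverse fonts and a large character set.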

Adaptive Recognition System of the Il-Pa Stenographic Character Images by Using Line Scan Method and BEP

  • Kim, Sangkeun;Lee, Sungoh;Park, Gwitae
    • Institute of Control, Robotics and Systems: Conference Proceedings / Proceedings of the 15th Conference of the Institute of Control, Robotics and Systems, 2000 / p. 354 / 2000
  • In this paper, we study the applicability of neural networks to the recognition of Korean stenographic character images. Classification is the greatest merit neural networks have shown in the various fields where they have been applied so far, and we apply it to stenographic character recognition, a relatively simple classification task. Existing Korean stenographic recognition algorithms have a quantitative problem: despite the simplicity of the structure, many basic characters cannot be classified into a type. They also have a qualitative problem: characters are hard to classify because of the delicacy of the character forms. Although the experiment was limited to the basic characters, the result shows that stenographic characters can be recognized effectively by a neural network system. The system achieved an average recognition rate of 90.86%.

Varietal Classification on the Basis of Cluster Analysis in Local Tobacco

  • 안대진;김윤동
    • Journal of the Korean Society of Tobacco Science / Vol. 4, No. 1 / pp. 37-42 / 1982
  • Korean local and introduced varieties were classified by cluster analysis of correlation and taxonomic distance based on nineteen growth characters. 1. The thirty-six varieties can be classified into three groups (I, II, III) by WVGM (weighted variable group method). 2. The major characters for classifying cultivars were days to flowering, number of leaves, leaf length, stem diameter, and width of midrib; these five characters seemed to be useful in monothetic classification. 3. Korean varieties were similar to oriental varieties, and Japanese varieties to Taiwanese ones. 4. WVGM was more accurate and meaningful than classification by WPGM (weighted paired group method) and the reticulate diagram of correlation. 5. Characteristics of each group: Group I was associated with many leaves, late maturity, and a broad leaf type; Group II with a medium number of leaves, late maturity, and a narrow leaf type; and Group III with few leaves, early maturity, and a medium leaf type.

Recognition of Raised Characters for Automatic Classification of Rubber Tires

  • 함영국;강민석;정홍규;박래홍;박귀태
    • Journal of the Korean Institute of Telematics and Electronics B / Vol. 31B, No. 4 / pp. 77-87 / 1994
  • This paper presents a method for recognizing raised alphanumeric markings on rubber tires so that the tires can be classified automatically. Raised markings on rubber tires have different characteristics from printed characters. In the preprocessing step, we first determine the rotation angle using the Hough transform and align the markings, then separate each character using vertical and horizontal projections. In the recognition step, we use several features, such as character width, cross points, partial projection, and a distance feature, to recognize characters hierarchically. Computer simulation results show that the proposed system can be successfully applied to industrial automation of rubber tire classification.
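The projection-based separation step can be sketched as below for an already aligned binary image: columns with no stroke pixels are treated as gaps between characters. This is a generic illustration and not the paper's implementation; the function name `split_columns` and the 0/1 list-of-rows image format are assumptions, and the Hough-transform alignment is left out.

```python
def split_columns(image):
    """Return (start, end) column ranges, end exclusive, that contain ink.

    Assumption: `image` is a list of equal-length rows of 0/1 values and the
    text line has already been rotated upright.
    """
    # Vertical projection: number of stroke pixels in each column.
    profile = [sum(column) for column in zip(*image)]
    segments, start = [], None
    for x, value in enumerate(profile):
        if value and start is None:
            start = x                     # a character begins
        elif not value and start is not None:
            segments.append((start, x))   # a gap ends the character
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


if __name__ == "__main__":
    line = [
        [1, 1, 0, 0, 1],
        [1, 0, 0, 0, 1],
    ]
    print(split_columns(line))
```

Applying the same function to the transposed crop of each segment would give the matching horizontal (row) bounds.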
