Analysis of the Korean Tokenizing Library Module

Lee, Jae-kyung;Seo, Jin-beom;Cho, Young-bok;

Proceedings of the Korean Institute of Information and Commucation Sciences Conference (한국정보통신학회:학술대회논문집)

2021.05a
/
Pages.78-80
/
2021

The Korea Institute of Information and Commucation Engineering (한국정보통신학회)

Analysis of the Korean Tokenizing Library Module

한글 토크나이징 라이브러리 모듈 분석

Lee, Jae-kyung (Daejeon University) ;
Seo, Jin-beom (Daejeon University) ;
Cho, Young-bok (Daejeon University)

이재경 (대전대학교) ;
서진범 (대전대학교) ;
조영복 (대전대학교)

Published : 2021.05.03

PDF

Download PDF

⟨ Previous Next ⟩

Abstract

Currently, research on natural language processing (NLP) is rapidly evolving. Natural language processing is a technology that allows computers to analyze the meanings of languages used in everyday life, and is used in various fields such as speech recognition, spelling tests, and text classification. Currently, the most commonly used natural language processing library is NLTK based on English, which has a disadvantage in Korean language processing. Therefore, after introducing KonLPy and Soynlp, the Korean Tokenizing libraries, we will analyze morphology analysis and processing techniques, compare and analyze modules with Soynlp that complement KonLPy's shortcomings, and use them as natural language processing models.

현재 자연어 처리(NLP)에 대한 연구는 급속히 발전하고 있다. 자연어 처리는 인간이 일상생활에서 사용하는 언어의 의미를 분석하여 컴퓨터가 처리할 수 있도록 하는 기술로 음성인식, 맞춤법 검사, 텍스트 분류 등 여러 분야에 사용하고 있다. 현재 가장 많이 사용되는 자연어처리 라이브러리는 영어를 기준으로 한 NLTK로 한글처리에 단점을 가지고 있다. 따라서 본 논문에서는 한글 토크나이징(Tokenizing) 라이브러리인 KonLPy와 Soynlp를 소개 후 형태소 분석 및 처리 기법을 분석하고, KonLPy의 단점을 보완한 Soynlp와의 모듈을 비교·분석하여 향후 의료분야에 적합한 자연어 처리 모델로 활용하고자 한다.

Keywords

Acknowledgement

This work was supported by the National Research Foundation of Korea(NRF) grant funded by the Korea government(MSIT) (No. 2018R1C1B5083789).

Proceedings of the Korean Institute of Information and Commucation Sciences Conference (한국정보통신학회:학술대회논문집)

Analysis of the Korean Tokenizing Library Module

한글 토크나이징 라이브러리 모듈 분석

Abstract

Keywords

Acknowledgement

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)