• Title/Summary/Keyword: Korean morphological analyzer

Analysis of Reconstituted Tobacco Products by Characterizing Morphological Properties of Major Structure Materials (국내외산 판상엽 구성물질의 형태적 특성 비교)

  • Sung Yong-Joo;Han Young-Lim;Kim Sam-Gon;Kim Geun-Su;Joo Jeon-Hyun;Song Tae-Won
    • Journal of the Korean Society of Tobacco Science / v.27 no.2 / pp.189-194 / 2005
  • The morphological properties of the structural materials of domestic and foreign reconstituted tobacco products (RTPs) were investigated using a Bauer-McNett classifier and an image analyzer. Fiber classification showed that the fraction of larger structural materials was greater in the domestic RTP than in the two foreign RTPs. The fine fraction was also larger in the domestic RTP than in the two foreign RTPs. Images of each fraction showed that scrap in the foreign RTPs retained its original shape, which was rare in the domestic RTP fractions. These results suggest that the raw materials in the foreign RTP process might be treated separately according to their mechanical and morphological properties, which could reduce the generation of fines and increase the efficiency of raw material treatment.

An Experimental Approach of Keyword Extraction in Korean-Chinese Text (국한문 혼용 텍스트 색인어 추출기법 연구 『시사총보』를 중심으로)

  • Jeong, Yoo Kyung;Ban, Jae-yu
    • Journal of the Korean Society for Information Management / v.36 no.4 / pp.7-19 / 2019
  • The aim of this study is to develop a technique for keyword extraction from Korean-Chinese mixed-script text of the modern period. As a possible method, we considered using a Korean morphological analyzer together with the particles of classical Chinese. We applied our method to the journal "Sisachongbo," employing proper-noun dictionaries and a list of stop words to extract index terms. The results show that our system achieved better recall and precision than a Chinese morphological analyzer. This study is the first to develop an automatic indexing system for traditional Korean-Chinese mixed text.

A Study on the Morphological Analysis of Sperm (정자의 형태학적 특성 분석에 관한 연구)

  • Paick, Jae-Seung;Jeon, Seong-Soo;Kim, Soo-Woong;Yi, Won-Jin;Park, Kwang-Suk
    • Clinical and Experimental Reproductive Medicine / v.24 no.2 / pp.153-165 / 1997
  • Semen analysis is the most important test in male reproductive health, fertility assessment, and IVF (in-vitro fertilization). It can be divided into the analysis of sperm concentration, motility, and morphology. Existing methods of semen analysis have concentrated on sperm motility. To provide more useful and precise solutions for clinical problems such as infertility, semen analysis must include sperm morphological analysis. However, the traditional tools for semen analysis are subjective, imprecise, inaccurate, difficult to standardize, and difficult to reproduce. Therefore, drawing on advances in microcomputers and image processing techniques, we developed a new sperm morphology analyzer to overcome these problems. In this study, the agreement on percent normal morphology was studied between different observers and a computerized sperm morphology analyzer on a slide-by-slide basis using strict criteria. Slides from 30 different patients from the SNUH andrology laboratory were selected randomly. Microscopic fields and sperm cells were chosen randomly and percent normal morphology was recorded. The ability of the sperm morphology analyzer to repeat the same reading for normal and abnormal cells was also studied. The results showed that there was no significant bias between two experienced observers. The limits of agreement were 4.1% to -3.8%, and the Pearson correlation coefficient between readers was 0.79. The same findings were reported between the manual method and the sperm morphology analyzer. In these experiments the slides were stained by two different methods, PAP and Diff-Quik. The limits of agreement were 7.2% to -5.7% and 6.0% to -6.3%, respectively, and the Pearson correlation coefficients were 0.76 and 0.91, respectively. The limits of agreement were tighter below 20% normal forms. In the repeatability experiments, 52 cells stained by the PAP and Diff-Quik methods were analyzed three times in succession. Estimating pairwise agreement, the kappa statistics for the pairs were 0.76, 0.81, and 0.86, and 0.75, 0.88, and 0.88, respectively. In this study it was shown that there was good agreement between manual and computerized assessment of normal and abnormal cells. The repeatability and per-slide agreement of the computerized sperm morphology analyzer were excellent. The computer's ability to classify normal morphology per slide is promising. Based on the results obtained, this system can be of clinical value both in andrology laboratories and IVF units.
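
The agreement statistics named in this abstract can be illustrated with a short sketch. It assumes the limits of agreement follow the usual Bland-Altman definition (mean difference ± 1.96 standard deviations) and that the kappa statistic is Cohen's kappa; the abstract does not state either, and the function names and sample readings below are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b):
    """Bland-Altman 95% limits of agreement for paired readings a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias - spread, bias + spread


def cohen_kappa(labels1, labels2):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    n = len(labels1)
    categories = set(labels1) | set(labels2)
    # Fraction of items on which the two ratings agree.
    observed = sum(x == y for x, y in zip(labels1, labels2)) / n
    # Agreement expected by chance from each rating's marginal frequencies.
    expected = sum(
        (labels1.count(c) / n) * (labels2.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Hypothetical data: percent normal morphology per slide from two readers,
# and normal/abnormal calls for four cells from two repeated runs.
reader_a = [10.0, 12.0, 14.0]
reader_b = [9.0, 12.0, 15.0]
run_1 = ["normal", "normal", "abnormal", "abnormal"]
run_2 = ["normal", "abnormal", "abnormal", "abnormal"]

print(limits_of_agreement(reader_a, reader_b))
print(cohen_kappa(run_1, run_2))
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why it is preferred over a simple match rate when one class (here, abnormal cells) dominates.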

Automatic Word Spacing Using Raw Corpus and a Morphological Analyzer (말뭉치와 형태소 분석기를 활용한 한국어 자동 띄어쓰기)

  • Shim, Kwangseob
    • Journal of KIISE / v.42 no.1 / pp.68-75 / 2015
  • This paper proposes a method for the automatic word spacing of unsegmented Korean sentences. In our method, eojeol monograms are used for word spacing, as opposed to the syllable n-grams used in previous studies. The use of a Korean morphological analyzer is limited to the correction of typical word spacing errors. Our method achieves 98.06% syllable accuracy and 94.15% eojeol recall under 10-fold cross-validation on the Sejong corpus, after filtering out non-hangul eojeols. The processing rate is 250K eojeols, or 1.8 MB, per second on a typical personal computer. Because syllable accuracy and eojeol recall are related to the size of the eojeol dictionary, better performance is expected with a bigger corpus.
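
To make the idea of spacing with an eojeol dictionary concrete, here is a minimal sketch. It is a generic unigram dynamic-programming segmenter, not the author's algorithm (which the abstract does not describe), and it omits the morphological-analyzer correction step. The function name `space_sentence`, the `max_len` limit, and the toy counts are all assumptions made for illustration.

```python
import math


def space_sentence(text, eojeol_counts, max_len=10):
    """Split unsegmented text into the most probable sequence of known eojeols."""
    total = sum(eojeol_counts.values())
    # best[i] holds (log-probability, start of last eojeol) for text[:i].
    best = [(0.0, 0)] + [(-math.inf, 0)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece not in eojeol_counts:
                continue
            score = best[start][0] + math.log(eojeol_counts[piece] / total)
            if score > best[end][0]:
                best[end] = (score, start)
    if best[-1][0] == -math.inf:
        return text  # no full segmentation from the dictionary; leave as-is
    # Walk the back-pointers from the end of the text to recover the eojeols.
    pieces = []
    end = len(text)
    while end > 0:
        start = best[end][1]
        pieces.append(text[start:end])
        end = start
    return " ".join(reversed(pieces))


# Toy eojeol frequencies (hypothetical, not taken from the Sejong corpus).
counts = {"나는": 5, "학교에": 3, "간다": 4, "나": 2, "는": 1}
print(space_sentence("나는학교에간다", counts))
```

A sketch like this also shows why dictionary size matters: any eojeol missing from `counts` blocks every segmentation that passes through it, so coverage grows with the corpus.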

Analyzer to Identify Phrases and the Functional Roles in Sentences: Its Architectural Aspects

  • Alam, Yukiko Sasaki
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.67-75 / 2007
  • This paper presents the architectural aspects of a phrase analyzer that attempts to recognize phrases and identify their functional roles in sentences in formal Japanese documents. Since the object of interest is a phrase, the current system, designed in an object-oriented architecture, contains a Phrase class. It makes use of a linguistic generalization about languages with Case markers: a phrase, whether a noun phrase, a verb phrase, a postposition (or preposition) phrase, or a clause phrase, can be separated into content and function components. The system works without a dictionary, drawing on the orthographic information of the words to parse, and also contains a class that identifies the types of characters, a class representing grammar, and a class playing the role of a controller. The system has a simple and intuitive structure, externally and internally, and is therefore easy to modify and extend.
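
As a rough illustration of dictionary-free analysis from orthography alone, the sketch below classifies characters by Unicode block and treats a trailing run of hiragana as the function component of a phrase. This heuristic, the `char_type` and `split_phrase` names, and the fields of `Phrase` are assumptions for illustration; the abstract does not specify how the actual system makes the split.

```python
from dataclasses import dataclass


def char_type(ch: str) -> str:
    """Classify one character by its Unicode block."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"


@dataclass
class Phrase:
    content: str   # the lexical part, e.g. a noun written in kanji
    function: str  # the grammatical part, e.g. a Case marker in hiragana


def split_phrase(text: str) -> Phrase:
    """Split a phrase at the start of its trailing hiragana run."""
    i = len(text)
    while i > 0 and char_type(text[i - 1]) == "hiragana":
        i -= 1
    return Phrase(content=text[:i], function=text[i:])


print(split_phrase("東京で"))
print(split_phrase("学校を"))
```

The heuristic is deliberately naive: a phrase written entirely in hiragana ends up with an empty content component, which is the kind of case a grammar class would have to resolve.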

A Rule-Based Analysis from Raw Korean Text to Morphologically Annotated Corpora

  • Lee, Ki-Yong;Markus Schulze
    • Language and Information / v.6 no.2 / pp.105-128 / 2002
  • Morphologically annotated corpora are the basis for many tasks of computational linguistics. Most current approaches use statistically driven methods of morphological analysis, which provide only POS tags. While this is sufficient for some applications, many others need a rule-based full morphological analysis that also yields lemmatization and segmentation. This work thus aims at (1) introducing a rule-based Korean morphological analyzer called Kormoran, based on the principle of linearity, which prohibits any combination of left-to-right and right-to-left analysis as well as backtracking, and (2) showing how it can be used as a POS tagger by adopting an ordinary preprocessing technique and by filtering out irrelevant morpho-syntactic information from the analyzed feature structures. It is shown that, besides providing a basis for subsequent syntactic or semantic processing, full morphological analyzers like Kormoran have greater power to resolve ambiguities than simple POS taggers. The focus of our present analysis is on Korean text.

Keyword Retrieval-Based Korean Text Command System Using Morphological Analyzer (형태소 분석기를 이용한 키워드 검색 기반 한국어 텍스트 명령 시스템)

  • Park, Dae-Geun;Lee, Wan-Bok
    • Journal of the Korea Convergence Society / v.10 no.2 / pp.159-165 / 2019
  • Speech recognition based on deep learning has begun to be applied in commercial products, but it is still difficult to use in VR content because there is no easy and efficient way to process the recognized text after the speech recognition module. In this paper, we propose a Korean Language Command System that can efficiently recognize and respond to Korean speech commands. The system consists of two components: a morphological analyzer that analyzes sentence morphemes, and a retrieval-based model of the kind usually used to develop chatbot systems. Experimental results show that the proposed system requires only 16% of the commands to achieve the same level of performance as the conventional string comparison method. Furthermore, when working with the Google Cloud Speech module, it achieved a success rate of 60.1%. These results show that the proposed system is more efficient than the conventional string comparison method.
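
The two-component design can be sketched as follows. The `extract_keywords` function is only a stand-in for a real morphological analyzer (it strips a handful of common particles from whitespace-separated tokens), and the Jaccard scoring, the command table, and all names are hypothetical; the abstract does not specify the retrieval model used.

```python
# Stand-in for a morphological analyzer: a few common Korean particles.
PARTICLES = ("으로", "을", "를", "이", "가", "은", "는", "에", "로")


def extract_keywords(sentence):
    """Return the set of tokens with one trailing particle stripped."""
    keywords = set()
    for token in sentence.split():
        for particle in PARTICLES:  # longer particles are listed first
            if token.endswith(particle) and len(token) > len(particle):
                token = token[: -len(particle)]
                break
        keywords.add(token)
    return keywords


def match_command(sentence, commands):
    """Pick the command whose keyword set best overlaps the sentence (Jaccard)."""
    spoken = extract_keywords(sentence)
    best_name, best_score = None, 0.0
    for name, keywords in commands.items():
        score = len(spoken & keywords) / len(spoken | keywords)
        if score > best_score:
            best_name, best_score = name, score
    return best_name  # None when no command shares a keyword


# Hypothetical command table: one keyword set per command.
commands = {
    "open_door": {"문", "열어"},
    "turn_on_light": {"불", "켜"},
}
print(match_command("문을 열어 줘", commands))
```

Matching on keyword sets rather than whole strings means a single entry covers many phrasings of the same command, which is one plausible reading of why the proposed system needs far fewer registered commands than string comparison.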

Development of Korean Sign Language Generation System using TV Caption Signal (TV 자막 신호를 이용한 한글 수화 발생 시스템의 개발)

  • Kim, Dae-Jin;Kim, Jung-Bae;Jang, Won;Bien, Zeung-Nam
    • Journal of the Institute of Electronics Engineers of Korea CI / v.39 no.5 / pp.32-44 / 2002
  • In this paper, we propose a TV caption-based KSL (Korean Sign Language) generation system. The caption signal is first transmitted to a PC through a TV caption decoder. Next, the caption signal is segmented into meaning units by a morphological analyzer that takes the specific characteristics of Korean sign language into account. Finally, a 3D KSL generation system renders the transformed morphological information as 3D graphics. Specifically, we propose a morphological analyzer with many pre-processing techniques for real-time capability. The developed system was applied to real TV caption programs. Based on use by deaf viewers, we conclude that our system is sufficiently usable compared to conventional TV caption programs.

Part-Of-Speech Tagging and the Recognition of the Korean Unknown-words Based on Machine Learning (기계학습에 기반한 한국어 미등록 형태소 인식 및 품사 태깅)

  • Choi, Maeng-Sik;Kim, Hark-Soo
    • The KIPS Transactions: Part B / v.18B no.1 / pp.45-50 / 2011
  • Unknown morpheme errors in Korean morphological analysis are divided into two types: errors in which a morphological analyzer fails entirely to return any morpheme sequence, and errors in which it returns incorrect combinations of known morphemes. Most previous unknown morpheme estimation techniques have focused only on the former. This paper proposes an unknown morpheme estimation method that can handle both types of error. The proposed method uses an SVM (Support Vector Machine) to detect Eojeols (Korean spacing units) that may include unknown morpheme errors. Then, using CRFs (Conditional Random Fields), it segments the detected Eojeols into morphemes and annotates the segmented morphemes with new POS tags. In the experiments, the proposed method outperformed the conventional method based on the longest matching of functional words. The experimental results indicate that the second type of error should be dealt with in order to improve the performance of Korean morphological analysis.