• Title/Summary/Keyword: language processing

Korean Named Entity Recognition Based on Supervised Learning Using Named Entity Construction Principles (개체명 구성 원리를 이용한 교사학습 기반의 한국어 개체명 인식)

  • Hwang, Yi-Gyu;Lee, Hyun-Sook;Chung, Eui-Sok;Yun, Bo-Hyun;Park, Sang-Kyu
    • Annual Conference on Human and Language Technology / 2002.10e / pp.111-117 / 2002
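  • Named entity recognition plays an important role in improving the performance of question answering (QA), information extraction (IE), and text mining systems. This paper describes Korean named entity recognition based on supervised learning. In Korean, many named entities consist of more than one word; dependency relations exist among the words that make up a named entity, and contextual dependencies also hold between a named entity and the words around it. To learn variable-length named entities and their surrounding context, we used a trigram-based HMM, and to alleviate the data sparseness problem we trained on sub-entity types rather than on lexical items. In experiments on newspaper articles in the economics domain, the trained named entity recognition system achieved 84.4% precision and 90.9% recall.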
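
The abstract above reports precision and recall but no combined score. As a rough illustration only, the sketch below derives the F1 score (the harmonic mean of the two) from the reported figures. The function name `f1_score` is an illustrative choice, and the resulting value is computed here, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract (84.4% precision, 90.9% recall).
precision = 0.844
recall = 0.909

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.875"
```

The harmonic mean is dominated by the lower of the two scores, which is why the result sits closer to the precision than to the recall.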

Effects of phonological awareness and phonological processing on language skills in 4- to 6-year old children with and without language delay (4~6세 일반아동 및 언어발달지연 아동의 음운인식 및 음운처리 능력이 언어 능력에 미치는 영향)

  • Kim, Shinyoung;Son, Jinkyeong;Yim, Dongsun
    • Phonetics and Speech Sciences / v.12 no.1 / pp.51-63 / 2020
  • Phonological awareness is a metalinguistic awareness ability of phonology and is known to predict language skills, such as reading and vocabulary skills. The purpose of this study was to investigate the relationship between phonological awareness, phonological processing, and language skills in 4- to 6-year-old typically developing (TD) children and children with language delay (LD). A total of 32 children (TD=18, LD=15) participated in this study. They performed a phonological awareness task consisting of counting, deletion, and discrimination at the syllable level. Nonword Repetition, Digit Backward, the Receptive & Expressive Vocabulary Test, and a Grammaticality Judgment Task were administered to analyze the correlation between phonological awareness, phonological processing, and language ability. A multiple stepwise regression analysis was performed to identify the phonological awareness subtasks that predict language ability. In the TD group, the syllable categorization task significantly predicted receptive vocabulary and performance on the Grammaticality Judgment Task. In the LD group, the syllable counting task significantly predicted receptive vocabulary, expressive vocabulary, and performance on the Grammaticality Judgment Task. Phonological awareness performance differed significantly between the two groups, and the correlation and regression analyses yielded different results for each group. Phonological awareness performance significantly predicted the language ability of each group, suggesting the importance of the metalinguistic awareness ability of phonology.

Anaphora Resolution System for Natural Language Requirements Document in Korean based on Syntactic Structure (한국어 자연어 요구문서에서 구문 구조 기반의 조응어 처리 시스템)

  • Park, Ki-Seon;An, Dong-Un;Lee, Yong-Seok
    • The KIPS Transactions:PartB / v.17B no.3 / pp.255-262 / 2010
  • When a system is developed, requirements analysts write a requirements document, which specifiers then translate into formal specifications. If a formal specification could be generated automatically from a natural language requirements document, system development cost and system faults caused by experts' misunderstandings would decrease. Pronouns can be classified as personal or demonstrative. Personal pronouns almost never occur in requirements documents, so we focus on determining the antecedents of demonstrative pronouns. Finding the antecedent of a demonstrative pronoun is very important for accurate automatic analysis of requirements documents and for eliciting formal requirements from them through natural language processing. The final goal of this research is to automatically generate formal specifications from natural language requirements documents. To this end, this paper, building on previous research [3], proposes an anaphora resolution system that uses natural language processing to determine the antecedents of pronouns in Korean natural language requirements documents, together with heuristic rules for implementing the system. In experiments on ten requirements documents, the system achieved 92.45% recall and 69.98% precision.

Discriminator of Similar Documents Using Syntactic and Semantic Analysis (구문의미분석를 이용한 유사문서 판별기)

  • Kang, Won-Seog;Hwang, Do-Sam;Kim, Jung H.
    • The Journal of the Korea Contents Association / v.14 no.3 / pp.40-51 / 2014
  • Owing to the importance of document copyright, the need to detect document duplication and plagiarism is increasing. Many studies have sought to meet this need, but duplicate detection remains difficult because of technological limitations in natural language processing. This paper designs and implements a discriminator of similar documents based on natural language processing techniques. The system discriminates similar documents using morphological analysis, syntactic analysis, and weights for low-frequency words and idioms. To evaluate the system, we analyze the correlation between human discrimination and term-based discrimination, and between human discrimination and the proposed discrimination. This analysis shows that the proposed discrimination needs improving. Future research should define document types and improve the processing technique appropriate for each type.
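
The abstract above compares the proposed system against "term-based discrimination" without describing that baseline. As an assumption-laden sketch of what such a baseline might look like, the code below scores two documents by the cosine similarity of their raw term-frequency vectors. The whitespace-and-punctuation tokenizer, the function names, and the sample sentences are all illustrative and are not taken from the paper, which uses morphological and syntactic analysis.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Count lowercase word tokens (a crude stand-in for morphological analysis)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, for illustration only.
doc1 = "the system detects duplicated documents"
doc2 = "the system detects plagiarized documents"
doc3 = "reading times were measured at each segment"

print(f"{cosine_similarity(term_vector(doc1), term_vector(doc2)):.2f}")  # prints 0.80
print(f"{cosine_similarity(term_vector(doc1), term_vector(doc3)):.2f}")  # prints 0.00
```

A purely term-based score like this ignores word order and sentence structure, which is the kind of gap that syntactic and semantic analysis is meant to close.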

An Ontology-based Knowledge Management System - Integrated System of Web Information Extraction and Structuring Knowledge -

  • Mima, Hideki;Matsushima, Katsumori
    • Proceedings of the CALSEC Conference / 2005.03a / pp.55-61 / 2005
  • We introduce a new web-based knowledge management system, currently in progress, in which XML-based web information extraction and our knowledge structuring technologies are combined using ontology-based natural language processing. Our aim is to provide efficient access to heterogeneous information on the web, enabling users to draw effortlessly on a wide range of textual and non-textual resources, such as newspapers and databases, to accelerate knowledge acquisition from such sources. To achieve efficient knowledge management, we first propose an XML-based web information extraction method that includes a sophisticated control language for extracting data from web pages. By using standard XML technologies, our approach makes information extraction easy because of a) detaching rules from processing, b) restricting the target of processing, and c) interactive operations for developing extraction rules. We then propose a knowledge structuring system that includes 1) automatic term recognition, 2) domain-oriented automatic term clustering, 3) similarity-based document retrieval, 4) real-time document clustering, and 5) visualization. The system supports integrating different types of databases (textual and non-textual) and retrieving different types of information simultaneously. Through further explanation of the specification and the implementation technique of the system, we demonstrate how the system can accelerate knowledge acquisition on the web, even for novice users of the field.

Automatic Grading System for Subjective Questions Through Analyzing Question Type (질의문 유형 분석을 통한 서답형 자동 채점 시스템)

  • Kang, Won-Seog
    • The Journal of the Korea Contents Association / v.11 no.2 / pp.13-21 / 2011
  • Developing a system that evaluates answers to subjective questions is not easy because of the difficulty of natural language processing. This paper designs and implements an automatic evaluation system based on natural language processing techniques. To address the performance degradation of general-purpose evaluation systems, we define question types and improve evaluation performance through processing adapted to each question type. To evaluate the system, we analyze the correlation between human evaluation and term-based evaluation, and between human evaluation and the evaluation produced by this system. The system achieved better results than term-based evaluation. Future work should expand the set of question types and improve the adaptive processing technique for each type.

Individual Differences in Regional Gray Matter Volumes According to the Cognitive Style of Young Adults

  • Hur, Minyoung;Kim, Chobok
    • Science of Emotion and Sensibility / v.22 no.4 / pp.65-74 / 2019
  • Extant research has proposed that the Object-Spatial-Verbal cognitive style can elucidate individual differences in the preference for modality-specific information. However, no studies have yet ascertained whether this type of information processing evinces structural correlations in the brain. Therefore, the current study used voxel-based morphometry (VBM) analyses to investigate individual differences in gray matter volumes based on the Object-Spatial-Verbal cognitive style. For this purpose, ninety healthy young adults were recruited to participate in the study. They were administered the Korean version of the Object-Spatial-Verbal cognitive style questionnaire, and their anatomical brain images were scanned. The VBM results demonstrated that the participants' verbal scores were positively correlated with regional gray matter volumes (rGMVs) in the right superior temporal sulcus/superior temporal gyrus, the bilateral parahippocampal gyrus/fusiform gyrus, and the left inferior temporal gyrus. In addition, the rGMVs in these regions were negatively correlated with the relative spatial preference scores obtained by individual participants. The findings of the investigation provide anatomical evidence that the verbal cognitive style could be decidedly relevant to higher-level language processing, but not to basic language processing.

Processing Scrambled Wh-Constructions in Head-Final Languages: Dependency Resolution and Feature Checking

  • Hahn, Hye-ryeong;Hong, Seungjin
    • Language and Information / v.18 no.2 / pp.59-79 / 2014
  • This paper explores the processing mechanism of filler-gap dependency resolution and feature checking in Korean wh-constructions. Based on their findings on Japanese sentence processing, Aoshima et al. (2004) have argued that the parser posits a gap in the embedded clause in head-final languages, unlike in head-initial languages, where the parser posits a gap in the matrix clause. To verify their findings in the Korean context, and to further explore the mechanisms involved in processing Korean wh-constructions, the present study replicated the study by Aoshima et al., with some modifications of problematic areas in their original design. Sixty-four native speakers of Korean were presented with Korean sentences containing a wh-phrase in four conditions, with word order and complementizer type as the two main factors. The participants read the sentences segment by segment, and the reading time at each segment was measured. The reading time analysis showed no slowdown at the embedded verb in the scrambled conditions of the kind observed by Aoshima et al. Instead, there was a clear indication of the wh-feature checking process in the form of a major slowdown at the relevant region.

Sentiment Analysis of COVID-19 Tweets: Impact of Pre-processing Step

  • Ayadi, Rami;Shahin, Osama R.;Ghorbel, Osama;Alanazi, Rayan;Saidi, Anouar
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.206-211 / 2021
  • Internet users are increasingly invited to express their opinions on various subjects on social networks, e-commerce sites, news sites, forums, etc. Much of this information, which describes feelings, has become the subject of study in research areas such as "sensing opinions and analyzing feelings". Sentiment analysis is the process of identifying the polarity of the feelings held in the opinions found in the interactions of Internet users on the web and classifying them as positive, negative, or neutral. In this article, we propose a sentiment analysis tool that detects the polarity of people's opinions about COVID-19, extracted from social media (Twitter) in the Arabic language, and we examine the impact of the pre-processing phase on opinion classification. The results reveal gaps in this area of research: first, the lack of resources when collecting data; second, the greater complexity of pre-processing Arabic, especially its dialects. Ultimately, however, the results obtained are promising.
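
The abstract above stresses the impact of the pre-processing phase but does not list its steps. The sketch below shows a few cleaning steps that are commonly applied to Arabic tweets, purely as an assumed illustration: removing URLs and user mentions, dropping the hashtag symbol, and stripping Arabic diacritics (U+064B to U+0652) and the tatweel character (U+0640). The function name `preprocess_tweet` and the choice of steps are hypothetical and are not taken from the paper.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Arabic short-vowel marks (U+064B..U+0652) and the tatweel elongation mark (U+0640).
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0640]")


def preprocess_tweet(text: str) -> str:
    """Apply a few common (assumed) cleaning steps to a tweet."""
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = DIACRITICS_RE.sub("", text)  # strip diacritics and tatweel
    return " ".join(text.split())       # collapse repeated whitespace


print(preprocess_tweet("@user check https://example.com #COVID19 now"))
# prints "check COVID19 now"
```

Dialectal spelling variation, which the abstract singles out as a difficulty, is not handled by surface cleaning of this kind and would need dedicated normalization resources.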

A Formal Specification and Verification of CORBA Standards

  • Kim, Mi-Hui
    • The Transactions of the Korea Information Processing Society / v.5 no.12 / pp.3127-3137 / 1998
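  • The CORBA standard specification uses IDL to rigorously define not only the functionality that a conforming implementation must provide but also the user interfaces of the service-providing modules. To have confidence in the CORBA standards and to trust them, the standard specification written in IDL (Interface Definition Language) needs to be formalized and rigorously proven mathematically. This paper presents a method for formally specifying and verifying the CORBA standards. First, the standard modules are formally specified using Larch/CORBA IDL (LCB), and the LCB specification is translated into LSL (Larch Shared Language) according to the semantics of LCB. Properties are then proven mathematically using the translated LCB specification and the LSL proof logic. We establish a translation-based semantics for LCB to provide a theoretical foundation for the proposed method, and we demonstrate its usefulness by applying it to the CORBA Naming Service specification.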
