• Title/Summary/Keyword: sentence processing

Analysis of Korean Language Parsing System and Speed Improvement of Machine Learning using Feature Module (한국어 의존 관계 분석과 자질 집합 분할을 이용한 기계학습의 성능 개선)

  • Kim, Seong-Jin;Ock, Cheol-Young
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.8 / pp.66-74 / 2014
  • Korean parsing systems have recently been studied by many software engineers and linguists. These systems mainly rely on either machine learning or the symbolic processing paradigm. However, parsers based on machine learning take a long time to train because Korean sentence data is very large, and their recognition rate is limited because the data itself contains errors. In this paper, we design a system that uses feature modules to reduce training time, and we analyze the recognition rate as a function of the number of training sentences and the number of training iterations. The system uses separated modules and sorted tables for binary search. We use 36,090 refined sentences extracted from the Sejong Corpus. Training time decreased by about three hours, and the recognition rate was highest, at 84.54%, when 10,000 sentences were trained for 50 iterations. When all 32,481 training sentences were trained for 10 iterations, the recognition rate was 82.99%. The results indicate that it is more efficient to use refined data and to repeat training until the system reaches a steady state.
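
The abstract above mentions separated feature modules and sorted tables searched by binary search. A minimal Python sketch of that kind of lookup might look as follows. The module names, feature strings, and weights are hypothetical and are not taken from the paper; only the idea of one sorted table per feature group is borrowed from the abstract.

```python
from bisect import bisect_left

def build_table(weights):
    """Turn a feature->weight dict into parallel lists sorted by feature key."""
    keys = sorted(weights)
    return keys, [weights[k] for k in keys]

def lookup(table, feature, default=0.0):
    """Binary-search the sorted key list; return the weight or a default."""
    keys, values = table
    i = bisect_left(keys, feature)
    if i < len(keys) and keys[i] == feature:
        return values[i]
    return default

# Hypothetical feature modules: one small sorted table per feature group.
modules = {
    "pos_pair": build_table({"NNG|VV": 0.8, "JKS|VV": 0.3}),
    "distance": build_table({"1": 0.5, "2": 0.2}),
}

def score(features):
    """Sum the weights of (module name, feature string) pairs."""
    return sum(lookup(modules[m], f) for m, f in features)
```

Keeping a separate table per module keeps each sorted list short, so every lookup costs a logarithmic number of comparisons in the size of one module rather than of the whole feature set.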

Anaphoric Reference Resolution in Expository Text: The Effects of Ellipsis (설명문의 대용어 참조해결과정: 대용어와 지시사 생략 효과)

  • Lee, Jae-Ho
    • Korean Journal of Cognitive Science / v.21 no.2 / pp.253-282 / 2010
  • Two experiments explored the effects of anaphor ellipsis and demonstrative ellipsis on reference resolution. The study assumed that the two types of ellipsis would be sensitive to the salience of antecedents, namely their reverse typicality and order of mention. A multi-task approach measured the antecedents' activation level and processing load in order to resolve conflicts among theories of anaphoric resolution. In Experiment 1, which used ellipsis as the anaphor, participants read a series of sentence pairs in a self-paced manner and performed a probe recognition test. The results showed main effects of antecedent typicality and mention order in both tasks. In Experiment 2, which used a noun phrase without a demonstrative as the anaphor, participants read a series of sentence pairs in a self-paced manner and performed a probe recognition test. The results showed a main effect of antecedent mention order for the probe recognition task only: the first antecedent was recognized faster than the second. Together, the two experiments suggest that anaphor type and antecedent salience interact dynamically in reference resolution in Korean.

A Sentence Sentiment Classification reflecting Formal and Informal Vocabulary Information (형식적 및 비형식적 어휘 정보를 반영한 문장 감정 분류)

  • Cho, Sang-Hyun;Kang, Hang-Bong
    • The KIPS Transactions:PartB / v.18B no.5 / pp.325-332 / 2011
  • Social network services (SNS) such as Twitter, Facebook, and Myspace have gained popularity worldwide. Sentiment analysis of SNS users' sentences is particularly important because it is very useful for opinion mining. In this paper, we propose a new sentiment classification method for sentences that contain both formal vocabulary and informal vocabulary such as emoticons and newly coined words. Previous methods used only formal vocabulary to classify the sentiment of sentences. However, these methods are not very effective because internet users write sentences that contain informal vocabulary. In addition, we propose constructing a domain-specific sentiment vocabulary, because the same word may express different sentiments in different domains. Feature vectors are extracted from the sentiment vocabulary information and classified by a Support Vector Machine (SVM). The proposed method shows good classification accuracy.
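
To illustrate how feature vectors could be drawn from both formal and informal sentiment vocabulary, as the abstract above describes, a minimal Python sketch follows. The two lexicons, the four-dimensional vector layout, and the whitespace tokenizer are assumptions made for illustration; the paper's actual vocabulary and features are not given here, and the SVM step is left out.

```python
# Hypothetical sentiment lexicons: +1 marks positive, -1 marks negative.
FORMAL = {"good": 1, "great": 1, "bad": -1, "terrible": -1}
INFORMAL = {":)": 1, "lol": 1, ":(": -1, "meh": -1}

def feature_vector(sentence):
    """Count lexicon hits as [formal+, formal-, informal+, informal-]."""
    vec = [0, 0, 0, 0]
    for tok in sentence.lower().split():
        for offset, lexicon in ((0, FORMAL), (2, INFORMAL)):
            polarity = lexicon.get(tok)
            if polarity is not None:
                # Positive hits go to the first slot of the pair, negative to the second.
                vec[offset + (0 if polarity > 0 else 1)] += 1
    return vec
```

Vectors of this shape could then be passed to any classifier, such as an SVM. A domain-specific lexicon would simply replace the two dictionaries.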

The Acquisition of Negatives in Five Korean Children (한국 아동의 부정사 획득)

  • Yi, Soon Hyung
    • Korean Journal of Child Studies / v.6 no.1 / pp.17-40 / 1985
  • This study investigated Korean children's early acquisition of negatives and focused on four research questions: 1) how negative variations are processed; 2) the nature of negatives once they are completely acquired in Korean (in which meaning and form are matched one to one); 3) the validity of Bellugi's negative acquisition model for Korean; and 4) the cause of children's erroneous sentence production: limited ability or regularity in children's cognition. The language data of the five subjects (age range: 1.1 - 3.11) were collected by their parents in the natural setting of the home. The results showed the following. 1) The pivot form was processed in many ways, from simple to complicated forms, such as <(X+X')+N>, <(x+x')+N,Y>, and <(x+x') N,(y+y')>. The children appeared to use a simple negative format to reach a negative format one step more advanced. 2) Korean negatives are classified by the range of negation in the negative sentence (part or whole), the strength of negation (absolute or general), and the function of meaning (negation, absence, refusal, prohibition, impossibility). All five children acquired negative sentences in all functions and the complete range after 3 years of age. 3) Despite differences in age level, Bellugi's four-stage model was in evidence; that is, Korean children's acquisition of negatives was almost identical to Bellugi's four-stage model in deep structure. 4) Analyses of the children's erroneous sentences showed that the errors arose not from limitations in cognitive ability but from strict application of the regularity of rules from the original grammars. Consequently, the children produced negative sentences using two rules: the rule of additive complexity (from simple to complex) and the rule of division (from one to several).

Implementation of Augmentative and Alternative Communication System Using Image Dictionary and Verbal based Sentence Generation Rule (이미지 사전과 동사기반 문장 생성 규칙을 활용한 보완대체 의사소통 시스템 구현)

  • Ryu, Je;Han, Kwang-Rok
    • The KIPS Transactions:PartB / v.13B no.5 s.108 / pp.569-578 / 2006
  • This study implemented an augmentative and alternative communication (AAC) system using images that people with speech impairments can easily understand. The implementation focused on the portability and mobility of the AAC system as well as on a more flexible form of communication. For mobility and portability, we implemented a system that runs on mobile devices such as PDAs, so that people with speech impairments can use it to communicate as well as other people in any place. Moreover, to overcome the limited storage space available for a large volume of image data, we implemented the AAC system with a client/server structure in a mobile environment. Furthermore, for more flexible communication, we built an image dictionary that takes verbs as its base and subcategorizes nouns according to their corresponding verbs, and we regularized the types of sentences generated according to the type of verb, centering on verbs, which play the most important role in composing a sentence.

Named Entity Recognition and Dictionary Construction for Korean Title: Books, Movies, Music and TV Programs (한국어 제목 개체명 인식 및 사전 구축: 도서, 영화, 음악, TV프로그램)

  • Park, Yongmin;Lee, Jae Sung
    • KIPS Transactions on Software and Data Engineering / v.3 no.7 / pp.285-292 / 2014
  • Named entity recognition is used to improve the performance of information retrieval, question answering, machine translation, and similar systems. Its targets are usually PLOs (persons, locations, and organizations). These are typically proper nouns or unregistered words, and traditional named entity recognizers use these characteristics to find named entity candidates. The titles of books, movies, and TV programs have different characteristics from PLO entities: they may consist of multiple phrases, a whole sentence, or special characters, which makes it difficult to find named entity candidates. In this paper, we propose a method to quickly extract title named entities from news articles and automatically build a named entity dictionary for titles. To identify candidates, word phrases enclosed in special symbols in a sentence are first extracted and then verified by an SVM using feature words and their distances. To classify the extracted title candidates, an SVM is used with the mutual information of word contexts.
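
The first step described in the abstract above, extracting phrases enclosed in special symbols as title candidates, can be sketched in Python roughly as follows. The list of symbol pairs is an assumption, since the abstract does not say which symbols the authors used, and the SVM verification and classification steps are not shown.

```python
import re

# Assumed enclosing symbol pairs; the abstract does not list the actual ones.
PAIRS = [("<", ">"), ("《", "》"), ("「", "」"), ('"', '"')]

def title_candidates(sentence):
    """Return the phrases enclosed by any assumed symbol pair, in text order."""
    found = []
    for open_s, close_s in PAIRS:
        # Non-greedy match so each candidate stops at the nearest closing symbol.
        pattern = re.escape(open_s) + r"(.+?)" + re.escape(close_s)
        for m in re.finditer(pattern, sentence):
            found.append((m.start(), m.group(1)))
    return [text for _, text in sorted(found)]
```

For example, `title_candidates('He praised <The Host> and 《Parasite》 yesterday.')` yields the two enclosed phrases. A real system would still need the verification stage, because enclosing symbols also mark quotations and emphasis.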

Development of a Conversational Help Agent Using Approximate Pattern Matching (근사 패턴매칭을 이용한 대화형 도우미 에이전트의 개발)

  • 김수영;조성배
    • Korean Journal of Cognitive Science / v.13 no.4 / pp.1-8 / 2002
  • As the Internet grows, many websites have been built and a great deal of information has been published on them. The more information a website holds, the harder it is for users to find what they want. To help users find information easily, a full-text search engine may be embedded in the website. This paper describes the development of a conversational help agent that lets a user find the desired information through conversation with the agent. The proposed method is based on pattern matching from artificial intelligence rather than on natural language processing. When a user enters a sentence, the agent responds through preprocessing and pattern matching against its knowledge, which is stored in XML format. With approximate pattern matching, the agent selects an appropriate response according to the degree of similarity. In the experiment, different sentences with the same meaning were entered; the agent recognized them as the same pattern and gave the correct answer.
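
A minimal Python sketch of preprocessing followed by approximate pattern matching, in the spirit of the abstract above, might look like this. The similarity measure (`difflib.SequenceMatcher`), the threshold of 0.6, and the two knowledge entries are assumptions chosen for illustration; the paper's own matching algorithm and XML knowledge base are not reproduced here.

```python
from difflib import SequenceMatcher

# Hypothetical knowledge base; a plain list stands in for the XML knowledge.
KNOWLEDGE = [
    ("where is the download page", "The download page is under Resources."),
    ("how do i reset my password", "Use the 'Forgot password' link on the login page."),
]

def normalize(text):
    """Preprocess: lowercase, drop punctuation, collapse whitespace."""
    kept = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
    return " ".join(kept.split())

def respond(query, threshold=0.6):
    """Return the answer of the most similar pattern, or None if none is close enough."""
    q = normalize(query)
    best_score, best_answer = 0.0, None
    for pattern, answer in KNOWLEDGE:
        similarity = SequenceMatcher(None, q, pattern).ratio()
        if similarity > best_score:
            best_score, best_answer = similarity, answer
    return best_answer if best_score >= threshold else None
```

Because the match is approximate, differently worded inputs such as "How do I reset my password?" and "How can I reset my password?" can map to the same stored pattern, which is the behavior the abstract reports.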

Functional Expansion of Morphological Analyzer Based on Longest Phrase Matching For Efficient Korean Parsing (효율적인 한국어 파싱을 위한 최장일치 기반의 형태소 분석기 기능 확장)

  • Lee, Hyeon-yoeng;Lee, Jong-seok;Kang, Byeong-do;Yang, Seung-weon
    • Journal of Digital Contents Society / v.17 no.3 / pp.203-210 / 2016
  • Korean freely allows omission of sentence elements and has a flexible modification scope, so it is better to handle these phenomena in the morphological analyzer than in the parser. In this paper, we propose methods for functionally expanding the morphological analyzer to ease the burden of parsing. The approach is a longest phrase matching method. When a series of morphemes forms a single syntactic category, as identified by processing unknown words, compound verbs, compound nouns, numbers, and symbols, our method combines them into one syntactic unit and assigns semantic features to that unit. The proposed morphological analysis method removes unnecessary morphological ambiguities and decreases the number of morphological analysis results, which improves the accuracy of the tagger and parser. Experimental results show that our method reduces the number of parse trees by 73.4% and parsing time by 52.4% on average.
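
The longest phrase matching idea in the abstract above can be sketched in Python as a greedy merge over a morpheme sequence. The unit dictionary and category labels below are hypothetical, and English tokens stand in for Korean morphemes; the paper's actual dictionaries and semantic features are not shown.

```python
# Hypothetical multi-morpheme units mapped to a single syntactic category.
UNITS = {
    ("computer", "science"): "NOUN",
    ("computer", "science", "department"): "NOUN",
    ("give", "up"): "VERB",
}
MAX_LEN = max(len(key) for key in UNITS)

def merge_longest(morphemes):
    """Greedily merge the longest known unit starting at each position."""
    out, i = [], 0
    while i < len(morphemes):
        # Try the longest span first, down to spans of two morphemes.
        for n in range(min(MAX_LEN, len(morphemes) - i), 1, -1):
            key = tuple(morphemes[i:i + n])
            if key in UNITS:
                out.append((" ".join(key), UNITS[key]))
                i += n
                break
        else:
            # No multi-morpheme unit starts here; keep the morpheme as-is.
            out.append((morphemes[i], None))
            i += 1
    return out
```

Merging before parsing means the parser sees one unit instead of several morphemes, which is the kind of reduction in ambiguity the abstract attributes to the method.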

The Effects of Age and Type of Imperative Statement on Behavioral Intention and Recall (명령문에 대한 행동의도와 기억에 있어서 나이와 명령문 유형이 미치는 영향)

  • Min, Dongwon
    • Journal of Digital Convergence / v.18 no.1 / pp.53-58 / 2020
  • Imperative statements, which can appear in instructions for using a product or service, describe how to achieve a goal, or induce or prohibit a specific action. This study focuses on the effects of age and type of imperative sentence (directive vs. declarative) on behavioral intention and recall. The experiment showed that older people, who have less remaining lifetime, process information in a more emotional way, so they rejected directive (vs. declarative) statements, which evoked more negative feelings, resulting in lower behavioral intention. Conversely, the negative feelings caused by directive statements increased the salience of directive (vs. declarative) sentences more for older people, which in turn improved their memory of those sentences. Process analysis showed that the emotions felt when exposed to the statements mediated these results. The results show that, to improve consumers' behavioral response and/or information processing performance, it is necessary to carefully consider their age and how the statement is constructed.

Verb Sense Disambiguation using Subordinating Case Information (종속격 정보를 적용한 동사 의미 중의성 해소)

  • Park, Yo-Sep;Shin, Joon-Choul;Ock, Cheol-Young;Park, Hyuk-Ro
    • The KIPS Transactions:PartB / v.18B no.4 / pp.241-248 / 2011
  • Homographs can have multiple senses. To understand the meaning of a sentence, it is necessary to identify which sense is used for each word in the sentence. Previous research on this problem relied heavily on word co-occurrence information. However, we noticed that, for verbs, information about the subordinating cases of the verb can be used to further improve the performance of word sense disambiguation, because different senses require different sets of subordinating cases. In this paper, we propose verb sense disambiguation using subordinating case information. The case information is obtained from postposition features in the Standard Korean Dictionary. Our experiment on 12 high-frequency verb homographs shows that adding case information improves the performance of word sense disambiguation by 1.34 percentage points, from 97.3% to 98.7%. Although the improvement may seem marginal, we consider it meaningful because the error rate was reduced to less than half, from 2.7% to 1.3%.
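
The "less than half" claim in the abstract above can be checked from the rounded accuracies it reports. The symbols below are introduced here for illustration only.

```latex
\[
e_{\text{before}} = 100\% - 97.3\% = 2.7\%, \qquad
e_{\text{after}} = 100\% - 98.7\% = 1.3\%
\]
\[
\frac{e_{\text{after}}}{e_{\text{before}}} = \frac{1.3}{2.7} \approx 0.48,
\qquad
\frac{e_{\text{before}} - e_{\text{after}}}{e_{\text{before}}} = \frac{1.4}{2.7} \approx 0.52
\]
```

So the remaining error is about 48% of the original, a relative error reduction of roughly 52%, even though the accuracy gain itself is under 1.5 percentage points.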