• Title/Summary/Keyword: 단서표현 (clue expression)

Reproducing Fairy Tales for Plot Identification (사건의 흐름 분석을 위한 동화의 재구성)

  • An, Seungjoo;Park, Jong C.
    • Annual Conference on Human and Language Technology
    • /
    • 2011.10a
    • /
    • pp.3-8
    • /
    • 2011
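  • To understand the story of a text automatically, prior studies have identified the events described in the text and combined them to determine how the story is structured. However, this approach requires deep semantic understanding of the story, and because situations and events vary from text to text, it is limited in settings where language resources are scarce. This problem can be addressed without harming the naturalness of story understanding if events can be abstracted and expressed simply. In this paper, as a preliminary study toward event abstraction, we extract the events that characters in a text perform or undergo, identify the flow of events using the PMI technique, and supplement it with events that may be omitted in the story understanding process by referring to linguistic clues. Through this approach, we present a method for reconstructing and simplifying the events that characters can perform.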
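
The abstract above names PMI (pointwise mointwise mutual information) as the tool for finding the flow of events, but gives no formula or data. As a rough illustration only, the sketch below scores adjacent event pairs with the standard definition PMI(a, b) = log2(P(a, b) / (P(a) P(b))). The function name `pmi_scores`, the toy event sequences, and the choice to count only adjacent ordered pairs are all assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def pmi_scores(sequences):
    """Return PMI for every ordered pair of adjacent events.

    `sequences` is a list of event lists, one per character or story
    (a hypothetical input format chosen for this sketch).
    """
    unigram = Counter()
    bigram = Counter()
    for seq in sequences:
        unigram.update(seq)
        bigram.update(zip(seq, seq[1:]))  # adjacent ordered pairs only

    n_uni = sum(unigram.values())
    n_bi = sum(bigram.values())

    scores = {}
    for (a, b), count in bigram.items():
        p_ab = count / n_bi
        p_a = unigram[a] / n_uni
        p_b = unigram[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy event sequences, invented purely for illustration.
toy_sequences = [
    ["meet", "fight", "win"],
    ["meet", "fight", "lose"],
    ["meet", "talk", "leave"],
]
scores = pmi_scores(toy_sequences)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 2))
```

Pairs with a high score co-occur more often than their individual frequencies would predict, which is one plausible way to read "flow of events" from such counts.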

A Language Model and Clue based Machine Learning Method for Discovering Technology Trends from Patent Text (특허 문서 텍스트로부터의 기술 트렌드 탐지를 위한 언어 모델 및 단서 기반 기계학습 방법)

  • Tian, Yingshi;Kim, Young-Ho;Jeong, Yoon-Jae;Ryu, Ji-Hee;Myaeng, Sung-Hyon
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.5
    • /
    • pp.420-429
    • /
    • 2009
  • Patent text is a rich source for discovering technological trends. To automate such a discovery process, we attempt to identify phrases corresponding to a problem and its solution method, which together form a technology. Problem and solution phrases are identified by an SVM classifier using features based on a combination of a language modeling approach and linguistic clues. Based on the occurrence statistics of the phrases, we identify the time span of each problem and solution and finally generate a trend. Our experiment shows that the proposed semantic phrase identification method is promising, with an accuracy of 77% in R-precision. We also show that the unsupervised method for discovering technological trends is meaningful.
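
The abstract above reports its result as R-precision without defining it. Under the usual information-retrieval definition, R-precision is the fraction of relevant items among the top R ranked results, where R is the total number of relevant items. The sketch below assumes that definition; the function name `r_precision` and the phrase identifiers are hypothetical and not taken from the paper.

```python
def r_precision(ranked, relevant):
    """Precision over the top-R ranked items, where R = number of relevant items."""
    relevant = set(relevant)
    r = len(relevant)
    if r == 0:
        return 0.0
    hits = sum(1 for item in ranked[:r] if item in relevant)
    return hits / r


# Hypothetical ranked phrase candidates and gold-standard relevant phrases.
ranked_phrases = ["p1", "x", "p2", "p3", "y"]
gold_phrases = {"p1", "p2", "p3"}
print(r_precision(ranked_phrases, gold_phrases))
```

Here R is 3, and two of the top three candidates are relevant, so the score is 2/3.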

Competitor Extraction based on Machine Learning Methods (기계학습 기반 경쟁자 자동추출 방법)

  • Lee, Chung-Hee;Kim, Hyun-Jin;Ryu, Pum-Mo;Kim, Hyun-Ki;Seo, Young-Hoon
    • Annual Conference on Human and Language Technology
    • /
    • 2012.10a
    • /
    • pp.107-112
    • /
    • 2012
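  • This paper concerns a method for automatically extracting, as competitors, proper nouns that stand in a competitive relationship in general text; it proposes and compares both a rule-based method and a machine-learning-based method. The proposed system targets news articles and aims to extract competitors only when a sentence contains explicit information indicating a competitive relationship. The rule-based competitor extraction system extracts competitors based on clue words indicating that two proper nouns are in a competitive relationship; 620 clue words for competition expressions were collected and used. The machine-learning-based competitor extraction system approaches competitor extraction as a binary classification problem of whether each competitor candidate is a competitor. Support Vector Machines were used as the classification algorithm, and the model was trained on five language-independent features that can represent the context surrounding a competitor. For performance evaluation, an evaluation set was built by collecting 623 competitors from news articles for 54 trending hot keywords. For comparison, a baseline system that extracts competitors based on related words was implemented, and it obtained Recall/Precision/F1 of 0.119/0.214/0.153. In the experiments, the rule-based system achieved 0.793/0.207/0.328 and the machine-learning-based system achieved 0.578/0.730/0.645. Recall was best for the rule-based system at 0.793, an improvement of 67.4% over the baseline. Precision and F1 were best for the machine-learning-based system at 0.730 and 0.645, improvements of 61.6% and 49.2%, respectively, over the baseline. Because the proposed systems greatly improve Recall, Precision, and F1 over the baseline, the proposed method can be seen to be effective.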
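
The Recall/Precision/F1 triples quoted in the abstract above can be cross-checked with the standard F1 formula, the harmonic mean F1 = 2PR / (P + R). The short sketch below does only that arithmetic on the reported figures; the function name `f1_score` and the dictionary layout are choices made for this example.

```python
def f1_score(recall, precision):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F1) as quoted in the abstract.
reported = {
    "baseline": (0.119, 0.214, 0.153),
    "rule-based": (0.793, 0.207, 0.328),
    "machine-learning": (0.578, 0.730, 0.645),
}

for name, (recall, precision, f1) in reported.items():
    print(f"{name}: computed F1 = {f1_score(recall, precision):.3f}, reported = {f1:.3f}")
```

For all three systems the computed value rounds to the reported F1.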

A Study on an Automatic Summarization System Using Verb-Based Sentence Patterns (술어기반 문형정보를 이용한 자동요약시스템에 관한 연구)

  • 최인숙;정영미
    • Journal of the Korean Society for Information Management
    • /
    • v.18 no.4
    • /
    • pp.37-55
    • /
    • 2001
  • The purpose of this study is to present a text summarization system using a knowledge base containing information about verbs and their arguments that is statistically obtained from a subject domain. The system consists of two modules: the training module and the summarization module. The training module extracts cue verbs and their basic sentence patterns by counting the frequency of verbs and case markers respectively, and the summarization module substantiates basic sentence patterns and generates summaries. Basic sentence patterns are substantiated by applying substantiation rules to the syntactic structure of sentences. A summary is then produced by connecting the simple sentences that are generated through the substantiation of basic sentence patterns. Articles on 'robbery' in the daily newspapers are selected for a test collection. The system generates natural summaries without losing any essential information by combining both cue verbs and essential arguments. In addition, the use of statistical techniques makes it possible to apply this system to other subject domains through its learning capability.

Recognition of Korean Implicit Citation Sentences Using Machine Learning with Lexical Features (어휘 자질 기반 기계 학습을 사용한 한국어 암묵 인용문 인식)

  • Kang, In-Su
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.8
    • /
    • pp.5565-5570
    • /
    • 2015
  • Implicit citation sentence recognition is the task of locating citation sentences that lack explicit citation markers in the full text of articles. State-of-the-art approaches exploit word n-grams, clue words, researchers' surnames, mentions of previous methods, and distance to the nearest explicit citation sentences, among other features, reaching over 50% performance. However, most previous work has been conducted on English. For Korean, a rule-based method using positive/negative clue patterns was reported to attain a performance of 42%, which requires further improvement. This study attempted to learn to recognize implicit citation sentences in the full text of Korean literature using Korean lexical features. Different lexical feature units such as Eojeol, morpheme, and Eumjeol were evaluated to determine proper lexical features for Korean implicit citation sentence recognition. In addition, lexical features were combined with position features representing backward/forward proximity to explicit citation sentences, improving the performance to over 50%.

Reading Cognitive Culture by Intentional Instruction and Convergence Analysis in Advertising Content Stories (광고콘텐츠 스토리에 담긴 의도적인 지시체와 융복합적 해석소에 의한 인지적 문화읽기)

  • Lim, Ji-Won
    • Journal of Korea Entertainment Industry Association
    • /
    • v.13 no.2
    • /
    • pp.37-45
    • /
    • 2019
  • This study aimed to show that a cognitive interpretation code is essential for college students to read culture correctly, by discussing the producer's story production system for creative advertising content, the process of interpreting the advertisers' meaning, and the formation of principles and beliefs. The production of advertising content aimed at persuasion should first identify the anachronistic reasoning system based on the target audience's principles of perception. A concise analysis of the experiment found key clues confirming that a sample of the producer's intended story was inconsistent with the information clues that college students could remember. We organized a semantic analysis tool that combines these key clues as a tool for college students to read the culture of their time. As a result, one side of positive communication could be identified among the college-student audience: reading a new cognitive symbol culture based on their subjective experience and beliefs, rather than performing a cross-sectional analysis of the primary verbal and non-verbal expressions of the advertising content. In the future, if producers of advertising content stories work to identify such a process in advance, it will help persuade audiences.

An Effective Cloth Rendering using Internal Scatter Function (내부 산란함수를 이용한 효과적인 옷감 렌더링)

  • Park, Sun-Yong;Chun, Young-Jae;Oh, Kyoung-Su
    • Journal of Korea Game Society
    • /
    • v.9 no.3
    • /
    • pp.97-105
    • /
    • 2009
  • In this paper, we propose a new cloth rendering scheme that measures the light-scattering pattern inside the cloth and reproduces it using that pattern. To date, the BTF (Bidirectional Texture Function) has been one of the most appropriate methods for realistically reconstructing a cloth surface. However, the BTF has two drawbacks: it ultimately requires an infinite amount of data, and all light effects must be used together. We noted that internal scattering contributes decisively to the realism of cloth. Following this observation, we take an image of a ray of light scattering inside the cloth for every position of the cloth sample and determine each pixel value by adding up all light influences arriving from its vicinity. By enabling each ray to be controlled individually, the proposed method provides a clue to representing cloth-like materials, which are among the most challenging materials to express, more realistically.

A Sociological Analysis on Contents of Gender Role in the "Educational Active Program Guidance books in Preschools" ("유치원교육활동지도자료"의 성역할 내용에 대한 사회학적 분석)

  • Kim, Kyung-Soo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.8 no.6
    • /
    • pp.1590-1603
    • /
    • 2007
  • This study deals with the contents of gender role, using content analysis to examine a total of 40 volumes of the 3rd to 6th "Educational Active Program Guidance Book in Preschools" (EAPGP). The subcategories of analysis for gender role were the gender division of labor and the occupational classification of men and women, and the study analyzed how these were expressed. The purpose of this study was to examine the contents of gender role in the 3rd to 6th EAPGP in order to offer a basic document for the 7th EAPGP. The issues of this study were as follows: first, how the gender division of labor between men and women was expressed in the EAPGP; second, how the occupations of men and women were expressed in the EAPGP; third, how the contents of gender role have changed in the EAPGP. The major findings of this study can be summarized as follows: first, males are described as formal workers and females are mostly seen as housekeepers; second, the jobs of males are more diversified than those of females; third, gender role discrimination, which gives the clues to segregate the two sexes, tends to be greatly reduced in the 6th EAPGP compared with the 3rd, 4th, and 5th EAPGP.

Information Types and Display Methods according to the Relation between Frequency of Exposure and Degree of Cognition (노출빈도와 인지도 관계에 따른 정보의 유형과 표현기법)

  • Han, Ji-Ae;You, Si-Cheon
    • Journal of Digital Convergence
    • /
    • v.10 no.10
    • /
    • pp.497-504
    • /
    • 2012
  • This study proposes information types and display methods according to the relation between frequency of exposure and degree of cognition, as a way to make communication through information more effective from the standpoint of user cognition. We first established the relation between frequency of exposure and degree of cognition through a literature review of cognitive psychology and cognitive engineering psychology; the results based on it are as follows. First, we proposed information types and attributes for visualization as a 'Framework' that helps designers understand the cognitive demands of users. Specifically, there are 4 types of information (STM, STA, LTM, LTA) according to the relation between frequency of exposure and degree of cognition, cognitive characteristics for each type, and an 'attributes matrix for visualization', which consists of 14 attributes of high-quality information sorted by type. Second, we proposed a guideline for display methods according to the depth of information in the design process of information content. For STM and STA information, as primary information, we suggested "Attribution theory of Distinctiveness", "Advance Organizer", "Progress Closure", and "Affordance"; for LTM information, as multidimensional information, we suggested "Modularity", "Consistency", "Mimicry", and "Mnemonic Device". We found that the status of information visualization attributes differs according to information type or depth, and that the display methods vary accordingly.

A Method to Solve the Entity Linking Ambiguity and NIL Entity Recognition for efficient Entity Linking based on Wikipedia (위키피디아 기반의 효과적인 개체 링킹을 위한 NIL 개체 인식과 개체 연결 중의성 해소 방법)

  • Lee, Hokyung;An, Jaehyun;Yoon, Jeongmin;Bae, Kyoungman;Ko, Youngjoong
    • Journal of KIISE
    • /
    • v.44 no.8
    • /
    • pp.813-821
    • /
    • 2017
  • Entity linking finds the meaning of an entity mention, which indicates the entity using different expressions, in a user's query by linking the entity mention to the entity in the knowledge base. This task has four challenges: the difficulty of knowledge base construction, multiple presentations of the entity mention, ambiguity of entity linking, and NIL entity recognition. In this paper, we first construct an entity name dictionary based on Wikipedia to build a knowledge base and solve the multiple presentation problem. We then propose various methods for NIL entity recognition and resolve the ambiguity of entity linking by training a support vector machine on several features, including context similarity, semantic relevance, clue word score, named entity type similarity of the mention, entity name matching score, and object popularity score. We apply the two proposed methods sequentially on the constructed knowledge base to obtain good entity linking performance. In the experiments, our system achieved F1 scores of 83.66% and 90.81% in NIL entity recognition and in resolving entity linking ambiguity.