• Title/Summary/Keyword: 응답 후보 추출

Search Result 29, Processing Time 0.027 seconds

A Smoke Detection Method based on Video for Early Fire-Alarming System (조기 화재 경보 시스템을 위한 비디오 기반 연기 감지 방법)

  • Truong, Tung X.;Kim, Jong-Myon
    • The KIPS Transactions:PartB
    • /
    • v.18B no.4
    • /
    • pp.213-220
    • /
    • 2011
  • This paper proposes an effective, four-stage smoke detection method based on video that provides emergency response in the event of unexpected hazards in early fire-alarming systems. In the first phase, an approximate median method is used to segment moving regions in the present frame of video. In the second phase, a color segmentation of smoke is performed to select candidate smoke regions from these moving regions. In the third phase, a feature extraction algorithm is used to extract five feature parameters of smoke by analyzing characteristics of the candidate smoke regions such as area randomness and motion of smoke. In the fourth phase, extracted five parameters of smoke are used as an input for a K-nearest neighbor (KNN) algorithm to identify whether the candidate smoke regions are smoke or non-smoke. Experimental results indicate that the proposed four-stage smoke detection method outperforms other algorithms in terms of smoke detection, providing a low false alarm rate and high reliability in open and large spaces.

Construction of Test Collection for Evaluation of Scientific Relation Extraction System (과학기술분야 용어 간 관계추출 시스템의 평가를 위한 테스트컬렉션 구축)

  • Choi, Yun-Soo;Choi, Sung-Pil;Jeong, Chang-Hoo;Yoon, Hwa-Mook;You, Beom-Jong
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2009.05a
    • /
    • pp.754-758
    • /
    • 2009
  • Extracting information in large-scale documents would be very useful not only for information retrieval but also for question answering and summarization. Even though relation extraction is very important area, it is difficult to develop and evaluate a machine learning based system without test collection. The study shows how to build test collection(KREC2008) for the relation extraction system. We extracted technology terms from abstracts of journals and selected several relation candidates between them using Wordnet. Judges who were well trained in evaluation process assigned a relation from candidates. The process provides the method with which even non-experts are able to build test collection easily. KREC2008 are open to the public for researchers and developers and will be utilized for development and evaluation of relation extraction system.

  • PDF

Design of a QA System based on Information Retrieval (정보검색기반 질의응답 시스템 설계)

  • Kim, MinKyoung;Ahn, HyeokJu;Kim, Harksoo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2015.04a
    • /
    • pp.816-818
    • /
    • 2015
  • 본 논문에서는 질의유형을 통한 검색기반 질의응답 시스템을 구현하기 위한 설계방법을 제안한다. 이를 위해 위키피디아 문서의 링크 데이터를 이용하여 색인 대상문서와 데이터베이스를 구축하는 색인 모델과 2-포아송 모델을 이용하여 얻은 문서들을 색인 데이터베이스를 통해 필터링하여 정답 후보문장을 추출하는 검색모델, 키워드 패턴 매칭 기반 질의유형 분류 모델을 설계하였다.

Detection of Similar Answers to Avoid Duplicate Question in Retrieval-based Automatic Question Generation (검색 기반의 질문생성에서 중복 방지를 위한 유사 응답 검출)

  • Choi, Yong-Seok;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.1
    • /
    • pp.27-36
    • /
    • 2019
  • In this paper, we propose a method to find the most similar answer to the user's response from the question-answer database in order to avoid generating a redundant question in retrieval-based automatic question generation system. As a question of the most similar answer to user's response may already be known to the user, the question should be removed from a set of question candidates. A similarity detector calculates a similarity between two answers by utilizing the same words, paraphrases, and sentential meanings. Paraphrases can be acquired by building a phrase table used in a statistical machine translation. A sentential meaning's similarity of two answers is calculated by an attention-based convolutional neural network. We evaluate the accuracy of the similarity detector on an evaluation set with 100 answers, and can get the 71% Mean Reciprocal Rank (MRR) score.

Technical Entity Recognition System based on Distributed Parallel Processing (분산병렬처리 기반 기술개체 인식 시스템)

  • Choi, Yun-Soo;Lee, Won-Goo;Lee, Min-Ho;Choi, Dong-Hoon;Yoon, Hwa-Mook;Cho, Min-Hee;Jeong, Han-Min
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06a
    • /
    • pp.242-244
    • /
    • 2012
  • 과학기술 문헌의 기술개체 인식에 관한 연구는 정보추출, 텍스트마이닝, 질의응답 분야 등의 선행 연구로서 다양한 통계적 방법론을 사용하여 기술개체 인식 정확률을 향상시키기 위해 연구되어 왔다. 하지만 기존의 연구는 단일-코어 또는 단일 머신 상에서 수행되었기 때문에, 폭발적으로 증가하는 문헌들에 대한 실시간 분석 요구를 처리할 수 없는 상황에 직면하고 있다. 이에 본 논문에서는 기술개체를 인식하는 과정에서 병목현상이 발생하는 작업을 "후보개체 추출 과정"의 언어처리 부분과 "개체 가중치 할당 과정"에서 통계정보를 취합하는 부분으로 분류하고, 각 작업을 하둡의 맵 작업과 리듀스 작업을 이용하여 해결하는 분산 병렬 처리 기반의 기술개체 인식 방법에 대해 살펴보고자 한다.

Query-based Answer Extraction using Korean Dependency Parsing (의존 구문 분석을 이용한 질의 기반 정답 추출)

  • Lee, Dokyoung;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.161-177
    • /
    • 2019
  • In this paper, we study the performance improvement of the answer extraction in Question-Answering system by using sentence dependency parsing result. The Question-Answering (QA) system consists of query analysis, which is a method of analyzing the user's query, and answer extraction, which is a method to extract appropriate answers in the document. And various studies have been conducted on two methods. In order to improve the performance of answer extraction, it is necessary to accurately reflect the grammatical information of sentences. In Korean, because word order structure is free and omission of sentence components is frequent, dependency parsing is a good way to analyze Korean syntax. Therefore, in this study, we improved the performance of the answer extraction by adding the features generated by dependency parsing analysis to the inputs of the answer extraction model (Bidirectional LSTM-CRF). The process of generating the dependency graph embedding consists of the steps of generating the dependency graph from the dependency parsing result and learning the embedding of the graph. In this study, we compared the performance of the answer extraction model when inputting basic word features generated without the dependency parsing and the performance of the model when inputting the addition of the Eojeol tag feature and dependency graph embedding feature. Since dependency parsing is performed on a basic unit of an Eojeol, which is a component of sentences separated by a space, the tag information of the Eojeol can be obtained as a result of the dependency parsing. The Eojeol tag feature means the tag information of the Eojeol. The process of generating the dependency graph embedding consists of the steps of generating the dependency graph from the dependency parsing result and learning the embedding of the graph. From the dependency parsing result, a graph is generated from the Eojeol to the node, the dependency between the Eojeol to the edge, and the Eojeol tag to the node label. In this process, an undirected graph is generated or a directed graph is generated according to whether or not the dependency relation direction is considered. To obtain the embedding of the graph, we used Graph2Vec, which is a method of finding the embedding of the graph by the subgraphs constituting a graph. We can specify the maximum path length between nodes in the process of finding subgraphs of a graph. If the maximum path length between nodes is 1, graph embedding is generated only by direct dependency between Eojeol, and graph embedding is generated including indirect dependencies as the maximum path length between nodes becomes larger. In the experiment, the maximum path length between nodes is adjusted differently from 1 to 3 depending on whether direction of dependency is considered or not, and the performance of answer extraction is measured. Experimental results show that both Eojeol tag feature and dependency graph embedding feature improve the performance of answer extraction. In particular, considering the direction of the dependency relation and extracting the dependency graph generated with the maximum path length of 1 in the subgraph extraction process in Graph2Vec as the input of the model, the highest answer extraction performance was shown. As a result of these experiments, we concluded that it is better to take into account the direction of dependence and to consider only the direct connection rather than the indirect dependence between the words. The significance of this study is as follows. First, we improved the performance of answer extraction by adding features using dependency parsing results, taking into account the characteristics of Korean, which is free of word order structure and omission of sentence components. Second, we generated feature of dependency parsing result by learning - based graph embedding method without defining the pattern of dependency between Eojeol. Future research directions are as follows. In this study, the features generated as a result of the dependency parsing are applied only to the answer extraction model in order to grasp the meaning. However, in the future, if the performance is confirmed by applying the features to various natural language processing models such as sentiment analysis or name entity recognition, the validity of the features can be verified more accurately.

Systematic Forecasting Bias of Exit Poll: Analysis of Exit Poll for 2010 Local Elections (출구조사의 체계적인 예측 편향에 대한 분석: 2010년 지방선거 출구조사를 중심으로)

  • Kim, Young-Won;Choi, Yun-Jung
    • Survey Research
    • /
    • v.12 no.3
    • /
    • pp.25-48
    • /
    • 2011
  • In this paper, we overview the sample design, sampling error, non-response rate and prediction errors of the exit poll conducted for 2010 local elections and discusses how to detect a prediction bias in exit poll. To investigate the bias problem in exit poll in regional(Si-Do) level, we analyze exit poll data for 2007 presidential election and 2006 local elections as well as 2010 local elections in Korea. The measure of predictive accuracy A proposed by Martin et al.(2005) is used to assess the exit poll bias. The empirical studies based on three exit polls clearly show that there exits systematic bias in exit poll and the predictive bias of candidates affiliated to conservative party (such as Hannara-Dang) is serious in the specific regions. The result of this study on systematic bias will be very useful to improving the exit poll methodology in Korea.

  • PDF

A study on Korean multi-turn response generation using generative and retrieval model (생성 모델과 검색 모델을 이용한 한국어 멀티턴 응답 생성 연구)

  • Lee, Hodong;Lee, Jongmin;Seo, Jaehyung;Jang, Yoonna;Lim, Heuiseok
    • Journal of the Korea Convergence Society
    • /
    • v.13 no.1
    • /
    • pp.13-21
    • /
    • 2022
  • Recent deep learning-based research shows excellent performance in most natural language processing (NLP) fields with pre-trained language models. In particular, the auto-encoder-based language model proves its excellent performance and usefulness in various fields of Korean language understanding. However, the decoder-based Korean generative model even suffers from generating simple sentences. Also, there is few detailed research and data for the field of conversation where generative models are most commonly utilized. Therefore, this paper constructs multi-turn dialogue data for a Korean generative model. In addition, we compare and analyze the performance by improving the dialogue ability of the generative model through transfer learning. In addition, we propose a method of supplementing the insufficient dialogue generation ability of the model by extracting recommended response candidates from external knowledge information through a retrival model.

Terminology Recognition System based on Machine Learning for Scientific Document Analysis (과학 기술 문헌 분석을 위한 기계학습 기반 범용 전문용어 인식 시스템)

  • Choi, Yun-Soo;Song, Sa-Kwang;Chun, Hong-Woo;Jeong, Chang-Hoo;Choi, Sung-Pil
    • The KIPS Transactions:PartD
    • /
    • v.18D no.5
    • /
    • pp.329-338
    • /
    • 2011
  • Terminology recognition system which is a preceding research for text mining, information extraction, information retrieval, semantic web, and question-answering has been intensively studied in limited range of domains, especially in bio-medical domain. We propose a domain independent terminology recognition system based on machine learning method using dictionary, syntactic features, and Web search results, since the previous works revealed limitation on applying their approaches to general domain because their resources were domain specific. We achieved F-score 80.8 and 6.5% improvement after comparing the proposed approach with the related approach, C-value, which has been widely used and is based on local domain frequencies. In the second experiment with various combinations of unithood features, the method combined with NGD(Normalized Google Distance) showed the best performance of 81.8 on F-score. We applied three machine learning methods such as Logistic regression, C4.5, and SVMs, and got the best score from the decision tree method, C4.5.