• Title/Summary/Keyword: Knowledge Question-Answering Service (지식 질의응답 서비스)


An Integrated Method of Iterative and Incremental Requirement Analysis for Large-Scale Systems (시스템 요구사항 분석을 위한 순환적-점진적 복합 분석방법)

  • Park, Jisung;Lee, Jaeho
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.4
    • /
    • pp.193-202
    • /
    • 2017
  • Development of intelligent systems involves the effective integration of large-scale knowledge processing and understanding, human-machine interaction, and intelligent services. In particular, in our project to develop a self-growing knowledge-based system with inference methodologies utilizing big data technology, we are building a platform called WiseKB as the central knowledge base for storing a massive amount of knowledge and enabling question answering by inference. WiseKB therefore requires an effective methodology for analyzing the diverse requirements entangled with the integration of its various components: knowledge representation, resource management, knowledge storage, complex hybrid inference, and knowledge learning. In this paper, we propose an integrated requirement analysis method that blends the traditional sequential method with the iterative-incremental method to achieve efficient requirement analysis for large-scale systems.

Efficient Processing of Transitive Closure Queries in Ontology using Graph Labeling (온톨로지에서의 그래프 레이블링을 이용한 효율적인 트랜지티브 클로저 질의 처리)

  • Kim, Jongnam;Jung, Junwon;Min, Kyeung-Sub;Kim, Hyoung-Joo
    • Journal of KIISE:Databases
    • /
    • v.32 no.5
    • /
    • pp.526-535
    • /
    • 2005
  • An ontology is a methodology for describing specific concepts and their relationships, and it is regarded as increasingly important as the semantic web and a variety of knowledge management systems draw attention. An ontology uses the relationships among concepts to represent the concrete semantics of a specific concept. When we want to get useful information out of an ontology, we usually have to process transitive relationships, because most relationships among concepts are transitive. Technically, processing such transitive closure queries causes recursive calls with heavy costs. This paper describes an efficient technique for processing transitive closure queries in an ontology. To that end, we examine how current systems approach transitive closure queries and propose a technique based on a graph labeling scheme. Assuming a large ontology, we show that our approach yields relative efficiency in the processing of transitive closure queries; a minimal sketch of one such labeling scheme follows below.
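
The paper's exact labeling scheme is not reproduced here, but a common way to make such queries non-recursive is interval (pre/post-order) labeling: each concept gets an interval from a depth-first traversal, and "A is in the transitive closure of B" reduces to interval containment. A minimal Python sketch on a hypothetical, tree-shaped subclass hierarchy:

```python
# Interval (pre/post-order) labeling for transitive-closure queries on a
# tree-shaped ontology hierarchy. The toy hierarchy is illustrative only.

def label(tree, root):
    """Assign (pre, post) intervals by DFS; descendants get nested intervals."""
    labels, counter = {}, [0]

    def dfs(node):
        counter[0] += 1
        pre = counter[0]
        for child in tree.get(node, []):
            dfs(child)
        counter[0] += 1
        labels[node] = (pre, counter[0])

    dfs(root)
    return labels

def is_descendant(labels, a, b):
    """True iff b reaches a transitively, i.e. a's interval nests inside b's."""
    pa, qa = labels[a]
    pb, qb = labels[b]
    return pb < pa and qa < qb

# Toy subclass hierarchy: Thing > Agent > {Person, Organization}
tree = {"Thing": ["Agent"], "Agent": ["Person", "Organization"]}
labels = label(tree, "Thing")
print(is_descendant(labels, "Person", "Thing"))  # True, no recursion at query time
print(is_descendant(labels, "Agent", "Person"))  # False
```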

QA Pair Passage RAG-based LLM Korean chatbot service (QA Pair Passage RAG 기반 LLM 한국어 챗봇 서비스)

  • Joongmin Shin;Jaewwook Lee;Kyungmin Kim;Taemin Lee;Sungmin Ahn;JeongBae Park;Heuiseok Lim
    • Annual Conference on Human and Language Technology
    • /
    • 2023.10a
    • /
    • pp.683-689
    • /
    • 2023
  • The field of natural language processing has recently made great advances, and the emergence of very large language models in particular has had a major impact on the field. Models such as GPT show high performance on a variety of NLP tasks and receive particular attention in the chatbot domain. These models still have several limitations and problems, however, one of which is that they can generate unexpected results. Among the various methods proposed to address this, Retrieval-Augmented Generation (RAG) has drawn attention. This paper proposes a way to improve the efficiency of a domain-specific question-answering system through integration with a knowledge base, and a way to correct and update chatbot answers by modifying the vector database. The main contributions of this paper are: 1) a new RAG system based on QA Pair Passage RAG, with an analysis of its performance gains; 2) performance measurements of existing LLM and RAG systems and a discussion of their limitations; 3) a chatbot control methodology using RDBMS-based vector search and updates. A minimal sketch of the retrieval step appears after this entry.

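A minimal sketch of the retrieval step of QA-pair passage RAG, assuming the knowledge base is stored as question-answer pairs; the similarity function, sample pairs, and prompt wording are illustrative placeholders, not the authors' implementation:

```python
# QA-pair passage RAG: retrieve the stored question-answer pair closest to
# the user query and condition the LLM prompt on it. Token overlap stands in
# for the embedding-based vector search the paper uses.

qa_pairs = [
    {"q": "What are the library's opening hours?", "a": "09:00-18:00 on weekdays."},
    {"q": "How do I reset my password?", "a": "Use the 'Forgot password' link."},
]

def similarity(a: str, b: str) -> float:
    """Jaccard token overlap as a stand-in for embedding cosine similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def retrieve(query: str) -> dict:
    return max(qa_pairs, key=lambda p: similarity(query, p["q"]))

def build_prompt(query: str) -> str:
    ctx = retrieve(query)
    return (f"Answer using the reference QA pair.\n"
            f"Reference Q: {ctx['q']}\nReference A: {ctx['a']}\n"
            f"User question: {query}\nAnswer:")

print(build_prompt("What hours is the library open?"))
# Updating an answer only requires editing the stored pair, mirroring the
# paper's point that an RDBMS-backed store makes chatbot answers editable.
```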

QualityRank : Measuring Authority of Answer in Q&A Community using Social Network Analysis (QualityRank : 소셜 네트워크 분석을 통한 Q&A 커뮤니티에서 답변의 신뢰 수준 측정)

  • Kim, Deok-Ju;Park, Gun-Woo;Lee, Sang-Hoon
    • Journal of KIISE:Databases
    • /
    • v.37 no.6
    • /
    • pp.343-350
    • /
    • 2010
  • We can get the answers we want by asking questions in a Knowledge Search Service (KSS) built on a Q&A community. However, finding credible documents among the enormous number available is getting more difficult, since many anonymous users, regardless of their credibility, participate in answering questions. Previous work on KSS evaluated document quality using non-textual information (e.g., recommendation count, click count) and textual information (e.g., answer length, attached data, conjunction count), and used the evaluation results to enhance search performance. However, non-textual information is difficult to gather in sufficient quantity in the early stages of a Q&A thread, and textual information is a limited quality signal because it judges by partial factors such as answer length and conjunction count. In this paper, we propose the QualityRank algorithm to address these problems. The algorithm ranks relevant and credible answers by combining textual/non-textual information with user centrality based on Social Network Analysis (SNA); a sketch of such a blended score follows below. Experimental validation confirms that our algorithm improves ranking performance over purely textual/non-textual baselines.
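
A minimal sketch of the kind of blended score the abstract describes, combining answer features with the answerer's centrality in an interaction graph; the weights, feature set, and centrality choice are assumptions for illustration, not the QualityRank formula:

```python
# Rank answers by blending answer features with the answerer's centrality in
# a question->answerer interaction network (simple SNA proxy for authority).
import networkx as nx

answers = [
    {"id": 1, "user": "alice", "recommends": 12, "length": 420},
    {"id": 2, "user": "bob",   "recommends": 2,  "length": 980},
]

# Edges: asker -> answerer, built from past Q&A interactions (toy data).
g = nx.DiGraph([("carol", "alice"), ("dave", "alice"), ("carol", "bob")])
centrality = nx.in_degree_centrality(g)   # proxy for answerer authority

def score(a, w=(0.5, 0.2, 0.3)):
    rec = min(a["recommends"] / 20, 1.0)  # crude normalization
    ln = min(a["length"] / 1000, 1.0)
    return w[0] * rec + w[1] * ln + w[2] * centrality.get(a["user"], 0.0)

for a in sorted(answers, key=score, reverse=True):
    print(a["id"], round(score(a), 3))
```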

The Design of Efficient Learning Management System using Streaming Service (스트리밍 서비스를 이용한 효율적인 학습 관리 시스템 설계)

  • Kim, Bong-Hyun;Kim, Seong-Youn;Han, Jin-Young
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2002.11a
    • /
    • pp.265-268
    • /
    • 2002
  • With the recent rapid development of computers and the Internet, educational culture has opened a new world for users. As 21st-century education focuses on lifelong learning, people strengthen their competitiveness in a rapidly changing knowledge-based society through continuous self-development. Distance education is a very effective way to achieve such lifelong education: an open educational culture in which anyone can receive quality education anytime, anywhere. This study aims to implement a convenient, personalized distance education system that requires no separate Internet connection procedure, focusing on improving learning ability through thorough learner management. The paper describes the design of a system providing video lecture viewing and playback control using Microsoft's media streaming service, sub-note documents for video lectures and a discussion-style question-and-answer facility backed by MySQL, and convenient management of each user's lecture progress, attendance, and enrolled courses. The design focuses on managing user enrollment and on real-time learner management within the system; a minimal sketch of the progress-tracking side appears below.

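A minimal sketch of the progress-tracking side the abstract describes, assuming a relational table keyed by user and lecture; the schema and function names are hypothetical, and sqlite3 stands in for MySQL:

```python
# Record how far a learner got in a streamed lecture in a relational table.
import sqlite3  # stand-in for MySQL; same SQL shape

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE progress (
    user_id TEXT, lecture_id TEXT, seconds_watched INTEGER,
    PRIMARY KEY (user_id, lecture_id))""")

def report_position(user_id, lecture_id, seconds):
    """Called periodically by the streaming player; keeps the max position."""
    con.execute(
        """INSERT INTO progress VALUES (?, ?, ?)
           ON CONFLICT(user_id, lecture_id) DO UPDATE SET
           seconds_watched = MAX(seconds_watched, excluded.seconds_watched)""",
        (user_id, lecture_id, seconds))

report_position("u1", "lec01", 300)
report_position("u1", "lec01", 250)   # rewinding does not reduce progress
print(con.execute("SELECT * FROM progress").fetchone())  # ('u1', 'lec01', 300)
```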

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun;Lee, Myungjin
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.43-61
    • /
    • 2019
  • The development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI research has been actively conducted in a variety of fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s this research has focused on solving cognitive problems related to human intelligence, such as learning and problem solving, and thanks to recent interest and work on various algorithms the field has advanced more than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to let AI agents make decisions using machine-readable, processable knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and recently it is used together with statistical artificial intelligence such as machine learning. More recently, the purpose of a knowledge base is to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. Such knowledge bases are used for intelligent processing in various fields of artificial intelligence, such as the question-answering systems of smart speakers. However, building a useful knowledge base is time-consuming and still requires much expert effort. In recent years, much knowledge-based AI research and technology has used DBpedia, one of the biggest knowledge bases, which aims to extract structured content from the various information in Wikipedia. DBpedia contains various information extracted from Wikipedia, such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, which present user-created summaries of some unifying aspect of an article. This knowledge is created by the mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. By generating knowledge from semi-structured, user-created infobox data, DBpedia can expect high reliability in terms of knowledge accuracy. However, since only about 50% of all wiki pages in Korean Wikipedia contain an infobox, DBpedia has limitations in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of this method, we describe a knowledge extraction model that follows the DBpedia ontology schema, learned from Wikipedia infoboxes. Our knowledge extraction model consists of three steps: classifying documents into ontology classes, classifying the proper sentences from which to extract triples, and selecting values and transforming them into RDF triple structure. The structures of Wikipedia infoboxes are defined as infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these infobox templates. Based on these mapping relations, we classify the input document into infobox categories, which correspond to ontology classes. After determining the classification of the input document, we classify the appropriate sentences according to the attributes belonging to that classification. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples.
To train the models, we generated a training data set from a Wikipedia dump using a method that adds BIO tags to sentences, and we trained about 200 classes and about 2,500 relations for knowledge extraction. Furthermore, we ran comparative experiments between CRF and Bi-LSTM-CRF for the knowledge extraction process. Through this proposed process, structured knowledge can be utilized by extracting knowledge from text documents according to the ontology schema. In addition, this methodology can significantly reduce the expert effort needed to construct instances according to the ontology schema. A minimal sketch of the final triple-conversion step appears below.
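
A minimal sketch of the final step of the proposed pipeline, converting a BIO-tagged sentence into an RDF-style triple; the tag scheme, sample sentence, and ontology identifiers are assumptions for illustration:

```python
# Turn a BIO-tagged sentence into an RDF-style triple for a subject and
# attribute already determined by the document/sentence classifiers.

def bio_to_value(tokens, tags):
    """Collect the token span tagged B-VAL / I-VAL as the attribute value."""
    span = [tok for tok, tag in zip(tokens, tags) if tag in ("B-VAL", "I-VAL")]
    return " ".join(span) if span else None

tokens = ["Seoul", "is", "the", "capital", "of", "South", "Korea", "."]
tags   = ["B-VAL", "O",  "O",   "O",       "O",  "O",     "O",     "O"]

# Hypothetical outputs of the earlier classification steps:
subject, predicate = "dbr:South_Korea", "dbo:capital"

value = bio_to_value(tokens, tags)
if value is not None:
    triple = (subject, predicate, f"dbr:{value.replace(' ', '_')}")
    print(triple)   # ('dbr:South_Korea', 'dbo:capital', 'dbr:Seoul')
```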

Study on the Science & Technology Information Service Needs Corresponding to the Scientists and Engineers Group Characteristics (사용자 그룹별 과학기술정보 서비스 수요 분석)

  • Jung, Hye-Ju;Yoon, Jungsun
    • Journal of Information Management
    • /
    • v.43 no.4
    • /
    • pp.143-167
    • /
    • 2012
  • In this study, a survey was analyzed to determine the demand for science & technology information services by user group. The questionnaire covered the need for 20 science & technology information services, the personal information needed for people-to-people exchanges, and the information respondents could share with others. 1,013 KOSEN users participated in the survey, and analysis of variance was conducted by institution, profession, final degree, and age of the respondents. The frequency analysis showed high demand for trend analyses, papers, research reports, patents, knowledge queries, project announcements, jobs, experimental methods, information-society news, and study-abroad/post-doc information, and all services except mentoring, community, and blog showed significant differences across user groups; a minimal sketch of such a group-wise analysis appears below. The personal information deemed necessary for interaction with others was, in order, specialization, thesis/research performance, career, organization, job, final degree, and education, with partial differences across user groups. In addition, 97% of respondents had their own scientific and technical information to share with others, in the order of papers, presentations (ppt), reports, experimental methods, and images. The results of this study can serve as useful input for developing user-centered personalized services for scientists and engineers and are expected to help set the direction of future science information services.
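
A minimal sketch of the group-wise analysis of variance the study applies, using scipy's one-way ANOVA; the ratings below are made up for illustration, not survey data:

```python
# One-way ANOVA on a service-need rating across respondent groups.
from scipy import stats

# Hypothetical need-for-"trend analysis" ratings (1-5) by institution type.
university = [5, 4, 5, 4, 5]
industry   = [3, 4, 3, 2, 3]
government = [4, 4, 5, 4, 4]

f_stat, p_value = stats.f_oneway(university, industry, government)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 => needs differ by group
```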

Study on the Accuracy of Distributed Model Under the Resolution Change (격자크기와 분포모형의 정확성에 관한 연구)

  • Ku, Hye-Jin
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2006.05a
    • /
    • pp.94-98
    • /
    • 2006
  • To accurately predict the hydrologic response of ungauged basins, the uncertainty arising in hydrologic simulation needs to be estimated and reduced. Since this uncertainty depends on the quality and quantity of the available data, data resolution is an important factor in predicting hydrologic response. This study therefore examined how grid size affects the rainfall-runoff simulation of a hydrologic model. To investigate model uncertainty when the grid size causes no information loss in the input data, four hypothetical basins were constructed, considering second- and third-order streams and increasing basin area. Rainfall-runoff simulations were performed with grid sizes of 50 m, 100 m, 250 m, 500 m, and 1000 m, and outlet hydrographs were drawn to compare results across grid sizes. In addition, Nash coefficients were computed and compared for the discharge per unit length flowing from subbasins into the stream and for the streamflow before and after confluences and at the outlet; a minimal sketch of this coefficient appears below. The larger the difference from the reference grid size, the larger the difference in the simulated hydrologic response, and this difference decreased as basin area increased and as stream order decreased. The error in discharge per unit width flowing from subbasins into the stream decreased as flow length increased. In the simulations for hypothetical basins I and II, composed only of subbasins with constant flow length, the streamflow error increased along the stream, whereas for basins III and IV, composed of subbasins with different flow lengths, the error showed no consistent trend along the stream and increased or decreased through confluences. In this case, the error of the overall hydrologic response at the basin outlet was smaller than the maximum error occurring after the confluence of the first-order streams.

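The Nash coefficient used for these comparisons is the Nash-Sutcliffe efficiency, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))²; a minimal sketch with illustrative discharge series (not the study's data):

```python
# Nash(-Sutcliffe) efficiency comparing a simulated discharge series against
# a reference run: 1.0 means perfect agreement, lower means larger error.
import numpy as np

def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

reference = [0.0, 1.2, 3.5, 2.1, 0.8]   # e.g. fine-grid run at the outlet
coarse    = [0.0, 1.0, 3.0, 2.4, 1.0]   # e.g. coarse-grid run
print(round(nse(reference, coarse), 3))
```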

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.191-206
    • /
    • 2022
  • Recently, it has become the de facto approach to utilize a pre-trained language model (PLM) to achieve state-of-the-art performance on various natural language tasks (called downstream tasks) such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during the training phase and shows worse performance on unseen (out-of-distribution) domains. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance-specific PLM for the Korean language and its applications. Our finance-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared to state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora like Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance-domain datasets that require finance-specific knowledge; a minimal evaluation-setup sketch follows below.
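
A hedged sketch of the downstream-evaluation setup: loading a Korean PLM for sequence (topic) classification with the transformers library. klue/roberta-base is one of the public baselines named in the abstract; KB-BERT's own checkpoint identifier is not given here, so it is not used:

```python
# Load a Korean PLM with an (initially untrained) classification head; the
# same setup applies to any of the compared checkpoints after fine-tuning.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "klue/roberta-base"          # swap in a finance PLM if available
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

inputs = tokenizer("금리 인상으로 은행주가 강세다.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits       # head is random until fine-tuned
print(logits.argmax(dim=-1).item())
```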