• Title/Summary/Keyword: keyword-based technique

Search Result 303, Processing Time 0.032 seconds

Term Mapping Methodology between Everyday Words and Legal Terms for Law Information Search System (법령정보 검색을 위한 생활용어와 법률용어 간의 대응관계 탐색 방법론)

  • Kim, Ji Hyun;Lee, Jong-Seo;Lee, Myungjin;Kim, Wooju;Hong, June Seok
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.137-152 / 2012
  • In the Web 2.0 era, as many users create their own web content (user-created content), the World Wide Web is overflowing with information, so finding meaningful information among these resources has become a key problem. Information retrieval is now important in every field, and many types of search services have been developed and are widely used to retrieve the information users actually want. Legal information search in particular is an indispensable service, because it lets people find the laws relevant to their present situation. The Office of Legislation in Korea has provided the Korean Law Information portal service since 2009 for searching law information such as legislation, administrative rules, and judicial precedents, so people can conveniently find law-related information. However, this service is limited because current search engine technology basically returns documents according to whether they contain the query. Despite the efforts of the Office of Legislation, it is therefore difficult for general users who are unfamiliar with legal terms to retrieve law information from a search engine that relies on simple keyword matching, because there is a large gap between everyday words and legal terms, many of which derive from Chinese words. People generally try to access law information using everyday words, so they have difficulty getting exactly the results they want. In this paper, we propose a term mapping methodology between everyday words and legal terms for general users who lack sufficient background in legal terminology, and we develop a search service that returns law information for queries written in everyday words. 
This makes it possible to search law information accurately without knowledge of legal terminology. In other words, our research goal is a law information search system with which general users can retrieve law information using everyday words. First, drawing on the concept of collective intelligence, this paper uses tags from internet blogs to find the mapping relationship between everyday words and legal terms. To this end, we collect tags related to an everyday word from web blog posts. When people write a blog post, they generally add non-hierarchical keywords or terms, called tags, which often act like synonyms, to describe, classify, and manage the post. Second, the collected tags are clustered using the K-means cluster analysis method. Then, we find a mapping relationship between an everyday word and a legal term by using our estimation measure to select the legal term that best matches the everyday word. The selected legal terms are given a definite relationship, and the relations between everyday words and legal terms are described using SKOS, an ontology for describing knowledge organization systems such as thesauri, classification schemes, taxonomies, and subject headings. Based on the proposed mapping and searching methodologies, when a user queries with an everyday word, our legal information search system finds the legal term mapped to the query and retrieves law information using the matched legal term. Users can therefore get accurate results even without knowledge of legal terms. We expect that general users without a professional legal background will be able to retrieve legal information conveniently and efficiently using everyday words.
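
The tag clustering step in this abstract can be illustrated with a minimal K-means sketch. This is not the authors' implementation: the tag names, the two-dimensional tag vectors, and the initial centroids are hypothetical, and the paper's own estimation measure for choosing the best-matching legal term is not reproduced because the abstract does not define it.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain K-means (Lloyd's algorithm) with caller-supplied initial centroids."""
    centroids = [list(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical tag vectors, e.g. co-occurrence counts with two everyday words.
tags = {
    "divorce": (9.0, 1.0),
    "alimony": (8.0, 2.0),
    "custody": (9.0, 2.0),
    "rent": (1.0, 9.0),
    "deposit": (2.0, 8.0),
    "lease": (1.0, 8.0),
}
names = list(tags)
assignments, centroids = kmeans(
    [tags[n] for n in names], centroids=[(9.0, 1.0), (1.0, 9.0)]
)

# Group tag names by cluster index.
clusters = {}
for name, cluster_id in zip(names, assignments):
    clusters.setdefault(cluster_id, []).append(name)
print(clusters)
```

Fixing the initial centroids keeps the toy run deterministic; a real pipeline would more likely use a library implementation with several random restarts.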

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • With the recent development of deep learning technology, research on unstructured data analysis is being actively conducted and is showing remarkable results in fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification, which assigns one label from two classes; multi-class classification, which assigns one label from several classes; and multi-label classification, which assigns multiple labels from several classes. Multi-label classification in particular requires a different training method from binary and multi-class classification because each instance has multiple labels. In addition, because the number of labels to be predicted grows as the number of labels and classes increases, prediction becomes harder and performance is difficult to improve. To overcome these limitations, research on label embedding is being actively conducted. Label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed labels, and (iii) restores the predicted labels to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, because these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels and therefore cannot create a latent label space that sufficiently preserves the information in the original labels. 
Recently, there have been increasing attempts to improve performance by applying deep learning to label embedding. A representative approach is label embedding with an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding loses a large amount of information when compressing a high-dimensional label space with a very large number of classes into a low-dimensional latent label space. This can be traced to the gradient loss (vanishing gradient) problem that occurs during backpropagation in training. Skip connections were devised to solve this problem: by adding a layer's input to its output, they prevent gradient loss during backpropagation and allow efficient learning even in deep networks. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies that use them in autoencoders or in the label embedding process are still lacking. Therefore, in this study we propose an autoencoder-based label embedding methodology that adds skip connections to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. We applied the proposed methodology to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from a paper abstract and evaluates multi-label classification by restoring the predicted keyword vector to the original label space. As a result, accuracy, precision, recall, and F1 score were all far higher for multi-label classification based on the proposed methodology than for traditional multi-label classification methods. 
This indicates that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domain characteristics and across the number of dimensions of the latent label space.
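
The skip connection described above ("adding the input of the layer to the output") can be sketched as a forward pass in plain Python. This is only an illustration under assumptions: the abstract does not give layer sizes, activation functions, or where exactly the skip connections sit, so the 4-label vector, the ReLU activation, the single residual block, and all weights below are hypothetical, and no training is shown.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """Dense layer: `weights` is a list of rows, one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_block(weights, bias, v):
    """Skip connection: add the block's input to its transformed output."""
    transformed = relu(linear(weights, bias, v))
    return [x + t for x, t in zip(v, transformed)]


def encode(label_vector, block_w, block_b, proj_w, proj_b):
    """Residual block at the label dimension, then a projection to the latent dimension."""
    hidden = residual_block(block_w, block_b, label_vector)
    return linear(proj_w, proj_b, hidden)


# Hypothetical multi-hot vector over 4 labels, compressed to 2 latent dimensions.
labels = [1.0, 0.0, 1.0, 0.0]
zero_w = [[0.0] * 4 for _ in range(4)]  # untrained block that contributes nothing
zero_b = [0.0] * 4
proj_w = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
proj_b = [0.0, 0.0]

passthrough = residual_block(zero_w, zero_b, labels)
latent = encode(labels, zero_w, zero_b, proj_w, proj_b)
print(passthrough, latent)
```

With all-zero block weights the residual block returns its input unchanged, which shows the identity path that lets information (and gradients) bypass the transformation.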

Analysis of Sea Trial's Title for Naval Ships Based on Big Data (빅데이터 기반 함정 시운전 종목명 분석)

  • Lee, Hyeong-Sin;Seo, Hyeong-Pil;Beak, Yong-Kawn;Lee, Sang-Il
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.11 / pp.420-426 / 2020
  • The purpose and main points of ROK and US Navy sea trials were analyzed from various angles using the big data technique of word clouds, with the aim of making sea trials more efficient. First, a comparison of the words extracted through keyword cleansing of the ROK and US Navy sea trials showed that the ROK Navy conducts tests of individual equipment, whereas the US Navy conducts integrated test runs focused on the system. Second, an analysis of the ROK and US Navy sea trials showed that approximately 66.6% were similar items. Of these, 112 items (approximately 44% of the 252 ROK Navy sea trial items) overlapped in two or more items, and 89 items (35% of the total) could be eliminated if the trials were integrated in the manner of the US Navy sea trials. A ship is a complex system in which multiple pieces of equipment operate simultaneously. Focusing on checking the functions and performance of individual equipment, as the ROK Navy's sea trials do, lengthens the sea trial period because of the excessive number of trial items. The required budget also inevitably increases as schedule and evaluation costs grow. Further research is needed to achieve more efficient and accurate sea trials through integrated system evaluations such as those of the US Navy sea trials.

KMSCR: A system for managing knowledge assets of an IT consulting firm (IT 컨설팅 회사의 지적 자산 관리를 위한 지식관리시스템)

  • 김수연;황현석;서의호
    • Proceedings of the Korea Inteligent Information System Society Conference / 2001.06a / pp.233-239 / 2001
  • Most companies have recently recognized the importance of managing intellectual assets in order to share and reuse the knowledge and know-how needed to carry out their work. In IT consulting firms in particular, which are highly knowledge-intensive, intellectual asset management matters more than in any other kind of company. For a consulting firm, sharing verified intellectual assets and searching them intelligently and quickly are important factors directly tied to the quality of consulting services and to customer satisfaction. Most consulting firms therefore devote considerable effort to managing their knowledge assets. The purpose of this paper is to raise the productivity and quality of consulting services by centrally managing the various types of intellectual assets held by an IT consulting firm, so that consultants carrying out projects at scattered customer sites can share them. To this end, we inventory and model all the intellectual assets managed in a consulting firm and propose a system architecture in which they can be easily stored and searched. We implemented the proposed architecture as a system on an NT platform using Index Server (KMSCR: A Knowledge Management System for managing Consulting Resources). When a consultant enters a search term, KMSCR returns results in various formats (.doc, .ppt, .xls, .rtf, .txt, .html, and so on) in order of relevance, helping consultants reuse consulting resources effectively. In addition, search covers not only pre-registered keywords but also the full text of documents, enabling more effective and efficient search of consulting resources.

Research trends to analysis of 『Muyedobotongji』 (『무예도보통지』 연구동향 분석)

  • Kwak, Nak-hyun
    • (The)Study of the Eastern Classic / no.55 / pp.193-221 / 2014
  • This study analyzes trends in prior research on "Muyedobotongji". The conclusions are as follows. First, there are 47 theses on "Muyedobotongji" in total: 29 master's theses and 18 doctoral theses. Sports science accounts for the largest share, with 23 master's theses and 12 doctoral theses. Besides sports science, "Muyedobotongji" has been studied in fields such as library and information science, engineering, art, and cultural content. The master's theses focused on practical uses of "Muyedobotongji", whereas the doctoral theses approached it from a humanities perspective. Second, there are 72 papers on "Muyedobotongji" in academic journals. In detail, there are 35 papers in sports science, 12 in Korean history, 7 in martial arts, 5 in dance studies, 4 in Korean studies, 2 in Chinese studies, 2 in art history, 1 in Japanese literature, and 1 in military science. This shows that "Muyedobotongji" is studied most actively in the field of sports science. Third, future research on "Muyedobotongji" should consider three directions. First, interdisciplinary research is needed, which can fill the gaps in existing studies. Second, standard keywords need to be established, so that unified keywords can be used across journals in different fields without confusion. Third, databases applicable to the martial arts field need to be built. These can create opportunities for both Korean martial arts and "Muyedobotongji" to be put into practice as cultural content.

Trend Analysis using Topic Modeling for Simulation Studies (토픽 모델링을 이용한 시뮬레이션 연구 동향 분석)

  • Na, Sang-Tae;Kim, Ja-Hee;Jung, Min-Ho;Ahn, Joo-Eon
    • Journal of the Korea Society for Simulation / v.25 no.3 / pp.107-116 / 2016
  • The recent diversification in terms of the scope and techniques used for simulations has highlighted the importance of analyzing state of the art trends and applying these for educational and study purposes. While qualitative methods such as literature research or experts' assessments have previously been used, such methods are in fact likely to reflect the subjective viewpoint of experts, and to involve too much time and money for the results obtained. For the purpose of an objective analysis, a quantitative analysis that included the examination of topics found in domestic academic journal articles was conducted in the present study. In this regard, simulation was found to be most actively used domestically in the electrical and electronic fields. In addition, simulation was also found to be employed for the purpose of education and entertainment in the social sciences. The results of this study are expected to help to facilitate the prediction of the direction of the development of not only the Korea Society for Simulation, but also domestic simulation studies. This study also raises the possibility of applying text mining to trend analysis, and proves that it can be a useful method for deriving future key topics and helping experts' decisions regarding quantitative data.

A Study on the Product Planning Model based on Word2Vec using On-offline Comment Analysis: Focused on the Noiseless Vertical Mouse User (온·오프라인 댓글 분석이 활용된 Word2Vec 기반 상품기획 모델연구: 버티컬 무소음마우스 사용자를 중심으로)

  • Ahn, Yeong-Hwi
    • Journal of Digital Convergence / v.19 no.10 / pp.221-227 / 2021
  • In this paper, we used Word2Vec to conduct a word-to-word similarity analysis of a standardized dataset collected by web crawling for 10,000 vertical noiseless mice, had 92 computer engineering students use the product for 5 days, and analyzed their self-report questionnaires. The questionnaire collected words in free-text (narrative) form and also asked respondents to select from the top 50 words extracted by the word frequency analysis and the word similarity analysis. In the similarity analysis of e-commerce users' product reviews, pain (.985) and design (.963) were identified as advantages among the click keywords, and vertical (.985) and adaptation (.948) as disadvantages. In the descriptive frequency analysis, the most frequently selected items were vertical (123) and pain (118). In the selection of similar words for advantages and disadvantages, vertical (83) and pain (75) were selected as advantages, and adaptation (89) and buttons (72) as disadvantages. Decision makers and product planners at small and medium-sized enterprises are therefore expected to be able to use the method applied in this study as important decision-making data when it is reflected in new product development processes and in review strategies for existing products.
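
Similarity scores such as the values in parentheses above are, in most Word2Vec tooling, cosine similarities between learned word vectors. Assuming that is also the case here (the abstract does not say), a minimal sketch of the computation looks like the following. The three-dimensional vectors are made up for illustration; real Word2Vec vectors are learned from the review corpus and usually have a hundred or more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Rank every other word by cosine similarity to `word`, highest first."""
    scored = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up word vectors standing in for trained Word2Vec embeddings.
vectors = {
    "mouse": [0.9, 0.1, 0.3],
    "pain": [0.8, 0.2, 0.4],
    "battery": [0.1, 0.9, 0.2],
}
ranked = most_similar("mouse", vectors)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
```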

Exploratory Study on the Application of Blockchain for ESG Management in the Distribution Industry (유통업계 ESG 경영을 위한 블록체인 도입 탐색적 연구)

  • Yeji Choi;Jaewook Byun;Jiwon Moon;Hangbae Chang
    • Knowledge Management Research / v.24 no.3 / pp.217-237 / 2023
  • Recently, in the face of successive and unexpected global economic risks, ESG(Environmental, Social, and Governance) management has risen as an essential survival strategy for businesses. Particularly, the supply chain disruptions due to the COVID-19 pandemic have added to the uncertainty of risks, heightening the importance of ESG management in the distribution industry. In this context, the role of blockchain technology in strengthening and managing the connection between the distribution industry and ESG management has become increasingly significant. While there have been extensive proposals for business models that integrate blockchain technology into distribution, few studies have specifically focused on the feasibility and effectiveness of applying blockchain to ESG management in this field. Therefore, this study analyzed the relationship between blockchain and ESG management in the distribution industry by employing association analysis, a text mining technique, on Korean academic research. Through this, the study confirmed the possibility of implementing blockchain in the distribution industry's ESG management and presented keywords to guide future research directions. The findings obtained from this study are expected to be utilized as foundational research for future studies in constructing blockchain-based business models for ESG management in the distribution industry.
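
Association analysis of the kind mentioned in this abstract is usually summarized with support, confidence, and lift for a rule such as "blockchain → ESG". The sketch below computes these three measures over keyword sets; the keyword sets are hypothetical and do not come from the study, and the abstract does not state which measures or thresholds the authors used.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_both = support(transactions, set(antecedent) | set(consequent))
    s_antecedent = support(transactions, antecedent)
    s_consequent = support(transactions, consequent)
    confidence = s_both / s_antecedent
    lift = confidence / s_consequent
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical keyword sets, one per paper.
papers = [
    {"blockchain", "esg", "distribution"},
    {"blockchain", "esg"},
    {"blockchain", "logistics"},
    {"esg", "governance"},
]
metrics = rule_metrics(papers, {"blockchain"}, {"esg"})
print(metrics)
```

A lift above 1 would suggest that the two keywords co-occur more often than expected if they were independent; a lift below 1 suggests the opposite.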

NFT(Non-Fungible Token) Patent Trend Analysis using Topic Modeling

  • Sin-Nyum Choi;Woong Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.41-48 / 2023
  • In this paper, we propose an analysis of recent trends in the NFT (Non-Fungible Token) industry using topic modeling techniques, focusing on their universal application across various industrial fields. Patent data was used to understand industry trends. We collected data on 371 domestic and 454 international NFT-related patents registered in the patent information search service KIPRIS from 2017, when the first NFT standard was introduced, to October 2023. In the preprocessing stage, stopwords and lemmas were removed, and only nouns were extracted. For the analysis, the top 50 words by frequency were listed, and their corresponding TF-IDF values were examined to derive the key keywords of the industry trends. Next, using the LDA algorithm, we identified four major latent topics within the patent data, both domestically and internationally. We analyzed these topics and presented our findings on NFT industry trends, supported by real-world industry cases. While previous reviews presented trends from an academic perspective using paper data, this study is significant in that it provides practical trend information based on data rooted in field practice. It is expected to be a useful reference for professionals in the NFT industry for understanding market conditions and generating new items.
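
The TF-IDF step mentioned in this abstract can be sketched as follows. TF-IDF has several variants and the abstract does not say which one was used, so the definition here (term frequency as count divided by document length, inverse document frequency as the natural log of N over document frequency) is an assumption, and the noun lists are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return scores


# Hypothetical noun lists, one per patent document after preprocessing.
patents = [
    ["token", "blockchain", "artwork", "token"],
    ["token", "game", "item"],
    ["token", "ticket", "authentication"],
]
scores = tf_idf(patents)
for doc_scores in scores:
    print(sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True))
```

Under this variant a word that appears in every document, like "token" here, scores zero however often it occurs, which is why TF-IDF values are a useful complement to raw frequency when picking key terms.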

A Study on the Perception and Experience of Daejeon Public Library Users Using Text Mining: Focusing on SNS and Online News Articles (텍스트마이닝을 활용한 대전시 공공도서관 이용자의 인식과 경험 연구 - SNS와 온라인 뉴스 기사를 중심으로 -)

  • Jiwon Choi;Seung-Jin Kwak
    • Journal of the Korean Society for Library and Information Science / v.58 no.2 / pp.363-384 / 2024
  • This study examined users' experiences with public libraries in Daejeon through big data analysis, focusing on text mining. To this end, first, users' overall evaluation and perception of Daejeon's public libraries were explored by collecting social media data. Second, online news articles were analyzed to identify the pending issues under public discussion. The analysis first showed a high proportion of users with children. Next, LDA analysis yielded topics in four categories: 'cultural event/program', 'data use', 'physical environment and facilities', and 'library service'. Finally, keywords concerning the construction of additional libraries and complex cultural spaces and the establishment of a library cooperation system were central in the news article data. Based on these results, the study proposes building libraries with regional balance in mind and creating a social parenting community network through business agreements with childcare institutions. This will contribute to identifying the policy and social trends of public libraries in Daejeon and to implementing data-based public library operations that reflect local community demands.