• Title/Summary/Keyword: Knowledge extraction

An effective color extraction and interactive insertion technique for converting PDF documents to EPUB3.0 format (PDF문서를 EPUB3.0 포맷으로 변환을 위한 효과적 색 추출 및 상호작용 효과삽입기법)

  • Lee, Namhui;Kim, Kangseok;Kim, Jai-Hoon;Byun, Louis
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.968-970 / 2015
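  • Existing book documents in PDF form need to be converted into a standard e-book format so that they can also be accessed on e-book readers. When a PDF document is converted to EPUB3.0, a representative e-book standard, the print color model CMYK is converted to the digital color model RGB, and the difference between the two models prevents colors from being reproduced faithfully. In this study, we investigated conversion using ICC profiles so that color fidelity is not lost during conversion. In addition, to provide interactive visual effects for e-book readers, we proposed an algorithm that recognizes specific parts of a large body of text and inserts effect code.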
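
The color problem described in this abstract can be illustrated with the common profile-free CMYK-to-RGB formula. The sketch below is not the authors' ICC-based method; it shows the naive, device-dependent conversion that ignores ink and display characteristics, which is the kind of conversion an ICC profile is meant to improve on. The function name `cmyk_to_rgb_naive` is an assumption made for illustration.

```python
def cmyk_to_rgb_naive(c, m, y, k):
    """Convert CMYK fractions in [0, 1] to 8-bit RGB without any color profile."""
    for value in (c, m, y, k):
        if not 0.0 <= value <= 1.0:
            raise ValueError("CMYK components must lie in [0, 1]")
    # Each RGB channel is attenuated by its complementary ink and by black.
    r = round(255 * (1 - c) * (1 - k))
    g = round(255 * (1 - m) * (1 - k))
    b = round(255 * (1 - y) * (1 - k))
    return (r, g, b)


white = cmyk_to_rgb_naive(0, 0, 0, 0)
red = cmyk_to_rgb_naive(0, 1, 1, 0)
black = cmyk_to_rgb_naive(0, 0, 0, 1)
```

Because this formula treats every printer and display as identical, two different CMYK inks that look different on paper can map to the same RGB value, which is one way the color impression is lost.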

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management / v.26 no.1 / pp.65-75 / 2019
  • As the use of the XML-based semantic web increases in the field of data management, many studies have tried to extract useful information from data stored in ontologies using association rule mining. Ontology data has the advantage that data can be expressed freely, because it has a flexible and scalable structure, unlike a conventional database with a predefined structure. On the other hand, this flexibility makes it difficult to find frequent patterns with a uniform analysis method. The goal of this study is to provide a basis for extracting useful knowledge from ontologies by searching for frequently occurring subgraph patterns, applying transaction-based graph mining techniques to the ontology schema graph data and instance graph data that constitute an ontology. To overcome the structural limitations of existing ontology mining, the frequent pattern search methodology in this study applies graph mining methods for finding frequent patterns in graph data to the ontology through an iterative node chunking method. Our suggested methodology will play an important role in knowledge extraction.
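
As a rough illustration of transaction-based support counting on graph data, the sketch below counts labeled edges (the smallest possible subgraph pattern) across a set of graphs and keeps those that reach a minimum support. It is a simplification and does not implement the iterative node chunking method named in the abstract; the function name and the sample triples are assumptions.

```python
from collections import Counter


def frequent_edges(graphs, min_support):
    """Return labeled edges that occur in at least min_support graphs."""
    counts = Counter()
    for edges in graphs:
        # Count each distinct edge once per graph (transaction-based support).
        counts.update(set(edges))
    return {edge: n for edge, n in counts.items() if n >= min_support}


# Each graph is a list of (subject, predicate, object) edges; data is hypothetical.
graphs = [
    [("Person", "worksFor", "Company"), ("Person", "livesIn", "City")],
    [("Person", "worksFor", "Company"), ("Company", "locatedIn", "City")],
    [("Person", "worksFor", "Company"), ("Person", "livesIn", "City")],
]
result = frequent_edges(graphs, min_support=3)
```

A full frequent-subgraph miner would grow these single-edge patterns into larger connected patterns and re-count support at each step.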

Construction of a knowledge-base for safety standards to support the design of household electrical appliances (가전제품의 설계지원을 위한 안전규격 지식베이스의 구축)

  • Lee, Hyo-Seop;Han, Soon-Hung
    • Journal of the Korean Society for Precision Engineering / v.11 no.4 / pp.106-113 / 1994
  • Household electrical appliances should be designed to satisfy safety standards. An expert system is implemented to support the design process. The general-purpose expert system shell ART-IM, which runs under the MS-DOS environment, is used to construct the knowledge-base. A set of rules has been extracted from EN 60 335-1, the British standard specification for the safety of household and similar electrical appliances. The main focus of this paper is on codes that have systematic and normative structures. The internal structure of the safety standard is analysed to improve the process of rule extraction.

Design and Implementation of Visual Information Extraction System for Education (학습용 시각 정보 인식 시스템의 설계 및 구현)

  • Shin, Hyunkyung
    • Journal of The Korean Association of Information Education / v.16 no.4 / pp.483-488 / 2012
  • As mobile smart devices become widespread, their use in school programs is visibly increasing, and they are expected to become a very important part of educational equipment in the near future. For this reason the department of education and science technology has announced a medium- and long-term project on education with smart devices, which is currently in the preparation stage, and various academic and industrial institutes have actively produced related research results and application prototypes. In this paper we propose a framework for the design and implementation of a visual context recognition system for educational purposes, usable in school programs, that utilizes a module for recognizing text embedded in images captured by the video camera of a mobile smart device. The proposed system consists of four modules: image acquisition, image processing, information extraction, and knowledge representation, which are explained in detail with practical examples.

Automatic Extraction and Usage of Terminology Dictionary Based on Definitional Sentences Patterns in Technical Documents (기술문서 정의문 패턴을 이용한 전문용어사전 자동추출 및 활용방안)

  • Han, Hui-Jeong;Kim, Tae-Young;Doo, Hyo-Chul;Oh, Hyo-Jung
    • Journal of the Korean Society for information Management / v.34 no.4 / pp.81-99 / 2017
  • Technical documents are important research outputs generated by the knowledge and information society. To use technical documents properly, it is necessary to apply advanced information processing techniques such as summarization and information extraction. In this paper, to extract core information, we automatically extracted terminologies and their definitions based on definitional sentence patterns and the structure of technical documents. Based on this, we proposed a system to build a specialized terminology dictionary. We further suggested personalized services so that users can utilize the terminology dictionary in various ways as a knowledge memory. The results of this study will allow users to find up-to-date information faster and more easily. In addition, providing a personalized terminology dictionary to users can maximize the value, usability, and retrieval efficiency of the dictionary.
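
Extraction driven by definitional sentence patterns can be sketched with a few regular expressions. The two English patterns below ("X is defined as Y", "X refers to Y") are assumptions chosen for illustration; the abstract does not list the patterns the authors actually used, and the helper name `extract_definitions` is hypothetical.

```python
import re

# Illustrative definitional patterns; a real system would use many more.
DEFINITION_PATTERNS = [
    re.compile(r"^(?P<term>[A-Z][\w\- ]*?) (?:is|are) defined as (?P<definition>.+?)\.?$"),
    re.compile(r"^(?P<term>[A-Z][\w\- ]*?) refers to (?P<definition>.+?)\.?$"),
]


def extract_definitions(sentences):
    """Map each term to its definition for sentences matching a pattern."""
    glossary = {}
    for sentence in sentences:
        for pattern in DEFINITION_PATTERNS:
            match = pattern.match(sentence.strip())
            if match:
                glossary[match.group("term")] = match.group("definition")
                break  # first matching pattern wins
    return glossary


sentences = [
    "Tokenization is defined as the splitting of text into units.",
    "Recall refers to the fraction of relevant items retrieved.",
    "The system was evaluated on ten documents.",
]
glossary = extract_definitions(sentences)
```

Sentences that match no pattern, such as the third one, are simply skipped, so the precision of such a dictionary depends on how tightly the patterns are written.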

Analysis of Some Online Questions with High Frequency about Dental Treatment in Korea

  • Kang, A-Reum;Go, Ye-Eun;Kim, Ka-Eun;Kim, Min-Joo;Kim, Seon-Jeong;Hwang, SooJeong
    • Journal of dental hygiene science / v.19 no.3 / pp.190-197 / 2019
  • Background: The Internet has advantages in terms of accessibility and amount of information, and the search for health information over the Internet is increasing exponentially. The purpose of this study is to analyze, by year, the information generated on the Internet about some dental treatments. Methods: Naver Knowledge (JisikIn in Korean), an interactive search service, was selected as the first search site in Korea. Scaling, wisdom tooth extraction, and endodontic treatment, which are covered by Korean health insurance, were selected. Finally, 4,729 questions about scaling, 23,963 questions about wisdom tooth extraction, and 17,733 questions about endodontic treatment were extracted. The question contents, the information about the questioner and the answerer, and errors in the answers were investigated. Frequency analysis was used, and the chi-square test was used where necessary. Results: The most frequently asked questions concerned discomfort and dissatisfaction after treatment. The need for treatment ranked second among questions about wisdom tooth extraction and endodontic treatment, whereas the health insurance benefit ranked second for dental scaling. Most of the questioners did not disclose personal information. The public answered the most in 2013~2014, but experts made up the highest percentage of respondents in 2017. Responses were mostly based on personal experience, but this tended to decrease over the years, while responses based on professional knowledge tended to increase. Errors in the answers also gradually decreased. Conclusion: Questions about dental care over the Internet are increasing exponentially, experts are responding increasingly, and errors in answers are decreasing. Nevertheless, the related expert groups need to remain attentive to prevent misinformation.

Research of organized data extraction method for digital investigation in relational database system (데이터베이스 시스템에서 디지털 포렌식 조사를 위한 체계적인 데이터 추출 기법 연구)

  • Lee, Dong-Chan;Lee, Sang-Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.22 no.3 / pp.565-573 / 2012
  • To investigate business corruption, obtaining business data such as personnel, manufacturing, accounting, and distribution records is absolutely necessary. Furthermore, the investigator needs a systematic method for extracting business data from the enterprise database, because most companies manage their business data through a distributed database system. In a typical business environment, the database resides in a system together with an upper-layer application and a large file server. In addition, the original data entered by users is distributed and stored in one or more tables according to normalization rules. Earlier research on database structure analysis mainly dealt with table relations for database optimization and visualization. From the viewpoint of digital forensics, however, analysis of the data itself is more important than the table relations. This paper suggests an extraction technique based on the table relations already defined in the database. Moreover, through a systematic analysis process based on domain knowledge, it analyzes the structure of the original business data stored in the database and proposes a solution for extracting the tables related to an incident.
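
One way to read "extraction from the table relations already defined in the database" is to follow declared foreign keys outward from a table of interest. The sketch below does this for SQLite using `PRAGMA foreign_key_list`; the choice of SQLite, the function name `related_tables`, and the sample schema are all assumptions for illustration, not the authors' tooling.

```python
import sqlite3
from collections import deque


def related_tables(conn, start):
    """Return tables reachable from `start` through declared foreign keys."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    # Build an undirected graph: an edge joins a child table to its parent.
    neighbors = {t: set() for t in tables}
    for child in tables:
        for fk in conn.execute(f'PRAGMA foreign_key_list("{child}")'):
            parent = fk[2]  # column 2 holds the referenced table name
            neighbors[child].add(parent)
            neighbors.setdefault(parent, set()).add(child)
    # Breadth-first search over the relation graph.
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in neighbors.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


# Hypothetical business schema used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY,
                          employee_id INTEGER REFERENCES employee(id));
    CREATE TABLE payment (id INTEGER PRIMARY KEY,
                          invoice_id INTEGER REFERENCES invoice(id));
    CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT);
""")
linked = related_tables(conn, "employee")
```

Tables with no declared relation to the starting table, such as `audit_log` here, are not reached, which is where the domain knowledge mentioned in the abstract would have to supplement the schema.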

Impact of Self-Presentation Text of Airbnb Hosts on Listing Performance by Facility Type (Airbnb 숙소 유형에 따른 호스트의 자기소개 텍스트가 공유성과에 미치는 영향)

  • Sim, Ji Hwan;Kim, So Young;Chung, Yeojin
    • Knowledge Management Research / v.21 no.4 / pp.157-173 / 2020
  • In the accommodation sharing economy, customers bear the risk of uncertainty about product quality, which is an important factor affecting users' satisfaction. This risk can be lowered by the information disclosed by the facility provider. Hosts' self-presentation can have a positive effect on listing performance by reducing psychological distance through emotional interaction with users. This paper analyzed the self-presentation text provided by Airbnb hosts and identified key aspects in the text. To extract the aspects, host descriptions were split into sentences and the Attention-Based Aspect Extraction method, an unsupervised neural attention model, was applied. Then, we investigated the relationship between the aspects in the host description and the listing performance via linear regression models. To compare their impact across the three facility types (Entire home/apt, Private rooms, and Shared rooms), the interaction effects between the facility types and the aspect summaries were included in the model. We found that specific aspects had positive effects on performance for each facility type, and we provide implications for marketing strategies to maximize performance in the sharing economy.

Development of A Framework for Robust Extraction of Regions Of Interest (환경 요인에 독립적인 관심 영역 추출을 위한 프레임워크의 개발)

  • Kim, Seong-Hoon;Lee, Kwang-Eui;Heo, Gyeong-Yong
    • Journal of the Korea Society of Computer and Information / v.16 no.12 / pp.49-57 / 2011
  • Extraction of regions of interest (ROIs) is the first and an important step in computer vision applications and affects the rest of the application process. However, ROI extraction is easily affected by environmental factors such as illumination and the camera. Many applications adopt problem-specific knowledge and/or post-processing to correct the errors that occur in ROI extraction. This paper proposes a robust framework that can cope with environmental change and is independent of the rest of the process. The proposed framework uses a differential image and a color distribution to extract ROIs. The color distribution can be learned on-line, which makes the framework robust to environmental change. Furthermore, the components of the framework are independent of each other, which makes the framework flexible and extensible. The usefulness of the proposed framework is demonstrated by applying it to hand region extraction in an image sequence.
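
The differential-image half of such a framework can be sketched in a few lines: threshold the absolute difference between two frames and take the bounding box of the changed pixels as a candidate ROI. This is only an assumed minimal form; the color-distribution model and its on-line learning described in the abstract are omitted, and the function names and frame data are hypothetical.

```python
def difference_mask(previous, current, threshold):
    """Mark pixels whose grayscale value changed by more than threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, marked in enumerate(row) if marked]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# Two tiny 4x4 grayscale frames; a bright object appears in the second one.
previous = [[0] * 4 for _ in range(4)]
current = [[0, 0, 0, 0],
           [0, 200, 200, 0],
           [0, 0, 200, 0],
           [0, 0, 0, 0]]
roi = bounding_box(difference_mask(previous, current, threshold=50))
```

A fixed threshold like this is exactly what breaks under illumination change, which is the motivation for combining the difference image with a color model that adapts over time.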

Automatic Information Extraction for Structured Web Documents (구조화된 웹 문서에 대한 자동 정보추출)

  • Yun, Bo-Hyun
    • Journal of Internet Computing and Services / v.6 no.3 / pp.129-145 / 2005
  • This paper proposes a web information extraction system that automatically extracts pre-defined information from web documents (i.e., HTML documents) and integrates the extracted information. The system recognizes entities without labels using a probabilistic entity recognition method and semiautomatically extends the existing domain knowledge using the extracted data. Moreover, the system extracts sub-linked information linked to the basic page and integrates similar results extracted from heterogeneous sources. The experimental results show that the system extracts the sub-linked information and that using probabilistic entity recognition significantly improves precision over a system using only the domain knowledge. Moreover, the presented system can extract more varied information precisely because it can be applied flexibly according to the domain. Because both the semiautomatic domain knowledge expansion and the probabilistic entity recognition improve the quality of the information, the system can maximize user satisfaction. Thus, this system can satisfy the intellectual curiosity of users of movie sites, performance sites, and dining sites. It can also be used to construct various comparison shopping malls and contribute to the revitalization of e-business.
