• Title/Summary/Keyword: Data Mining Ontology

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management / v.26 no.1 / pp.65-75 / 2019
  • As the use of the XML-based semantic web grows in the field of data management, many studies have applied association rule mining to extract useful information from data stored in ontologies. Unlike a conventional database with a predefined structure, ontology data has a flexible and scalable structure, so data can be expressed freely. That same flexibility makes it difficult to find frequent patterns with a single uniform analysis method. The goal of this study is to provide a basis for extracting useful knowledge from an ontology by applying transaction-based graph mining techniques to the schema graph data and instance graph data that constitute the ontology, and searching them for frequently occurring subgraph patterns. To overcome the structural limitations of existing ontology mining, the proposed methodology takes the frequent pattern search used in graph mining and applies it to the ontology through an iterative node chunking method. Our suggested methodology is expected to play an important role in knowledge extraction.
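The transaction-based support counting mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's iterative node chunking method, which the abstract does not specify. It assumes each graph transaction is a list of labeled edges `(source_label, predicate, target_label)`, and the function names `frequent_edges` and `frequent_edge_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_edges(transactions, min_support):
    """Count, for each labeled edge, how many graph transactions contain it."""
    counts = Counter()
    for edges in transactions:
        counts.update(set(edges))  # count each edge once per transaction
    return {edge: n for edge, n in counts.items() if n >= min_support}


def frequent_edge_pairs(transactions, min_support):
    """Grow frequent single edges into connected two-edge subgraph patterns."""
    frequent = frequent_edges(transactions, min_support)
    counts = Counter()
    for edges in transactions:
        kept = sorted(e for e in set(edges) if e in frequent)
        for e1, e2 in combinations(kept, 2):
            # Two edges form a connected pattern if they share a node label.
            if {e1[0], e1[2]} & {e2[0], e2[2]}:
                counts[(e1, e2)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Illustrative data only: three small instance graphs as edge lists.
transactions = [
    [("Person", "worksFor", "Company"), ("Company", "locatedIn", "City")],
    [("Person", "worksFor", "Company"), ("Company", "locatedIn", "City"),
     ("Person", "owns", "Car")],
    [("Person", "worksFor", "Company")],
]
patterns = frequent_edge_pairs(transactions, min_support=2)
```

Pruning infrequent single edges before pairing them follows the usual Apriori-style argument: a subgraph cannot be frequent unless every edge in it is frequent.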

Semi-Automatic Ontology Generation about XML Documents using Data Mining Method (데이터 마이닝 기법을 이용한 XML 문서의 온톨로지 반자동 생성)

  • Gu Mi-Sug;Hwang Jeong-Hee;Ryu Keun-Ho;Hong Jang-Eui
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.299-308 / 2006
  • As XML becomes the standard for exchanging web documents and public documentation, XML data are increasing in many areas. To retrieve information from XML documents efficiently, the ontology-based semantic web is emerging. Existing ontologies have been constructed manually, which is time-consuming and costly. Therefore, in this paper, we propose a semi-automatic ontology generation technique that uses a data mining technique, association rules. Using the data mining algorithm, the proposed method determines the types and number of conceptual relationships, as well as the ontology domain level, needed for automatic ontology generation. By applying association rules to the XML documents, we find the frequent patterns of XML tags in the documents and, from them, the conceptual relationships used to construct the ontology. Using the conceptual ontology domain level extracted by data mining, we implemented an ontology-based semantic web with XML Topic Maps (XTM) and the topic map engine TM4J.
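As a rough illustration of mining association rules over XML tags, the sketch below treats each document as a transaction made of its tag names and derives pairwise rules with support and confidence. The abstract does not give the authors' algorithm or thresholds, so the function names (`tag_set`, `tag_association_rules`), the threshold values, and the sample documents are all assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations


def tag_set(xml_text):
    """Return the set of element tag names that occur in one XML document."""
    root = ET.fromstring(xml_text)
    return {element.tag for element in root.iter()}


def tag_association_rules(docs, min_support=0.5, min_confidence=0.8):
    """Find rules lhs -> rhs between tags that co-occur across documents."""
    transactions = [tag_set(doc) for doc in docs]
    total = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for tags in transactions:
        item_counts.update(tags)
        pair_counts.update(combinations(sorted(tags), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / total  # fraction of documents containing both tags
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Illustrative documents only.
docs = [
    "<book><title/><author/></book>",
    "<book><title/></book>",
    "<article><title/><author/></article>",
]
rules = tag_association_rules(docs, min_support=0.6, min_confidence=0.8)
```

A rule such as `author -> title` with high confidence suggests a candidate conceptual relationship between the two tags, which a person can then confirm or reject; that review step is what makes the process semi-automatic.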

A Data Mining Approach for a Dynamic Development of an Ontology-Based Statistical Information System

  • Mohamed Hachem Kermani;Zizette Boufaida;Amel Lina Bensabbane;Besma Bourezg
    • Journal of Information Science Theory and Practice / v.11 no.2 / pp.67-81 / 2023
  • This paper presents a dynamic development of an ontology-based statistical information system supporting the collection, storage, processing, analysis, and the presentation of statistical knowledge at the national scale. To accomplish this, we propose a data mining technique to dynamically collect data relating to citizens from publicly available data sources; the collected data will then be structured, classified, categorized, and integrated into an ontology. Moreover, an intelligent platform is proposed in order to generate quantitative and qualitative statistical information based on the knowledge stored in the ontology. The main aims of our proposed system are to digitize administrative tasks and to provide reliable statistical information to governmental, economic, and social actors. The authorities will use the ontology-based statistical information system for strategic decision-making as it easily collects, produces, analyzes, and provides both quantitative and qualitative knowledge that will help to improve the administration and management of national political, social, and economic life.

Ontology based Preprocessing Scheme for Mining Data Streams from Sensor Networks (센서 네트워크의 데이터 스트림 마이닝을 위한 온톨로지 기반의 전처리 기법)

  • Jung, Jason J.
    • Journal of Intelligence and Information Systems / v.15 no.3 / pp.67-80 / 2009
  • With sensors and sensor networks, we can collect environmental information from a given sensor space. To discover more useful information and knowledge, we want to apply data mining methodologies to the sensor data streams from such spaces. In this paper, we present a novel data preprocessing scheme to improve the performance of data mining algorithms. In particular, ontologies are applied to represent the meaning of the sensor data. To evaluate the proposed method, we collected sensor streams for about 30 days and ran simulations on them to compare it with other approaches.

Heterogeneous Lifelog Mining Model in Health Big-data Platform (헬스 빅데이터 플랫폼에서 이기종 라이프로그 마이닝 모델)

  • Kang, JI-Soo;Chung, Kyungyong
    • Journal of the Korea Convergence Society / v.9 no.10 / pp.75-80 / 2018
  • In this paper, we propose a heterogeneous lifelog mining model for a health big-data platform. It is an ontology-based mining model for collecting users' lifelogs in real time and providing healthcare services. The proposed method distributes heterogeneous lifelog data and processes it in real time in a cloud computing environment. The knowledge base is reconstructed by an upper ontology method suitable for the environment constructed based on the heterogeneous ontology. The restructured knowledge base generates inference rules using Jena 4.0 inference engines and provides real-time healthcare services through rule-based inference. Lifelog mining analyzes hidden relationships and builds a predictive model for time-series bio-signals. By exploring negative or positive correlations that are not captured in the relationships or inference rules, it detects changes in users' bio-signals and so enables real-time, preventive healthcare services. The performance evaluation shows that the proposed heterogeneous lifelog mining model is superior to other models, with an accuracy of 0.734 and a precision of 0.752.

Constructing User Preferred Anti-Spam Ontology using Data Mining Technique (데이터 마이닝 기술을 적용한 사용자 선호 스팸 대응 온톨로지 구축)

  • Kim, Jong-Wan;Kim, Hee-Jae;Kang, Sin-Jae
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.2 / pp.160-166 / 2007
  • When an email arrives, each user's response may differ according to his or her preferences. This paper addresses this situation by constructing a user-preferred ontology for anti-spam systems. To define an ontology for describing user behaviors, we applied associative classification mining to learn users' preference information and their responses to emails. The generated classification rules can be represented in a formal ontology language. A user-preferred ontology can explain in a meaningful way why a mail is judged to be spam or non-spam. We also suggest a new rule optimization procedure, inspired by logic synthesis, to improve comprehensibility and exclude redundant rules.

An Ontology-Based Labeling of Influential Topics Using Topic Network Analysis

  • Kim, Hyon Hee;Rhee, Hey Young
    • Journal of Information Processing Systems / v.15 no.5 / pp.1096-1107 / 2019
  • In this paper, we present an ontology-based approach to labeling influential topics of scientific articles. First, to find influential topics in scientific articles, topic modeling is performed, and then social network analysis is applied to the selected topic models. Abstracts of research papers related to data mining published over the 20 years from 1995 to 2015 are collected and analyzed in this research. Second, to interpret and explain the selected influential topics, the UniDM ontology is constructed from Wikipedia and serves as the concept hierarchy for the topic models. Our experimental results show that the subjects of data management and queries fall in the most interrelated topic, followed by recommender systems and text mining. Also, the subjects of recommender systems and context-aware systems belong to the most influential topic, and the subject of the k-nearest neighbor classifier belongs to the topic closest to the other topics. The proposed framework provides a general model for interpreting topics in topic models, which plays an important role in overcoming ambiguous and arbitrary interpretation of topics in topic modeling.

Practical Text Mining for Trend Analysis: Ontology to visualization in Aerospace Technology

  • Kim, Yoosin;Ju, Yeonjin;Hong, SeongGwan;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.8 / pp.4133-4145 / 2017
  • Advances in science and technology are leading us to a better life but are also forcing us to invest more. The government has therefore funded promising future technologies, and for several decades a large share of government resources has gone into science and technology R&D projects. However, the performance of these public investments remains unclear in many ways, so the planning and evaluation of new investments should rest on data-driven decisions backed by fact-based evidence. To this end, the government has sought evidence on trends and issues in science and technology and has accumulated a large database of research papers, patents, project reports, and R&D information. This database now supports activities such as policy planning, budget allocation, and investment evaluation for science and technology, but the quality of the information falls short of expectations because text mining has been limited in its ability to extract information from unstructured data such as reports and papers. To solve the problem, this study proposes a practical text mining methodology for science and technology trend analysis and, taking aerospace technology as a case, applies text mining methods such as ontology development, topic analysis, network analysis, and visualization of their results.

PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities

  • Kim, Tae-Kyung;Oh, Jeong-Su;Ko, Gun-Hwan;Cho, Wan-Sup;Hou, Bo-Kyeng;Lee, Sang-Hyuk
    • Interdisciplinary Bio Central / v.3 no.2 / pp.7.1-7.6 / 2011
  • Background: Published manuscripts are the main source of biological knowledge. Since manual examination is almost impossible due to the huge volume of literature data (approximately 19 million abstracts in PubMed), intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because of i) providing abstract-based search rather than sentence-based search, ii) improper use or lack of ontology terms, iii) a design intended for specific subjects, or iv) slow response times that hamper web services and real-time applications. Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations of these. Furthermore, PubMine allows users to extract multi-dimensional relationships between genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy have been included in the current version of PubMine, which is freely available at http://pubmine.kobic.re.kr. Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of biological entity relationships. We believe that PubMine would serve as a key bioinformatics utility because its rapid response enables web services for the community and its flexibility accommodates general ontologies.

A Web-Based Domain Ontology Construction Modelling and Application in the Wetland Domain

  • Xing, Jun;Han, Min
    • Journal of Korea Multimedia Society / v.10 no.6 / pp.754-759 / 2007
  • Building ontologies from Web resources not only significantly shortens the ontology construction period but also enhances the quality of the ontology. Remarkable progress has been achieved in this regard, but existing approaches encounter similar difficulties, such as Web data extraction and knowledge acquisition. This paper examines the characteristics of ontology construction data, including dynamics, largeness, variation, and openness, as well as the fundamental issue of ontology construction: the formalized representation method. The key technologies used in ontology construction and the difficulties with it are then summarized. A software model, OntoMaker (Ontology Maker), is designed. The model is innovative in two regards: (1) improved generality, in that the meta learning machine dynamically picks appropriate ontology learning methodologies for data from different domains, thus optimizing the results; and (2) merged processing of (semi-)structured and unstructured data. In addition, as wetland researchers know, information sharing is vital to wetland exploitation and protection, and wetland ontology construction is the basic task for information sharing. OntoMaker constructs the wetland ontologies, and the model in this work can also be applied to other environmental domains.
