• Title/Summary/Keyword: graph mining (그래프 마이닝)

Search Result 71, Processing Time 0.023 seconds

A News Video Mining based on Multi-modal Approach and Text Mining (멀티모달 방법론과 텍스트 마이닝 기반의 뉴스 비디오 마이닝)

  • Lee, Han-Sung;Im, Young-Hee;Yu, Jae-Hak;Oh, Seung-Geun;Park, Dai-Hee
    • Journal of KIISE:Databases / v.37 no.3 / pp.127-136 / 2010
  • With the rapid growth of information and computer communication technologies, the number of digital documents, including multimedia data, has exploded in recent years. In particular, news video databases and news video mining have become the subject of extensive research aimed at developing effective and efficient tools for manipulating and analyzing news videos, which are rich in information. However, most research has focused on browsing, retrieval, and summarization of news videos; work on discovering and analyzing the abundant latent semantic knowledge in news videos is still at a relatively early stage. In this paper, we propose a news video mining system based on a multi-modal approach and text mining, which uses the visual and textual information of news video clips and their scripts. The proposed system automatically constructs a taxonomy of news video stories using a hierarchical clustering algorithm, one of the text mining methods. It then analyzes the topics of news video stories from multiple angles by means of a time-cluster trend graph, a weighted cluster growth index, and network analysis. To validate our approach, we analyzed news videos on "The Second Summit of South and North Korea in 2007".

Automatic Keyword Extraction using Hierarchical Graph Model Based on Word Co-occurrences (단어 동시출현관계로 구축한 계층적 그래프 모델을 활용한 자동 키워드 추출 방법)

  • Song, KwangHo;Kim, Yoo-Sung
    • Journal of KIISE / v.44 no.5 / pp.522-536 / 2017
  • Keyword extraction can be used in text mining of massive document collections to efficiently extract subject words or related words from a document. In this study, we propose a hierarchical graph model based on the co-occurrence relationships, the intrinsic dependency relationships between words, and the common sub-words in a single document. In addition, we propose an enhanced TextRank algorithm that reflects the influence of outgoing edges as well as incoming edges. We then propose a novel keyword extraction scheme that uses the hierarchical graph model and the enhanced TextRank algorithm to extract representative keywords from a single document. In the experiments, several evaluation methods were applied to documents on various subjects to verify the accuracy and adaptability of the proposed scheme. The results show that the proposed scheme outperforms previous schemes.
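
For orientation, the baseline that this abstract builds on can be sketched as follows: a word co-occurrence graph scored with plain TextRank (PageRank on an undirected graph). This is not the authors' hierarchical model or their enhanced TextRank; the function names, the window size, and the damping factor of 0.85 are assumptions chosen for illustration.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Link each word to the words that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Plain TextRank: each node shares its score equally among its neighbours."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            incoming = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores


def top_keywords(tokens, k=3, window=2):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    scores = textrank(cooccurrence_graph(tokens, window))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

In practice the token list would first be filtered (for example, to nouns and adjectives) before the graph is built; that preprocessing step is omitted here.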

The performance of Bayesian network classifiers for predicting discrete data (이산형 자료 예측을 위한 베이지안 네트워크 분류분석기의 성능 비교)

  • Park, Hyeonjae;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.309-320 / 2020
  • Bayesian networks, also known as directed acyclic graphs (DAG), are used in many areas, such as medicine, meteorology, and genetics, because relationships between variables can be modeled with graphs and probabilities. In particular, Bayesian network classifiers, which are used to predict discrete data, have recently emerged as a new data mining method. Bayesian networks can be grouped into different models depending on the structure learning method. In this study, Bayesian network models are learned with various structure learning methods and compared with the simplest model, the naïve Bayes model. Classification results are compared by applying the learned models to various real data sets. This study also compares the relationships between variables in the data through the graphs produced by each model.

Application for Predicting Candidate on Election Broadcasting - A Case Study on the 20th Assembly Election - (선거방송을 위한 선거후보 당선자 예측 어플리케이션 - 제 20 대 국회의원 선거에 적용한 연구 -)

  • Yang, Geunseok;Gu, Jinwon;Roh, Minchul;Shin, Yongwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.95-98 / 2016
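  • The 20th National Assembly election, the flower of democracy, has come to a close. In that election, not only broadcasters but also political parties incurred enormous expense and effort. For example, in the April 13 general election (the 20th National Assembly election), the three major broadcasters spent more than about 6.6 billion won on exit polls, and political parties spent more than about 7 billion won on opinion polls. To reduce these large costs and the effort of the staff involved, this paper proposes an application that predicts winning candidates by applying text mining and sentiment analysis. First, a social graph model is introduced to discover regional structure. Second, text mining techniques are used to process candidate-related data. Third, candidate information is quantified through text sentiment analysis. To evaluate the performance and efficiency of the approach, we conducted a case study on the 20th National Assembly election. The proposed method showed worthwhile effectiveness in terms of accuracy and statistical verification. We expect that introducing a candidate prediction tool for election broadcasting will help reduce the large costs and effort of future elections and election broadcasts.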

Mining Frequent Service Patterns using Graph (그래프를 이용한 빈발 서비스 탐사)

  • Hwang, Jeong-Hee
    • Journal of Digital Contents Society / v.19 no.3 / pp.471-477 / 2018
  • Users' interests change over time. In this paper, we propose a method for providing suitable services to users by dynamically weighting service interests according to age, timing, and seasonal changes in a ubiquitous environment. Based on the history of services presented to users according to age or season, we also offer useful services by continuously adding the most recent service rules to reflect changes in service interest. To do this, a set of services is treated as a transaction and each service as an item in the transaction. We also represent the associations among services as a graph and extract frequent service items that reflect the latest information services for users.
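
The transaction-and-graph framing in this abstract can be illustrated with a minimal sketch: each transaction is a set of services, each pair of services used together becomes a weighted edge, and edges that meet a minimum support are kept. The function name, the `min_support` threshold, and the sample service names are assumptions; the paper's dynamic weighting by age, timing, and season is not modeled here.

```python
from collections import Counter
from itertools import combinations


def frequent_service_edges(transactions, min_support=2):
    """Count how often each pair of services co-occurs and keep the frequent pairs.

    The result can be read as a weighted graph: keys are edges between two
    services, values are the number of transactions containing both.
    """
    edge_counts = Counter()
    for services in transactions:
        # Deduplicate and sort so each pair has one canonical orientation.
        for pair in combinations(sorted(set(services)), 2):
            edge_counts[pair] += 1
    return {pair: n for pair, n in edge_counts.items() if n >= min_support}


# Hypothetical service history: one list per transaction.
history = [
    ["news", "weather"],
    ["news", "weather", "traffic"],
    ["traffic", "news"],
]
frequent = frequent_service_edges(history, min_support=2)
```

Reflecting recency, as the abstract describes, could be approximated by replacing the unit increment with a weight that decays with the age of the transaction.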

EDF: An Interactive Tool for Event Log Generation for Enabling Process Mining in Small and Medium-sized Enterprises

  • Frans Prathama;Seokrae Won;Iq Reviessay Pulshashi;Riska Asriana Sutrisnowati
    • Journal of the Korea Society of Computer and Information / v.29 no.6 / pp.101-112 / 2024
  • In this paper, we present EDF (Event Data Factory), an interactive tool designed to assist event log generation for process mining. EDF integrates various data connectors to improve its capability to assist users in connecting to diverse data sources. Our tool employs low-code/no-code technology, along with graph-based visualization, to help non-expert users understand process flow and enhance the user experience. By utilizing metadata information, EDF allows users to efficiently generate an event log containing case, activity, and timestamp attributes. Through log quality metrics, our tool enables users to assess the generated event log quality. We implement EDF under a cloud-based architecture and run a performance evaluation. Our case study and results demonstrate the usability and applicability of EDF. Finally, an observational study confirms that EDF is easy to use and beneficial, expanding small and medium-sized enterprises' (SMEs) access to process mining applications.
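
As a rough illustration of what an event log with case, activity, and timestamp attributes looks like, the sketch below maps raw records onto those three attributes and orders them per case. It does not reflect EDF's actual interface; the function name, the column-mapping parameters, and the sample field names are hypothetical.

```python
from datetime import datetime


def build_event_log(rows, case_col, activity_col, timestamp_col):
    """Map source rows to (case, activity, timestamp) events, sorted per case.

    Rows missing any of the three mapped fields are skipped.
    Timestamps are assumed to be ISO 8601 strings.
    """
    events = [
        {
            "case": row[case_col],
            "activity": row[activity_col],
            "timestamp": datetime.fromisoformat(row[timestamp_col]),
        }
        for row in rows
        if row.get(case_col) and row.get(activity_col) and row.get(timestamp_col)
    ]
    return sorted(events, key=lambda e: (e["case"], e["timestamp"]))


# Hypothetical source records, e.g. exported from an order table.
orders = [
    {"order_id": "A", "step": "ship", "at": "2024-01-02T10:00:00"},
    {"order_id": "A", "step": "pay", "at": "2024-01-01T09:00:00"},
    {"order_id": "", "step": "pay", "at": "2024-01-01T09:30:00"},
]
event_log = build_event_log(orders, "order_id", "step", "at")
```

The share of skipped rows is one simple quantity a log quality check could report, though the metrics EDF uses are not specified in the abstract.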

Morphology Representation using STT API in Rasbian OS (Rasbian OS에서 STT API를 활용한 형태소 표현에 대한 연구)

  • Woo, Park-jin;Im, Je-Sun;Lee, Sung-jin;Moon, Sang-ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.373-375 / 2021
  • For Korean, tagging based on word-level tokenization, as is done for English, has less potential than it does for English. A corpus tokenized into morpheme units with KoNLPy can be represented as a graph database, but converting the module from graph database to corpus requires full separation of the voice files and verification of practicality. In this paper, we show morphology representation using an STT API on a Raspberry Pi. A voice file converted to a corpus is analyzed and tagged with KoNLPy. The analysis results are represented in a graph database and can be divided into morpheme-level tokens; by assessing practicality and the degree of separation, we judge that data mining for specific purposes is feasible.

Study on Domain-dependent Keywords Co-occurring with the Adjectives of Non-deterministic Opinion (휴먼 오피니언 자동 분류 시스템 구현을 위한 비결정 오피니언 형용사 구문에 대한 연구)

  • Ahn, Ae-Lim;Han, Yong-Jin;Park, Se-Young;Nam, Jee-Sun
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.248-251 / 2011
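  • As part of research on opinion mining, which analyzes opinion sentences about specific products in web documents, this study addresses the handling of 'non-deterministic opinion words', whose polarity varies with the feature noun they co-occur with. Restricting the domain to 'restaurants' (맛집), we determined the list of co-occurring domain keywords and established the lexico-syntactic relations among them using the Local Grammar Graphs methodology.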

Large-Scale Bayesian Genetic Network Learning for Pharmacogenomics (Pharmacogenomics를 위한 대규모 베이지안 유전자망 학습)

  • 황규백;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2001.10b / pp.139-141 / 2001
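  • Pharmacogenomics is the study of the relationship between an individual's genetic makeup and response to drugs. To this end, large amounts of biological data, including DNA microarray data, are being accumulated, and various machine learning and data mining techniques are being used to analyze such large-scale data. In this paper, we present the Bayesian network as an efficient means of analyzing biological data for pharmacogenomics. A Bayesian network is a probabilistic graphical model that represents probabilistic relationships among many variables, and it is well suited to analyzing the probabilistic dependencies between gene expression and drug response. The relationships represented by the Bayesian genetic network learned from the NCI60 cell lines dataset include many actual relationships verified through biological experiments, which indicates that Bayesian genetic network analysis can efficiently identify approximate gene-gene, drug-drug, and gene-drug relationships.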

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management / v.26 no.1 / pp.65-75 / 2019
  • As the use of the XML-based semantic web increases in data management, many studies have tried to extract useful information from data stored in ontologies using association rule mining. Ontology data has the advantage that data can be expressed freely, because its structure is flexible and scalable, unlike a conventional database with a predefined structure. On the other hand, this makes it difficult to find frequent patterns with a uniform analysis method. The goal of this study is to provide a basis for extracting useful knowledge from ontologies by searching for frequently occurring subgraph patterns, applying transaction-based graph mining techniques to the schema graph data and instance graph data that constitute an ontology. To overcome the structural limitations of existing ontology mining, the proposed frequent pattern search methodology adapts graph mining methods to ontologies by applying an iterative node chunking method. We expect the proposed methodology to play an important role in knowledge extraction.