• Title/Summary/Keyword: Open data mining

Data Standardization for the Enhanced Utilization of Public Government Data (활용성 제고를 위한 공공데이터 표준화 연구)

  • Kim, Eun Jin;Kim, Minsu;Kim, Hee-Woong
    • Knowledge Management Research / v.20 no.4 / pp.23-38 / 2019
  • The Korean government has sought to create new economic value and jobs by opening government data and promoting its use. However, most open government data has a low utilization rate. Although the standardization of open government data is a major factor behind this inactivity, empirical research on open government data itself remains insufficient. Against this background, this paper aims to identify priority areas for opening data and suggests realistic directions for standardizing open government data. Text mining and social network analysis are used to analyze open government data and its standardization. The research offers practical guidance to open government data managers, from data selection to the direction of standardization. It also has academic implications for knowledge management systems in that it derives standardization directions using a variety of techniques.

Finding Frequent Itemsets based on Open Data Mining in Data Streams (데이터 스트림에서 개방 데이터 마이닝 기반의 빈발항목 탐색)

  • Chang, Joong-Hyuk;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.10D no.3 / pp.447-458 / 2003
  • Conventional data mining assumes that the data set for a knowledge discovery process is fixed and available before the process begins. This assumption holds only when the target of mining is the static knowledge embedded in a specific data set. In addition, conventional methods need considerable computing time to produce a result from a large data set. For these reasons, they are almost impossible to apply to real-time analysis of a data stream, where new transactions are generated continuously and an up-to-date mining result that includes them is needed as quickly as possible. This paper proposes a new mining concept for this purpose: open data mining in a data stream. In open data mining, whenever a new transaction is generated, the updated mining result over all transactions, including the new one, is obtained instantly. To implement this mechanism efficiently, it is necessary to combine delayed insertion of newly identified information from recent transactions with pruning of insignificant information from the mining result of past transactions. The proposed algorithm is analyzed through a series of experiments to identify its characteristics.
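
The abstract above names two mechanisms, delayed insertion and pruning, without giving details. The following is only a minimal sketch of how such a one-pass counter might look; it is not the authors' algorithm. The class name, the thresholds (`min_support`, `prune_support`, `prune_every`), and the restriction to single items and item pairs are all assumptions made for illustration.

```python
from itertools import combinations


class StreamingItemsetCounter:
    """Illustrative one-pass counter for items and item pairs (hypothetical)."""

    def __init__(self, min_support=0.5, prune_support=0.3, prune_every=4):
        self.min_support = min_support      # assumed reporting threshold
        self.prune_support = prune_support  # assumed threshold for keeping an entry
        self.prune_every = prune_every      # assumed pruning interval (transactions)
        self.n = 0                          # transactions seen so far
        self.counts = {}                    # frozenset of items -> occurrence count

    def _significant(self, key):
        return self.counts.get(key, 0) / self.n >= self.prune_support

    def add(self, transaction):
        """Update the mining result with one newly generated transaction."""
        self.n += 1
        items = sorted(set(transaction))
        for item in items:
            key = frozenset([item])
            self.counts[key] = self.counts.get(key, 0) + 1
        for a, b in combinations(items, 2):
            pair = frozenset([a, b])
            if pair in self.counts:
                self.counts[pair] += 1
            # Delayed insertion: start tracking a pair only once both of its
            # items are significant. Occurrences before that point are not
            # counted, so a late-inserted pair's support is an underestimate.
            elif self._significant(frozenset([a])) and self._significant(frozenset([b])):
                self.counts[pair] = 1
        if self.n % self.prune_every == 0:
            self._prune()

    def _prune(self):
        # Pruning: drop entries whose support fell below the keep threshold.
        self.counts = {k: c for k, c in self.counts.items()
                       if c / self.n >= self.prune_support}

    def frequent(self):
        """Return the currently frequent itemsets with their support."""
        return {k: c / self.n for k, c in self.counts.items()
                if c / self.n >= self.min_support}


counter = StreamingItemsetCounter()
for tx in [["a", "b"], ["a", "b", "c"], ["a", "b"], ["a", "d"]]:
    counter.add(tx)
result = counter.frequent()  # available immediately after each transaction
```

The design choice here is that `frequent()` never rescans past transactions: each call to `add` does work proportional to the size of the new transaction, plus an occasional pruning pass over the tracked entries.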

VIP-targeted CRM strategies in an open market

  • Lee, Hanjun;Shim, Beomsoo;Suh, Yongmoo
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.229-241 / 2015
  • Open markets, which give sellers and consumers an online place to make transactions over the Internet, have become a prevalent sales channel because of their convenience and relatively low prices. However, few studies address CRM strategies based on VIP consumers in an open market, even though understanding VIP consumers' behavior is important for increasing revenue. We therefore propose CRM strategies targeted at VIP customers, obtained by analyzing the transaction data of VIP customers from an open market using data mining techniques. To that end, we first define VIP customers in terms of recency, frequency, and monetary (RFM) values. We then use data mining techniques to develop a model that best classifies customers into VIPs or non-VIPs and identifies the influential factors. We also validate the effectiveness of each promotion type and identify association rules among the types. Finally, based on the findings from these experiments, we propose strategies from the perspectives of CRM dimensions for the open market to thrive.
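
As a rough illustration of the RFM definition mentioned in the abstract above, the sketch below derives recency, frequency, and monetary values from a transaction log and applies a simple threshold rule. The function names, the record layout, and the VIP cutoffs are hypothetical; the abstract does not state the thresholds or the classifier the authors used.

```python
from datetime import date


def rfm_table(transactions, today):
    """Compute RFM values per customer.

    transactions: iterable of (customer_id, purchase_date, amount) tuples
    (an assumed layout). Recency is the number of days since the last purchase.
    """
    summary = {}
    for customer, day, amount in transactions:
        last, freq, total = summary.get(customer, (None, 0, 0.0))
        if last is None or day > last:
            last = day
        summary[customer] = (last, freq + 1, total + amount)
    return {
        customer: {
            "recency": (today - last).days,
            "frequency": freq,
            "monetary": total,
        }
        for customer, (last, freq, total) in summary.items()
    }


def is_vip(rfm, max_recency=30, min_frequency=3, min_monetary=500.0):
    """Label a customer as VIP using assumed, illustrative cutoffs."""
    return (rfm["recency"] <= max_recency
            and rfm["frequency"] >= min_frequency
            and rfm["monetary"] >= min_monetary)


log = [
    ("c1", date(2015, 1, 5), 200.0),
    ("c1", date(2015, 1, 20), 150.0),
    ("c1", date(2015, 1, 28), 200.0),
    ("c2", date(2014, 11, 1), 900.0),
]
table = rfm_table(log, today=date(2015, 1, 31))
vips = [customer for customer, rfm in table.items() if is_vip(rfm)]
```

In this toy log only `c1` meets all three cutoffs; `c2` spends more in total but has a single, older purchase.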

Designing Cost Effective Open Source System for Bigdata Analysis (빅데이터 분석을 위한 비용효과적 오픈 소스 시스템 설계)

  • Lee, Jong-Hwa;Lee, Hyun-Kyu
    • Knowledge Management Research / v.19 no.1 / pp.119-132 / 2018
  • Many advanced products and services are emerging in the market thanks to data-based technologies such as the Internet of Things (IoT), big data, and AI. Building a data processing system in an IoT network environment is not simple to configure and is heavily constrained by the high cost of a high-performance server environment. In this paper, we therefore design a low-cost, practical development environment for a large-scale data analysis platform using open source software. Specifically, this study implements a big data processing system using Raspberry Pi, an ultra-small PC environment, and open source APIs. The work includes building a portable server system, building a web server for web mining, developing Python IDE classes for crawling, and developing R libraries for NLP and visualization. Through this research, we will develop a web environment that can control real-time collection and analysis of web media data in a mobile environment, and present it as a curriculum for non-IT specialists.

Learning process mining techniques based on open education platforms (개방형 e-Learning 플랫폼 기반 학습 프로세스 마이닝 기술)

  • Kim, Hyun-ah
    • The Journal of the Convergence on Culture Technology / v.5 no.2 / pp.375-380 / 2019
  • In this paper, we study learning process mining and analytics technology based on open education platforms. The study concerns mining of personal learning history log data from open education platforms such as MOOCs, which have recently attracted growing interest. The goal is to design and implement a learning process mining framework for discovering and analyzing meaningful learning processes and knowledge from learning history log data. The framework expresses, extracts, analyzes, and visualizes the learning process in order to provide learners with improved learning processes and educational services.

A Sliding Window Technique for Open Data Mining over Data Streams (개방 데이터 마이닝에 효율적인 이동 윈도우 기법)

  • Chang, Joong-Hyuk;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.12D no.3 s.99 / pp.335-344 / 2005
  • Recently, open data mining methods have been actively proposed for data streams, which are massive unbounded sequences of data elements generated continuously at a rapid rate. Knowledge embedded in a data stream is likely to change over time, so quickly identifying recent changes can provide valuable information for analyzing the stream. This paper proposes a sliding window technique for finding recently frequent itemsets that can be applied efficiently in open data mining. In the proposed technique, memory usage is kept small by delayed-insertion and pruning operations, and the mining result can be found in a short time because the data elements within the target range are not traversed repeatedly. Moreover, because the technique focuses on recent data elements, it can capture recent changes in the data stream.
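
The point in the abstract above that elements within the window "are not traversed repeatedly" can be illustrated with incremental counting: counts are adjusted when a transaction enters the window and again when it expires. This is a simplified sketch for single items only, not the authors' technique; the class name and parameters are assumptions.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Track item frequencies over the most recent transactions (hypothetical)."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size  # number of recent transactions kept
        self.min_support = min_support  # assumed support threshold in [0, 1]
        self.window = deque()
        self.counts = Counter()

    def add(self, transaction):
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)        # each distinct item counted once
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(expired)
            # Remove items that no longer occur in the window to bound memory.
            for item in expired:
                if self.counts[item] <= 0:
                    del self.counts[item]

    def frequent_items(self):
        size = len(self.window)
        if size == 0:
            return set()
        return {item for item, count in self.counts.items()
                if count / size >= self.min_support}


recent = SlidingWindowCounter(window_size=3, min_support=0.6)
for tx in [["a", "b"], ["a", "b"], ["a", "c"], ["c", "d"], ["c", "d"]]:
    recent.add(tx)
now_frequent = recent.frequent_items()
```

Each update touches only the arriving and the expiring transaction, so the cost per transaction does not grow with the window size.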

A Benchmark of Open Source Data Mining Package for Thermal Environment Modeling in Smart Farm(R, OpenCV, OpenNN and Orange) (스마트팜 열환경 모델링을 위한 Open source 기반 Data mining 기법 분석)

  • Lee, Jun-Yeob;Oh, Jong-wo;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.168-168 / 2017
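  • Despite the growing number of environmental sensors, imaging devices, and feeding management systems in ICT-converged smart farms, technology for making proper and effective use of the data obtained from this equipment remains inadequate. For pig houses, data analysis and modeling technology is needed that can monitor and predict livestock welfare levels and growth changes in real time. This requires technology that detects physiological and behavioral changes in livestock early and that monitors, analyzes, and predicts their welfare level in real time; data mining is one of the representative information and communication engineering approaches to this end. Among the various software packages available for data mining research, four tools provided as open source were compared and analyzed. As factors to consider in data analysis aimed at thermal environment modeling in a smart pig house, the analysis focused on the time required to derive a data analysis algorithm, visualization capabilities, and interoperability with other libraries. The four selected tools are 1) R (https://cran.r-project.org), 2) OpenCV (http://opencv.org), 3) OpenNN (http://www.opennn.net), and 4) Orange (http://orange.biolab.si). The comparison was run on Linux Ubuntu 16.04.4 LTS (x64) with a 3.6 GHz CPU and 64 GB of memory. In terms of development languages, the tools support 1) R script, 2) C/C++, Python, and Java, 3) C++, and 4) C/C++, Python, and Cython, so C/C++ and Python were at a relative advantage. For data analysis algorithms, tools that provide libraries at the source-code level allow cross-platform development, so results developed on one operating system could be used on others without a separate porting step. As for built-in libraries, R provides the largest number of data mining algorithms. This was judged to be because the R environment itself is open, so new libraries added online are shared through the cloud. OpenCV was strong in image processing, and OpenNN's strength is that it publishes its neural network training libraries at the source-code level. Unlike the other packages, which focus on providing a set of libraries, Orange's strength is its integrated user interface, including visualization and network construction. Future work will study additional information processing techniques to cope with the time complexity required by thermal environment modeling, with the aim of implementing smart farm thermal environment modeling in real time.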

Design and Implementation of an Open Object Management System for Spatial Data Mining (공간 데이타 마이닝을 위한 개방형 객체 관리 시스템의 설계 및 구현)

  • Yun, Jae-Kwan;Oh, Byoung-Woo;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.1 no.1 s.1 / pp.5-18 / 1999
  • Recently, the need for automatic knowledge extraction from spatial data stored in spatial databases has increased. Spatial data mining can be defined as the extraction of implicit knowledge, spatial relationships, or other knowledge not explicitly stored in spatial databases. Extracting useful knowledge from spatial data requires an object management system that can store spatial data efficiently, provide very fast indexing and searching mechanisms, and support a distributed computing environment. In this paper, we designed and implemented an open object management system for spatial data mining that supports efficient management of spatial, aspatial, and knowledge data. To develop this system, we used Open OODB, a widely used object management system. However, because Open OODB lacks facilities for spatial data mining, we extended it to support spatial data types, dynamic class generation, object-oriented inheritance, spatial indexes, spatial operations, and so on. In addition, to further increase interoperability with other spatial database management systems and data mining systems, we adopted international standards such as ODMG 2.0 for data modeling, SDTS (Spatial Data Transfer Standard) for modeling and exchanging spatial data, and the OpenGIS Simple Features Specification for CORBA for connecting clients and servers efficiently.

Model test on slope deformation and failure caused by transition from open-pit to underground mining

  • Zhang, Bin;Wang, Hanxun;Huang, Jie;Xu, Nengxiong
    • Geomechanics and Engineering / v.19 no.2 / pp.167-178 / 2019
  • Open-pit (OP) and underground (UG) mining are usually used to exploit shallow and deep ore deposits, respectively. When an ore deposit starts in the shallow subsurface and extends to great depth, sequential use of OP and UG mining is an efficient and economical way to maintain mining productivity. However, a transition from OP to UG mining can induce significant rock movements that cause slope instability in the open pit. Based on the Yanqianshan Iron Mine, which was in transition from OP to UG mining, a large-scale two-dimensional (2D) model test was built according to similarity theory. UG mining was then carried out in the model to mimic the transition from OP to UG mining, to reveal the triggered rock movement, and to assess the associated slope instability. The deformations, movements, and strains of the rock slope during mining were monitored by jointly using three-dimensional (3D) laser scanning, distributed fiber optics, and digital photogrammetry. The data showed that the transition from OP to UG mining led to significant slope movements and deformations that can trigger catastrophic slope failure. The progressive movement of the slope could be divided into three stages: onset of micro-fracture, propagation of tensile cracks, and overturning and/or sliding of the slope. The failure mode depended on the orientation of the structural joints of the rock mass as well as the formation of tension cracks. This study also showed that these non-contact monitoring technologies are valid methods for acquiring interior strain and external deformation with high precision.

A Public Open Civil Complaint Data Analysis Model to Improve Spatial Welfare for Residents - A Case Study of Community Welfare Analysis in Gangdong District - (거주민 공간복지 향상을 위한 공공 개방 민원 데이터 분석 모델 - 강동구 공간복지 분석 사례를 중심으로 -)

  • Shin, Dongyoun
    • Journal of KIBIM / v.13 no.3 / pp.39-47 / 2023
  • This study aims to introduce a model for enhancing community well-being through the utilization of public open data. To objectively assess abstract notions of residential satisfaction, text data from complaints is analyzed. By leveraging accessible public data, costs related to data collection are minimized. Initially, relevant text data containing civic complaints is collected and refined by removing extraneous information. This processed data is then combined with meaningful datasets and subjected to topic modeling, a text mining technique. The insights derived are visualized using Geographic Information System (GIS) and Application Programming Interface (API) data. The efficacy of this analytical model was demonstrated in the Godeok/Gangil area. The proposed methodology allows for comprehensive analysis across time, space, and categories. This flexible approach involves incorporating specific public open data as needed, all within the overarching framework.