• Title/Summary/Keyword: Structured Data

XBRL-Based Representation and Sharing of Decision Models (XBRL 기반의 의사결정 모형 표현과 공유)

  • Kim, Hyoung-Do;Park, Chan-Kwon;Yum, Ji-Hwan;Lee, Sung-Hoon
    • Journal of Information Technology Applications and Management, v.14 no.2, pp.117-127, 2007
  • Using an exchange standard, we can design an open architecture for the interchange of decision models and data. XML (eXtensible Markup Language) provides a general framework for creating such a standard. Although XML-based model representation languages such as OOSML have been proposed, they are partly limited in expressive capability, flexibility, and generality. This paper proposes a new method for expressing and sharing decision models and data based on XBRL (eXtensible Business Reporting Language), an XML language specialized for business reporting. We have developed an XBRL taxonomy for decision models using the concepts and relationships of a representative modeling framework, SM (Structured Modeling). The method allows data as well as decision models to be expressed in a consistent and flexible manner. Diverse dependencies between components of SM models can also be expressed richly.

A Study on the Integration Between Smart Mobility Technology and Information Communication Technology (ICT) Using Patent Analysis

  • Alkaabi, Khaled Sulaiman Khalfan Sulaiman;Yu, Jiwon
    • Journal of the Korea Society of Computer and Information, v.24 no.6, pp.89-97, 2019
  • This study proposes a method for investigating current patents related to information communication technology and smart mobility to provide insights into future technology trends. The method is based on text-mining clustering analysis and consists of two stages: data preparation and clustering analysis. In the first stage, tokenizing, filtering, stemming, and feature selection are applied to transform the data into a usable format (structured data) and to extract useful information for the next stage. In the second stage, the structured data is partitioned into groups. The K-medoids algorithm is selected over the K-means algorithm for this analysis owing to its advantages in dealing with noise and outliers. The results of the analysis indicate that most current patents focus mainly on smart connectivity and smart guide systems, which play a major role in the development of smart mobility.
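
The clustering stage described in this abstract can be sketched as follows. This is a minimal illustration of the alternating (assign, then update) variant of K-medoids, not the authors' implementation; the function names `assign` and `k_medoids`, the toy data, and the distance function are all assumptions made for the example. Because each cluster center must be an actual data point chosen to minimize total within-cluster distance, a single extreme value pulls the center less than it would pull a mean.

```python
import random


def assign(points, medoids, dist):
    """Return, for each point, the index of its nearest medoid."""
    return [min(medoids, key=lambda m: dist(points[i], points[m]))
            for i in range(len(points))]


def k_medoids(points, k, dist, max_iter=100, seed=0):
    """Alternating K-medoids sketch (hypothetical helper, assumes distinct points).

    Returns (medoid_indices, labels), where labels[i] is the medoid index
    that point i is assigned to.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        new_medoids = []
        for m in medoids:
            members = [i for i, lab in enumerate(labels) if lab == m]
            if not members:
                # Keep the old medoid if its cluster became empty.
                new_medoids.append(m)
                continue
            # The new medoid is the member with the smallest total
            # distance to all other members of the same cluster.
            best = min(members,
                       key=lambda c: sum(dist(points[c], points[j])
                                         for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # converged: medoids did not change
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)


# Toy one-dimensional "feature" values standing in for document vectors.
data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
medoid_idx, labels = k_medoids(data, 2, lambda a, b: abs(a - b))
print(sorted(data[m] for m in medoid_idx), labels)
```

In a real text-mining setting the points would be feature vectors produced by the preparation stage, and the distance would typically be a vector dissimilarity rather than an absolute difference.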

How to improve oil consumption forecast using google trends from online big data?: the structured regularization methods for large vector autoregressive model

  • Choi, Ji-Eun;Shin, Dong Wan
    • Communications for Statistical Applications and Methods, v.29 no.1, pp.41-51, 2022
  • We forecast the US oil consumption level using Google Trends data, that is, the search volumes of specific terms that people search for on Google. We focus on whether proper selection of Google Trends terms improves forecast performance for oil consumption. As forecast models, we consider least absolute shrinkage and selection operator (LASSO) regression and the structured regularization method for the large vector autoregressive (VAR-L) model of Nicholson et al. (2017), which automatically select the Google Trends terms and the lags of the predictors. An out-of-sample forecast comparison reveals that reducing the high-dimensional Google Trends data set to a low-dimensional one with the LASSO and VAR-L models produces better forecast performance for oil consumption than frequently used forecast models such as the autoregressive model, the autoregressive distributed lag model, and the vector error correction model.
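
How an L1 penalty selects predictors automatically can be illustrated with a small coordinate-descent LASSO. This is only a sketch under stated assumptions: the data are toy values, the predictors and response are assumed to be centered (no intercept), and the names `soft_threshold` and `lasso_cd` are hypothetical. It does not reproduce the VAR-L procedure of Nicholson et al. (2017).

```python
def soft_threshold(a, lam):
    """Shrink a toward zero by lam; values in [-lam, lam] become exactly 0."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of rows; columns are assumed centered and not all zero.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of predictor j with the partial residual
            # (the residual computed without predictor j's contribution).
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k]
                                      for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / z
    return b


# Toy data: the response depends strongly on the first predictor and
# only weakly on the second.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]
coef = lasso_cd(X, y, lam=0.5)
print(coef)
```

With this penalty the weak second predictor is shrunk exactly to zero while the first is retained with a reduced coefficient, which is the sense in which the penalty performs variable selection.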

Radioactive waste sampling for characterisation - A Bayesian upgrade

  • Pyke, Caroline K.;Hiller, Peter J.;Koma, Yoshikazu;Ohki, Keiichi
    • Nuclear Engineering and Technology, v.54 no.1, pp.414-422, 2022
  • This paper presents a methodology for combining a Bayesian statistical approach with Data Quality Objectives (a structured decision-making method) to provide increased levels of confidence in analytical data when approaching a waste boundary. The development of sampling and analysis plans for the characterisation of radioactive waste often uses a simple, one-pass statistical approach to underpin the sampling schedule. A Bayesian statistical approach introduces the concept of prior information, giving an adaptive sampling strategy based on previous knowledge. This aligns more closely with the iterative approach demanded by the most commonly used structured decision-making tool in this area (Data Quality Objectives) and has the potential to provide a more fully underpinned justification than the more traditional statistical approach. The approach described was developed in a UK regulatory context but is applied here to a waste stream from the Fukushima Daiichi Nuclear Power Station to demonstrate how the methodology can support decision making on the ultimate disposal option for radioactive waste in a more global context.
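
The role of prior information in an iterative sampling plan can be illustrated with a conjugate Beta-Binomial update. The abstract does not state which model the paper uses, so this is purely an assumed, simplified example: the quantity of interest is taken to be the fraction of samples exceeding a waste-boundary threshold, and the names `update_beta` and `beta_mean` and all counts are hypothetical.

```python
def update_beta(prior_a, prior_b, exceed, total):
    """Update a Beta(prior_a, prior_b) belief about an exceedance fraction.

    exceed: number of samples above the threshold in this campaign
    total:  number of samples analysed in this campaign
    Returns the posterior Beta parameters.
    """
    if not 0 <= exceed <= total:
        raise ValueError("exceed must be between 0 and total")
    return prior_a + exceed, prior_b + (total - exceed)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Start from a flat prior, then fold in two sampling campaigns in turn.
a, b = 1.0, 1.0
a, b = update_beta(a, b, exceed=2, total=10)   # first campaign
print(a, b, beta_mean(a, b))
a, b = update_beta(a, b, exceed=0, total=5)    # second campaign
print(a, b, beta_mean(a, b))
```

The posterior from one campaign becomes the prior for the next, which is the sense in which earlier knowledge can shape how many further samples are worth taking.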

A Method of Predicting Service Time Based on Voice of Customer Data (고객의 소리(VOC) 데이터를 활용한 서비스 처리 시간 예측방법)

  • Kim, Jeonghun;Kwon, Ohbyung
    • Journal of Information Technology Services, v.15 no.1, pp.197-210, 2016
  • With the advent of text analytics, VOC (Voice of Customer) data have become an important resource that gives managers and marketing practitioners access to consumers' hidden opinions and requirements. In other words, making relevant use of VOC data can improve customer responsiveness and satisfaction, each of which eventually improves business performance. However, unstructured data such as the customer complaints in VOC data have seldom been used in marketing practice, for example to predict service time as an index of service quality. This is because VOC data containing unstructured data are complicated in form, and converting unstructured data into structured data is a difficult process. Hence, this study proposes a prediction model that improves the estimation accuracy of the level of customer satisfaction by combining unstructured features obtained through text mining with the structured data features in VOC. We also examine the relationship between unstructured data, structured data, and service processing time through regression analysis. Text mining techniques, sentiment analysis, keyword extraction, classification algorithms, decision trees, and multiple regression are considered and compared. For the experiment, we used actual VOC data from a company.

MPIL: Market prediction through image learning of unstructured and structured data (비정형, 정형 데이터의 이미지 학습을 활용한 시장예측)

  • Lee, Yoon Seon;Lee, Ju Hong;Choi, Bum Ghi;Song, Jae Won
    • Smart Media Journal, v.10 no.2, pp.16-21, 2021
  • Financial time series analysis plays a very important economic and social role in modern society and is an important task affecting global development, but heavy noise and uncertainty make financial time series prediction a difficult research topic. In this paper, we propose a market prediction method (MPIL) that converts unstructured and structured data into images. For market prediction, MPIL analyzes n days of SNS and news data (unstructured data), converts the market data (structured data) into an image with the GADF algorithm, and performs ultra-short-term market prediction by predicting the price on day n+1 through image learning. MPIL has an average accuracy of 56%, which is higher than the 50% average accuracy of an existing market forecasting model that predicts the market with LSTM using sentiment analysis.
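
The step of converting structured market data into an image can be sketched as below, assuming that GADF here refers to the Gramian Angular Difference Field: the series is rescaled to [-1, 1], each value is mapped to an angle with arccos, and pixel (i, j) is sin(phi_i - phi_j). The function name `gadf` and the sample prices are hypothetical, and the abstract does not give the authors' preprocessing details.

```python
import math


def gadf(series):
    """Return the Gramian Angular Difference Field of a 1-D series.

    The result is a square list-of-lists "image" with one row and one
    column per time step.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every value.
    scaled = [(2.0 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against tiny floating-point overshoot.
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    phi = [math.acos(s) for s in scaled]
    return [[math.sin(a - b) for b in phi] for a in phi]


# Four hypothetical daily closing prices become a 4 x 4 image.
image = gadf([101.0, 102.5, 101.8, 103.2])
for row in image:
    print([round(v, 3) for v in row])
```

The resulting matrix is antisymmetric with a zero diagonal, and it can be fed to an image model in the same way as a single-channel picture.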

The effect of missing levels of nesting in multilevel analysis

  • Park, Seho;Chung, Yujin
    • Genomics & Informatics, v.20 no.3, pp.34.1-34.11, 2022
  • Multilevel analysis is an appropriate and powerful tool for analyzing hierarchically structured data and is widely applied in fields ranging from public health to genomics. In practice, however, information on multiple nesting levels may be lost in a multilevel analysis, either because the data fail to capture all levels of the hierarchy or because the top or intermediate levels of the hierarchy are ignored in the analysis. In this study, we consider a multilevel linear mixed effect model (LMM) with single imputation that can involve all levels of the data hierarchy in the presence of missing top or intermediate-level clusters. We evaluate the performance of a multilevel LMM with single imputation and compare it with other models that ignore the data hierarchy or the missing intermediate-level clusters. To this end, we applied a multilevel LMM with single imputation and the other models to hierarchically structured cohort data with some intermediate levels missing and to simulated data with various cluster sizes and missing rates of intermediate-level clusters. A thorough simulation study demonstrated that, in terms of mean squared error and coverage probability, an LMM with single imputation estimates the fixed coefficients and variance components of a multilevel model more accurately than other models that ignore the data hierarchy or missing clusters. In particular, when models ignoring the data hierarchy or missing clusters were applied, the variance components of random effects were overestimated. We observed similar results from the analysis of hierarchically structured cohort data.

New Control System Aspects for Supporting Complex Data and High Performance System

  • Yoo, Dae-Seung;Tan, Vu Van;Yi, Myeong-Jae
    • Journal of Computing Science and Engineering, v.2 no.4, pp.394-411, 2008
  • Data in automation and control systems can be acquired not only from different field devices but also from different OPC (OLE for Process Control) servers. However, current OPC clients can only read and decode simple data from OPC servers; they have difficulty acquiring and exchanging structured data. In addition, in large networked control systems, OPC clients can read, write, and subscribe to thousands of data points from and to OPC servers. Consequently, the most important factor in building a high-performance, scalable industrial control system is the ability to transfer process data between server and client as efficiently and quickly as possible. To solve these problems, we propose a means of implementing an OPC DA (Data Access) server that supports OPC complex data, so that OPC DA clients are able to read and decode any type of data from OPC servers. We also propose a method for caching process data in large industrial control systems to overcome the performance limitations of a pure OPC DA system. The performance analysis and discussion indicate that the proposed system has acceptable performance and is feasible for application to today's real-time industrial systems.

Design of Web Service by Using OPC XML-DA and OPC Complex Data for Automation and Control Systems

  • Tan Vu Van;Yoo Dae-Sung;Yi Myeong-Jae
    • Proceedings of the Korean Information Science Society Conference, 2006.06a, pp.250-252, 2006
  • Web technologies are gaining importance in automation and control systems. However, the choice of Web technologies depends on the use cases in the application environment. In industrial systems, data can be obtained not only from many different field systems and devices but also from different OPC (OLE for Process Control) servers. Current OPC clients can read simple data from an OPC server, but they have problems obtaining structured data and exchanging structured information between collaborating applications. Therefore, the OPC Foundation has defined interfaces for OPC XML-DA (OPC XML Data Access) and OPC Complex Data that aim to solve those problems. OPC XML-DA facilitates the exchange of plant data across the internet and upwards into the enterprise domain. In addition, OPC Complex Data extends the OPC DA specification to allow an OPC client to read and decode any type of data from measurement and control systems on the plant floor. This paper describes the concepts of OPC XML-DA and OPC Complex Data and then proposes a mechanism for implementing OPC Complex Data in an OPC XML-DA server. The paper also discusses the security aspects.

A Comparison of Performance Between MSSQL Server and MongoDB for Telco Subscriber Data Management (통신 가입자 데이터 관리를 위한 MSSQL Server와 NoSQL MongoDB의 성능 비교)

  • Nichie, Aaron;Koo, Heung-Seo
    • The Transactions of The Korean Institute of Electrical Engineers, v.65 no.3, pp.469-476, 2016
  • Relational Database Management Systems have become the de facto database model among most developers and users since the inception of Data Science. Data is generated by IoT devices, sensors, social media, and other sources in structured, semi-structured, and unstructured formats and in huge volumes, which greatly increases the difficulty of data management. Organizations that collect large amounts of data are increasingly turning to non-relational (NoSQL) databases. In this paper, through experiments with real field data, we demonstrate that MongoDB, a document-based NoSQL database, is a better alternative for building a Telco Subscriber Data Management System, which has hitherto been built mainly with Relational Database Management Systems. We compare the existing system with our proposed MongoDB-powered system in various phases of data flow. We show how various workloads at some phases of the existing system were either completely removed or significantly simplified in the new system. Based on the experimental results, using MongoDB to manage telco subscriber data offered better performance than the existing system built with MSSQL Server.