• Title/Summary/Keyword: data-driven model


Denoising Self-Attention Network for Mixed-type Data Imputation (혼합형 데이터 보간을 위한 디노이징 셀프 어텐션 네트워크)

  • Lee, Do-Hoon;Kim, Han-Joon;Chun, Joonghoon
    • The Journal of the Korea Contents Association / v.21 no.11 / pp.135-144 / 2021
  • Recently, data-driven decision-making has become a key technology leading the data industry, and the machine learning behind it requires high-quality training datasets. However, real-world data contains missing values for various reasons, which degrades the performance of prediction models learned from the poor training data. Therefore, in order to build high-performance models from real-world datasets, many studies on automatically imputing missing values in the initial training data have been conducted. Many conventional machine learning-based imputation techniques involve very time-consuming and cumbersome work, because they apply only to numeric columns or build a separate predictive model for each column. This paper therefore proposes a new data imputation technique called 'Denoising Self-Attention Network (DSAN)', which can be applied to mixed-type datasets containing both numerical and categorical columns. DSAN learns robust feature representation vectors by combining self-attention and denoising techniques, and imputes multiple missing variables in parallel through multi-task learning. To verify the validity of the proposed technique, data imputation experiments were performed after arbitrarily generating missing values in several mixed-type training datasets. We then show the validity of the proposed technique by comparing the performance of binary classification models trained on the imputed data, together with the errors between the original and imputed values.
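
The evaluation protocol described in this abstract (arbitrarily generate missing values, impute them, then measure the error against the original values) can be sketched as follows. This is only an illustrative harness, not DSAN itself: the mean/mode imputer is a stand-in baseline, and the function names, the 20% masking rate, and the toy dataset are all assumptions made for the example.

```python
import math
import random

def mask_randomly(rows, rate, rng):
    """Copy rows, replacing each cell with None with probability `rate`."""
    masked, positions = [], []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j in range(len(new_row)):
            if rng.random() < rate:
                new_row[j] = None
                positions.append((i, j))
        masked.append(new_row)
    return masked, positions

def impute_baseline(rows, numeric_cols):
    """Stand-in imputer (NOT DSAN): column mean for numeric, mode for categorical."""
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            continue  # nothing observed in this column; leave it untouched
        if j in numeric_cols:
            fill = sum(observed) / len(observed)
        else:
            fill = max(set(observed), key=observed.count)
        for r in filled:
            if r[j] is None:
                r[j] = fill
    return filled

def score(original, imputed, positions, numeric_cols):
    """RMSE over masked numeric cells, accuracy over masked categorical cells."""
    sq_errors, hits = [], []
    for i, j in positions:
        if j in numeric_cols:
            sq_errors.append((original[i][j] - imputed[i][j]) ** 2)
        else:
            hits.append(original[i][j] == imputed[i][j])
    rmse = math.sqrt(sum(sq_errors) / len(sq_errors)) if sq_errors else 0.0
    accuracy = sum(hits) / len(hits) if hits else 1.0
    return rmse, accuracy

# Hypothetical mixed-type data: two numeric columns and one categorical column.
rng = random.Random(0)
data = [(20 + i, 1000.0 + 50 * i, "A" if i % 3 else "B") for i in range(40)]
masked, positions = mask_randomly(data, rate=0.2, rng=rng)
imputed = impute_baseline(masked, numeric_cols={0, 1})
rmse, accuracy = score(data, imputed, positions, numeric_cols={0, 1})
```

Any imputer with the same input and output shape could be swapped in for `impute_baseline`, so competing techniques are scored on identical masked cells.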

Development of Local Stem Volume Table for Pinus densiflora S. et Z. Using Tree Stem Taper Model (수간곡선 모델을 이용한 소나무의 지방별 수간재적표 개발)

  • Kang, Jin-Taek;Son, Yeong-Mo;Kim, So-Won;Lee, Sun-Jeoung;Park, Hyun
    • Korean Journal of Agricultural and Forest Meteorology / v.16 no.4 / pp.327-335 / 2014
  • Current volume tables may underestimate or overestimate the volumes of individual trees in a specific region because the tables were built from data covering broad regions of South Korea. To solve this problem, this study developed local stem volume tables that reflect local growth patterns and properties, using stem taper equations for the regions of Hongcheon and Yeongju. We developed the local stem volume tables for Pinus densiflora, a widely planted species in South Korea. To derive the most suitable taper equation for estimating stem volume in each region, three models (Max & Burkhart, Kozak, and Parresol et al.) were applied, and their goodness of fit was analyzed statistically using the Fitness Index, Bias, and Standard Error of Bias. The results showed a significant difference among the three models, with the Kozak model having the highest Fitness Index. The Kozak model was therefore chosen for generating the stem taper equation and stem volume tables for P. densiflora. The stem volume tables developed for each region were compared with the current stem volume tables, which are derived from tree growth data obtained throughout the nation. The results showed a significant difference (p = 0.000 < $\alpha = 0.05$) from the current tables in both regions, Hongcheon and Yeongju, and also a significant difference (p = 0.000 < $\alpha = 0.05$) between the two regions.
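
As a rough illustration of how the three taper models could be ranked, the sketch below computes a Fitness Index, a bias, and a standard error of the bias from observed and predicted values. The abstract does not give the exact formulas, so the definitions here are assumptions: the Fitness Index as 1 - SSE/SST, the bias as the mean residual, and the standard error as the sample standard deviation of the residuals divided by the square root of n.

```python
import math

def fit_statistics(observed, predicted):
    """Return assumed goodness-of-fit statistics for one candidate model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)                  # unexplained variation
    sst = sum((o - mean_obs) ** 2 for o in observed)     # total variation
    bias = sum(residuals) / n                            # mean residual
    resid_var = sum((r - bias) ** 2 for r in residuals) / (n - 1)
    return {
        "fitness_index": 1.0 - sse / sst,
        "bias": bias,
        "se_bias": math.sqrt(resid_var / n),
    }

# Hypothetical stem diameters (cm): measured values and one model's predictions.
measured = [10.0, 12.0, 14.0, 16.0]
perfect = fit_statistics(measured, [10.0, 12.0, 14.0, 16.0])
mean_only = fit_statistics(measured, [13.0, 13.0, 13.0, 13.0])
```

Under these definitions, a model that reproduces every measurement scores a Fitness Index of 1, and a model that always predicts the sample mean scores 0.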

An Empirical Investigation into the Role of Core-Competency Orientation and IT Outsourcing Process Management Capability (핵심역량 지향성과 프로세스 관리역량이 IT 아웃소싱 성과에 미치는 연구)

  • Kim, Yong-Jin;Nam, Ki-Chan;Song, Jae-Ki;Koo, Chul-Mo
    • Asia Pacific Journal of Information Systems / v.17 no.3 / pp.131-146 / 2007
  • Recently, the role of IT service providers has expanded from managing a single function or system to reconstructing entire information management processes in new ways that contribute to shareholder value across the enterprise. This movement toward extensive and complex outsourcing agreements has been driven by the assumption that outsourcing information technology functions is a reliable approach to maximizing resource productivity. Hiring external IT service providers to manage part or all of its information-related services helps a firm focus on its core business and provide better services to its clients, thus obtaining sustainable competitive advantage. This practice of focusing on the strategic aspect of outsourcing is referred to as strategic sourcing, where the focus is capability sourcing, not procurement. Despite the importance of strategic outsourcing, to our knowledge there is little empirical research on the relationship between strategic outsourcing orientation and outsourcing performance. Moreover, there is little research on the factors that make strategic outsourcing effective. This study is designed to investigate the relationship between strategic IT outsourcing orientation and IT outsourcing performance, and the process through which strategic IT outsourcing orientation influences outsourcing performance. Based on the framework of strategic orientation-performance and core competence-based management, this study first identifies core competency orientation as a strategic orientation pertinent to IT outsourcing, and IT outsourcing process management capability as the mediator affecting IT outsourcing performance. The proposed research model is then tested with a sample of 200 firms. The findings of this study may contribute to the literature in two ways.
First, it draws on the strategic orientation-performance framework in developing its research model, so that it can provide a new perspective on a well-studied phenomenon. This perspective allows practitioners and researchers to look at outsourcing from an angle that emphasizes a firm's strategic decision to outsource its IT functions. Second, by separating the concepts of strategic orientation and outsourcing process management capability, this study provides practitioners with insight into how strategic orientation can work effectively to achieve an expected result. In addition, the current study provides a basis for future studies that examine the factors affecting IT outsourcing performance using more controllable factors, such as IT outsourcing process management capability, rather than external hard-to-control factors such as trust and relationship management. This study investigates the major factors that determine IT outsourcing success. Based on strategic orientation and core competency theories, we develop the proposed research model to investigate the relationship between core competency orientation and IT outsourcing performance, and the mediating role of IT outsourcing process management capability in IT outsourcing performance. The model consists of two independent variables (core competency orientation and IT outsourcing process management capability) and two dependent variables (outsourced task complexity and IT outsourcing performance). Comprehensive data collection was conducted through an outsourcing association. The survey data were analyzed using a structural analysis method. IT outsourcing process management capability was found to mediate the effect of core competency orientation on both outsourced task complexity and IT outsourcing performance. Further analysis and findings are discussed.

Diagnosis of Valve Internal Leakage for Ship Piping System using Acoustic Emission Signal-based Machine Learning Approach (선박용 밸브의 내부 누설 진단을 위한 음향방출신호의 머신러닝 기법 적용 연구)

  • Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.1 / pp.184-192 / 2022
  • Valve internal leakage is caused by damage to the internal parts of the valve and results in accidents and shutdowns of the piping system. This study investigated the possibility of a real-time leak detection method using the acoustic emission (AE) signal generated from the piping system during internal leakage of a butterfly valve. Datasets of raw time-domain AE signals were systematically collected and postprocessed for each operation mode of the valve in order to develop a data-driven model for the detection and classification of internal leakage by applying machine learning algorithms. The aim of this study was to determine whether leak detection can be treated as a classification problem by applying two classification algorithms: support vector machine (SVM) and convolutional neural network (CNN). The results showed different performances depending on the algorithms and datasets used. The SVM-based binary classification models, based on feature extraction from the data, achieved an overall accuracy of 83% to 90%, while for a multi-class classification model the accuracy dropped to 66%. By contrast, the CNN-based classification model achieved an accuracy of 99.85%, higher than that of any of the SVM-based models. The results revealed that the SVM classification model requires effective feature extraction from the AE signals to improve the accuracy of multi-class classification. Moreover, CNN-based classification can be a promising approach for detecting both leakage and valve opening, as long as the performance of the processor does not degrade.

Analysis of Precipitation Characteristics of Regional Climate Model for Climate Change Impacts on Water Resources (기후변화에 따른 수자원 영향 평가를 위한 Regional Climate Model 강수 계열의 특성 분석)

  • Kwon, Hyun-Han;Kim, Byung-Sik;Kim, Bo-Kyung
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.5B / pp.525-533 / 2008
  • Global circulation models (GCMs) have been used as inputs to hydrologic models to study the impact of climate change on water resources. Recently, regional climate models (RCMs) have been widely used for climate change studies, but they have rarely been used to assess climate change impacts on water resources in Korea. This study therefore uses a set of climate scenarios produced by the RegCM3 RCM ($27\,\mathrm{km} \times 27\,\mathrm{km}$), which is operated by the Korea Meteorological Administration. First, the RCM precipitation data surrounding major rainfall stations are extracted to validate the scenarios in terms of reproducing low-frequency behavior. A comprehensive comparison between the observations and the precipitation scenario is performed through statistical analysis, wavelet transform analysis, and EOF analysis. Overall, the analysis confirmed that the precipitation data produced by RegCM3 are capable of simulating hydrological low-frequency behavior and reproducing spatio-temporal patterns. However, the spatio-temporal patterns are slightly biased, and the amplitudes (variances) of the RCM precipitation tend to be lower than those of the observations. Therefore, a bias correction scheme to correct the systematic bias needs to be considered when RCMs are applied to water resources assessment under climate change.
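
The abstract calls for a bias correction scheme but does not specify one. One simple option, shown here purely as an assumption, is mean-and-variance scaling: shift and stretch the model series so that its mean and standard deviation over a reference period match the observations. The function name and the toy values are hypothetical.

```python
import statistics

def fit_mean_variance_scaling(observed, modelled):
    """Build a corrector that matches the model's mean and spread to observations.

    Both inputs cover the same reference period. Corrected precipitation is
    clipped at zero because negative rainfall is not physical.
    """
    obs_mean = statistics.mean(observed)
    mod_mean = statistics.mean(modelled)
    # Ratio > 1 inflates the variance, matching the reported underestimation.
    ratio = statistics.pstdev(observed) / statistics.pstdev(modelled)

    def correct(series):
        return [max(0.0, obs_mean + (x - mod_mean) * ratio) for x in series]

    return correct

# Hypothetical monthly precipitation (mm): observations vs. a low-variance model.
obs = [2.0, 4.0, 6.0, 8.0]
rcm = [1.0, 2.0, 3.0, 4.0]
correct = fit_mean_variance_scaling(obs, rcm)
corrected = correct(rcm)
```

In practice the corrector would be fitted on a historical period and then applied to the scenario period. Clipping at zero can slightly shift the corrected mean when many values are near zero.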

Label Assignment Schemes for MPLS Traffic Engineering (MPLS 트래픽 엔지니어링을 위한 레이블 할당 방법)

  • 이영석;이영석;옥도민;최양희;전병천
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.8A / pp.1169-1176 / 2000
  • In this paper, label assignment schemes that consider the IP flow model are proposed and evaluated for efficient MPLS traffic engineering. Based on the IP flow model, IP flows are classified into transient flows and base flows. Base flows, which last for a long time, transmit data at a high bit rate, and are composed of many packets, matter for MPLS traffic engineering because they usually cause network congestion. To make use of base flows for MPLS traffic engineering, we propose two base flow classifiers and label assignment schemes in which transient flows are allocated to the default LSPs and base flows to explicit LSPs. The proposed schemes are based on the traffic-driven label triggering method combined with a routing table. The first base flow classifier uses both flow size in packet counts and routing entries, and the other, which extends the dynamic X/Y flow classifier, is based on a cut-through ratio. The proposed schemes are shown to minimize the number of labels without degrading the total cut-through ratio.
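
A minimal sketch of traffic-driven flow classification follows, assuming the basic static X/Y rule: a flow becomes a base flow once X packets arrive within Y seconds. It does not reproduce the paper's dynamic extension, its use of routing entries, or the cut-through ratio. The class name, the method name, and the returned strings are hypothetical.

```python
from collections import defaultdict, deque

class XYFlowClassifier:
    """Promote a flow to an explicit LSP after X packets within Y seconds."""

    def __init__(self, x_packets, y_seconds):
        self.x = x_packets
        self.y = y_seconds
        self.recent = defaultdict(deque)  # flow id -> recent packet timestamps
        self.base_flows = set()           # flows already promoted

    def observe(self, flow_id, timestamp):
        """Record one packet and return which kind of LSP should carry it."""
        if flow_id in self.base_flows:
            return "explicit"
        window = self.recent[flow_id]
        window.append(timestamp)
        # Drop timestamps that fell out of the Y-second window.
        while window and timestamp - window[0] > self.y:
            window.popleft()
        if len(window) >= self.x:
            self.base_flows.add(flow_id)
            del self.recent[flow_id]
            return "explicit"
        return "default"

clf = XYFlowClassifier(x_packets=3, y_seconds=1.0)
burst = [clf.observe("flow-a", t) for t in (0.0, 0.1, 0.2, 0.3)]
sparse = [clf.observe("flow-b", t) for t in (0.0, 2.0, 4.0)]
```

Here the bursty flow is promoted on its third packet, while the sparse flow never gathers three packets inside one window and stays on the default LSP, so no label is spent on it.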


Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model (텍스트 마이닝 기반의 그래프 모델을 이용한 미발견 공공 지식 추론)

  • Heo, Go Eun;Song, Min
    • Journal of the Korean Society for Information Management / v.31 no.1 / pp.231-250 / 2014
  • Due to the recent development of Information and Communication Technologies (ICT), the number of research publications has increased exponentially. In response to this rapid growth, demand has risen for automated text processing methods that can deal with massive amounts of text data. Biomedical text mining, which discovers hidden biological meanings and treatments from biomedical literature, has become a pivotal methodology that helps medical disciplines reduce time and cost. Many researchers have conducted literature-based discovery studies to generate new hypotheses. However, existing approaches require either an intensive manual process or a semi-automatic procedure to find and select biomedical entities. In addition, they are limited to showing a single dimension, namely the cause-and-effect relationship between two concepts. Thus, this study proposes a novel approach that discovers various relationships among source and target concepts and their intermediate concepts by expanding the intermediate concepts to multiple levels. This study provides distinct perspectives for literature-based discovery, not only by discovering meaningful relationships among concepts in biomedical literature through graph-based path inference, but also by generating feasible new hypotheses.
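
The idea of expanding intermediate concepts to multiple levels can be illustrated with a small path search over a concept graph. This is a sketch under stated assumptions, not the authors' method: the graph is a plain adjacency dictionary, the concept names are placeholders, and a direct source-target edge is skipped on the assumption that an already linked pair is not an undiscovered relationship.

```python
def find_paths(graph, source, target, max_intermediates):
    """List simple paths source -> ... -> target with 1..max_intermediates hops between."""
    paths = []

    def extend(path):
        node = path[-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor in path:
                continue  # keep paths simple (no repeated concepts)
            if neighbor == target:
                if len(path) >= 2:  # require at least one intermediate concept
                    paths.append(path + [neighbor])
            elif len(path) - 1 < max_intermediates:
                extend(path + [neighbor])

    extend([source])
    return paths

# Placeholder co-occurrence graph; the edges are invented for illustration only.
graph = {
    "drug_A": {"gene_B", "protein_D"},
    "gene_B": {"disease_C"},
    "protein_D": {"gene_E"},
    "gene_E": {"disease_C"},
}
one_level = find_paths(graph, "drug_A", "disease_C", max_intermediates=1)
two_levels = find_paths(graph, "drug_A", "disease_C", max_intermediates=2)
```

Raising `max_intermediates` from 1 to 2 surfaces the longer chain through `protein_D` and `gene_E`, which a single-level A-B-C search would miss.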

Reader Emulation System for Accessing Sensor Device Through EPCglobal Reader Protocol (EPCglobal 리더 프로토콜을 통한 센서장치 접근을 위한 리더 에뮬레이션 시스템)

  • Choi, Seung-Hyuk;Kim, Tae-Yong;Kwon, Oh-Heum;Song, Ha-Joo
    • Journal of KIISE: Computing Practices and Letters / v.16 no.8 / pp.842-852 / 2010
  • RFID applications use tags to identify objects, but recent applications tend to include diverse sensor devices such as light, temperature, and humidity sensors as well. RFID tag information is usually processed via the event-driven model, whereas sensor devices are usually accessed via the function call model. Application developers therefore have to deal with mixed data access models and device-dependent interface functions. In this paper, we propose a sensor reader emulator (SRE) that provides a consistent access interface to sensor devices regardless of device type. SRE provides a more efficient way of developing RFID applications by giving the application programmer a single view of RFID tags and sensor devices. In applications where tags are fixed in place, SRE can replace expensive sensor tags and sensor readers with inexpensive sensor devices, reducing the total cost while providing the same functionality.
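
The gap between the two access models can be bridged by an adapter that polls a sensor through its function call interface and republishes the readings as events. The sketch below shows that pattern only. It does not implement EPCglobal Reader Protocol messages, and the class, its methods, the report-on-change rule, and the event dictionary layout are all assumptions.

```python
class SensorReaderEmulator:
    """Wrap a function-call sensor so that subscribers receive events instead."""

    def __init__(self, sensor_id, read_fn, min_change):
        self.sensor_id = sensor_id
        self._read = read_fn          # device-specific call, e.g. a driver function
        self._min_change = min_change
        self._listeners = []
        self._last = None

    def subscribe(self, listener):
        """Register a callback that takes one event dictionary."""
        self._listeners.append(listener)

    def poll(self):
        """Read the sensor once; emit an event if the value moved enough."""
        value = self._read()
        if self._last is not None and abs(value - self._last) < self._min_change:
            return False
        self._last = value
        event = {"source": self.sensor_id, "value": value}
        for listener in self._listeners:
            listener(event)
        return True

# Hypothetical temperature sensor that returns scripted readings.
readings = iter([20.0, 20.1, 21.0])
received = []
sre = SensorReaderEmulator("temp-1", lambda: next(readings), min_change=0.5)
sre.subscribe(received.append)
emitted = [sre.poll() for _ in range(3)]
```

The application code only registers a callback, just as it would for tag events, and never calls the device-specific read function directly.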

A Reexamination on the Influence of Fine-particle between Districts in Seoul from the Perspective of Information Theory (정보이론 관점에서 본 서울시 지역구간의 미세먼지 영향력 재조명)

  • Lee, Jaekoo;Lee, Taehoon;Yoon, Sungroh
    • KIISE Transactions on Computing Practices / v.21 no.2 / pp.109-114 / 2015
  • This paper presents a computational model of the transfer of airborne fine particles, used to analyze the similarities and influences among the 25 districts of Seoul by quantifying time series data collected from each district. The properties of each district are derived from a model of the time series of its fine particle concentrations, and the edge weights are calculated from the transfer entropies between all pairs of districts. We applied a modularity-based graph clustering technique to detect communities among the 25 districts. The results indicate that the discovered clusters correspond to groups with high transfer entropy among communities that are geographically adjacent or have high traffic volumes between them. We believe that this approach can be extended to discovering significant flows of other indicators that cause environmental pollution.
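
For readers unfamiliar with transfer entropy, the sketch below estimates it for a pair of discretized series using a plug-in estimator with a history length of one step. The abstract does not state the authors' discretization or history length, so both are assumptions here, along with the median-threshold binarization and the function names.

```python
import math
import random
from collections import Counter

def discretize(series):
    """Map each value to 1 if it exceeds the (upper) median, else 0."""
    ordered = sorted(series)
    median = ordered[len(ordered) // 2]
    return [1 if v > median else 0 for v in series]

def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    next_prev = Counter(zip(target[1:], target[:-1]))   # (y_next, y_prev)
    prev_src = Counter(zip(target[:-1], source[:-1]))   # (y_prev, x_prev)
    prev = Counter(target[:-1])                         # y_prev
    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        p_given_both = count / prev_src[(y_prev, x_prev)]
        p_given_self = next_prev[(y_next, y_prev)] / prev[y_prev]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te

# Toy check: y copies x with a one-step delay, so information flows x -> y.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(500)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

Computing `transfer_entropy(a, b)` for every ordered pair of districts would give the directed edge weights of the graph that is then clustered.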

Study on Concurrent Simulation Technique of Matlab CMDPS and A CarSim Base Full Car Model (매트랩 CMDPS와 카심 기반 완전차량모델의 동시시뮬레이션 기술에 관한 연구)

  • Jang, Bongchoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.4 / pp.1555-1560 / 2013
  • Column-type Motor Driven Power Steering (CMDPS) systems are widely fitted to passenger vehicles to improve vehicle safety and fuel economy. To analyze the system and develop a controller, a full vehicle model from CarSim, developed by Mechanical Simulation Corporation, was simulated concurrently with an MDPS control algorithm from Matlab Simulink. This paper describes the development of the concurrent simulation technique in detail, because the specific method for analyzing a Matlab Simulink MDPS control system together with a dynamic vehicle model has not previously been published in detail. The steering wheel angle input was evaluated and agreed well with proving-ground experimental data. The comparisons show that concurrent simulation is an effective way to develop and validate the control algorithm. This concurrent simulation capability can be used for CMDPS performance evaluation and logic tuning as well as for vehicle handling performance.