• Title/Summary/Keyword: Data quality metrics

Comparing Fault Prediction Models Using Change Request Data for a Telecommunication System

  • Park, Young-Sik;Yoon, Byeong-Nam;Lim, Jae-Hak
    • ETRI Journal / v.21 no.3 / pp.6-15 / 1999
  • Many studies in software reliability have attempted to develop a model for predicting the faults of a software module, because applying a good prediction model enables optimal resource allocation during the development period. In this paper, we consider change request data, collected from the field test of software modules, that incorporate a functional relation between the faults and some software metrics. To this end, we discuss the general aspects of the regression method, the problem of multicollinearity, and the measures of model evaluation. We consider four possible regression models, including two stepwise regression models and two nonlinear models. The four developed models are evaluated with respect to their predictive quality.

Optimizing Performance of Wind Turbines

  • Kusiak, Andrew
    • 한국신재생에너지학회:학술대회논문집 / 2009.06a / pp.467-470 / 2009
  • Frequent failures of gears, bearings, and other components are attributed to variable loads along the drive-train. Wind parameters cannot be controlled, and therefore any turbine load-reducing remedies must be established based on proper insights into the wind-turbine interactions. A novel control concept for performance optimization of wind turbines is presented. The proposed concept is based on analysis of the turbine status reflected in the SCADA data. Modern computational techniques are used to optimize the performance of a wind turbine from three basic perspectives: drive-train, power output, and power quality. The proposed approach demonstrates that the metrics representing the three perspectives and the corresponding control goals can be significantly improved for any wind turbine. The solution is applicable to different turbine types operating in different wind regimes, e.g., winds of different speeds and variability. Simple and transparent parameters allow an operator to determine a balance between the operations and maintenance, technical, and business objectives. The proposed modeling framework was embedded in software. The software tool has been tested on data collected from 1.5 MW wind turbines.

Image-to-Image Translation with GAN for Synthetic Data Augmentation in Plant Disease Datasets

  • Nazki, Haseeb;Lee, Jaehwan;Yoon, Sook;Park, Dong Sun
    • Smart Media Journal / v.8 no.2 / pp.46-57 / 2019
  • In recent research, deep learning-based methods have achieved state-of-the-art performance in various computer vision tasks. However, these methods are commonly supervised and require huge amounts of annotated data to train. Acquiring data demands additional costly effort, particularly for tasks where large amounts of data are hard to obtain given time constraints and the need for expert human labor. In this paper, we present a data-level synthetic sampling solution for learning from small and imbalanced datasets using Generative Adversarial Networks (GANs). GANs are used because small datasets and fluctuating numbers of samples per class pose challenges in various fields. As a result, we present an approach that can improve learning with respect to data distributions, reducing the bias introduced by class imbalance and hence shifting the classification decision boundary towards more accurate results. Our novel method is demonstrated on a small dataset of 2789 tomato plant disease images that is heavily affected by class imbalance across 9 disease categories. Moreover, we evaluate our results in terms of different metrics and compare the quality of these results for distinct classes.

Institutional Applications of Eclipse Scripting Programming Interface to Clinical Workflows in Radiation Oncology

  • Kim, Hojin;Kwak, Jungwon;Jeong, Chiyoung;Cho, Byungchul
    • Progress in Medical Physics / v.28 no.3 / pp.122-128 / 2017
  • The Eclipse Scripting Application Programming Interface (ESAPI) was devised to enhance the efficiency of treatment-related workflows such as contouring, treatment planning, plan quality measurement, and data mining by communicating with the treatment planning system (TPS). It is provided in the form of a C#-based programming toolbox, which can be modified to fit clinical applications. The Scripting program, however, does not offer all the functionality that users may intend to develop. These shortcomings can be overcome by combining Scripting programming with a user-executable program on Windows or Linux. The executed program has greater freedom in implementation, which can strengthen the capability and availability of Scripting in clinical applications. This work shows the use of Scripting programming through simple modification of the given toolbox. In addition, it presents an implementation that combines Scripting with user-executed programming based on MATLAB, applied to automated dynamic MLC wedge and FIF treatment planning procedures to improve planning efficiency.

Jaccard Index Reflecting Time-Context for User-based Collaborative Filtering

  • Soojung Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.163-170 / 2023
  • The user-based collaborative filtering technique, one of the implementation methods of recommendation systems, recommends the preferred items of neighboring users, where neighbors are users identified as having similar rating histories. However, it fundamentally has a data scarcity problem in which the quality of recommendations is significantly reduced when there is little common rating history. To solve this problem, many existing studies have proposed various methods of combining the Jaccard index with a similarity measure. In this study, we introduce a time-aware concept to the Jaccard index and propose a method of weighting common items differently depending on the rating time. Experiments using various performance metrics and time intervals confirm that the proposed method performs better than the original Jaccard index on most metrics, and that the optimal time interval differs depending on the type of performance metric.
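
The abstract does not give the weighting formula, so the following is only a minimal sketch of one way a time-aware Jaccard index could look. It assumes each user's history maps an item to the timestamp of its rating, and that a common item counts fully when both users rated it within `interval` of each other and only partially (`far_weight`) otherwise. The names `time_weighted_jaccard`, `interval`, and `far_weight` are hypothetical and not taken from the paper.

```python
def jaccard(items_a, items_b):
    """Plain Jaccard index: |A ∩ B| / |A ∪ B|."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def time_weighted_jaccard(times_a, times_b, interval, far_weight=0.5):
    """Illustrative time-aware variant (an assumption, not the paper's formula).

    times_a, times_b: dicts mapping item id -> rating timestamp.
    A common item contributes 1.0 if the two rating times differ by at most
    `interval`, and `far_weight` otherwise.
    """
    common = times_a.keys() & times_b.keys()
    union = times_a.keys() | times_b.keys()
    if not union:
        return 0.0
    weighted = sum(
        1.0 if abs(times_a[i] - times_b[i]) <= interval else far_weight
        for i in common
    )
    return weighted / len(union)


# Toy data: two users share items i1 and i2 out of four distinct items.
user_a = {"i1": 10, "i2": 50, "i3": 90}
user_b = {"i1": 12, "i2": 200, "i4": 5}

print(jaccard(user_a, user_b))                    # 0.5
print(time_weighted_jaccard(user_a, user_b, 30))  # 0.375
```

With these toy values, item i2 was rated far apart in time, so it counts only half and the similarity drops from 0.5 to 0.375.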

Improvement of GOCI-II Ground System for Monitoring of Level-1 Data Quality (천리안 해양위성 2호 Level-1 영상의 품질관리를 위한 지상국 시스템 개선)

  • Sun-Ju Lee;Kum-Hui Oh;Gm-Sil Kang;Woo-Chang Choi;Jong-Kuk Choi;Jae-Hyun Ahn
    • Korean Journal of Remote Sensing / v.39 no.6_2 / pp.1529-1539 / 2023
  • The data from the Geostationary Ocean Color Imager-II (GOCI-II), which observes the color of the sea to monitor marine environments, undergo various correction processes in the ground station system, producing data from Raw to Level-2 (L2). Quality issues arising at each processing stage accumulate step by step, leading to an amplification of errors in the satellite data. To address this, improvements were made to the GOCI-II ground station system to measure potential optical quality and geolocation accuracy errors in the Level-1A/B (L1A/B) data. A newly established Radiometric and Geometric Performance Assessment Module (RGPAM) now measures five optical quality factors and four geolocation accuracy factors in near real-time. Testing with GOCI-II data has shown that RGPAM's functions, including data processing and the display and download of measurement results, work well. The performance metrics obtained through RGPAM are expected to serve as foundational data for real-time radiometric correction model enhancements, assessment of L1 data quality consistency, and the development of reprocessing strategies to address identified issues related to the GOCI-II detector's sensitivity degradation.

A Study on Logistics Performance Index and Supply Chain Tracking Data during the Covid-19 Pandemic (Covid-19 팬데믹시기 물류성과지수와 공급망 추적 데이터에 대한 고찰)

  • Ahn, TaeKun
    • Journal of Korea Port Economic Association / v.39 no.3 / pp.191-210 / 2023
  • The Covid-19 pandemic has had a significant impact on global logistics and supply chains, leading to major discrepancies in logistics performance across countries. Through an examination of the logistics performance index and supply chain tracking data, this study aimed to identify the changes in global supply chains and logistics environments during the pandemic. The analysis of the logistics performance index showed that countries around the world, especially developed nations, improved overall in metrics such as customs and border management efficiency, the quality of trade and transport infrastructure, the capability and quality of logistics services, and cargo tracking ability. However, the feasibility of competitively priced international transportation and the frequency of on-time delivery declined due to the pandemic's effects. The supply chain tracking data revealed that ports in Asian countries demonstrated high processing efficiency. In contrast, the U.S. and European countries took comparatively more time. Particularly for air cargo, parcels, and express shipments, the U.S. showed relatively longer processing times, leading to logistical delays. In conclusion, during the Covid-19 pandemic, Asian countries maintained relatively high efficiency in their logistics and trade environments. Conversely, the U.S. and some European countries showed delays and decreased efficiency in various metrics. In the future, efforts should be made to address delays and congestion, that is, the slowdown of logistics processes.

Software Reliability Prediction of Grouped Failure Data Using Variant Models of Cascade-Correlation Learning Algorithm (변형된 캐스케이드-상관 학습 알고리즘을 적용한 그룹 고장 데이터의 소프트웨어 신뢰도 예측)

  • Lee, Sang-Un;Park, Jung-Yang
    • The KIPS Transactions:PartD / v.8D no.4 / pp.387-392 / 2001
  • Many software projects collect grouped failure data (failures in some failure interval or in a variable time interval) rather than individual failure times or failure count data during the testing or operational phase. This paper presents neural network (NN) modeling for grouped failure data that is able to predict cumulative failures at a variable future time. Two variant models of the cascade-correlation learning (CasCor) algorithm are presented. The suggested models are compared with other well-known NN models and statistical software reliability growth models (SRGMs). Experimental results show that the suggested models achieve better predictability.

Comparative analysis of wavelet transform and machine learning approaches for noise reduction in water level data (웨이블릿 변환과 기계 학습 접근법을 이용한 수위 데이터의 노이즈 제거 비교 분석)

  • Hwang, Yukwan;Lim, Kyoung Jae;Kim, Jonggun;Shin, Minhwan;Park, Youn Shik;Shin, Yongchul;Ji, Bongjun
    • Journal of Korea Water Resources Association / v.57 no.3 / pp.209-223 / 2024
  • In the context of the fourth industrial revolution, data-driven decision-making has increasingly become pivotal. However, the integrity of data analysis is compromised if data quality is not adequately ensured, potentially leading to biased interpretations. This is particularly critical for water level data, essential for water resource management, which often encounters quality issues such as missing values, spikes, and noise. This study addresses the challenge of noise-induced data quality deterioration, which complicates trend analysis and may produce anomalous outliers. To mitigate this issue, we propose a noise removal strategy employing Wavelet Transform, a technique renowned for its efficacy in signal processing and noise elimination. The advantage of Wavelet Transform lies in its operational efficiency: it reduces both time and costs because it obviates the need for acquiring the true values of collected data. This study conducted a comparative performance evaluation between our Wavelet Transform-based approach and the Denoising Autoencoder, a prominent machine learning method for noise reduction. The findings demonstrate that the Coiflets wavelet function outperforms the Denoising Autoencoder across various metrics, including Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Mean Squared Error (MSE). The superiority of the Coiflets function suggests that selecting an appropriate wavelet function tailored to the specific application environment can effectively address data quality issues caused by noise. This study underscores the potential of Wavelet Transform as a robust tool for enhancing the quality of water level data, thereby contributing to the reliability of water resource management decisions.
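
For reference, the three error metrics named in the abstract can be sketched with their standard textbook definitions. The sample values below are toy numbers chosen for illustration, not data from the study, and the function names are hypothetical.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Assumes no true value is zero, since each error is divided by it.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy water-level readings (illustrative only): reference vs. denoised output.
reference = [2.0, 4.0, 8.0]
denoised = [1.0, 4.0, 10.0]

print(mae(reference, denoised))   # 1.0
print(mape(reference, denoised))  # 25.0
print(mse(reference, denoised))   # about 1.667
```

Lower values are better for all three. MAPE is scale-free but undefined when a reference value is zero, which is worth checking before applying it to level data that can reach zero.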

Realtime No-Reference Quality-Assessment Over Packet Video Networks (패킷 비디오 네트워크상의 실시간 무기준법 동영상 화질 평가방법)

  • Sung, Duk-Gu;Kim, Yo-Han;Hana, Jung-Hyun;Shin, Ji-Tae
    • Journal of Broadcast Engineering / v.14 no.4 / pp.387-396 / 2009
  • No-reference video-quality assessment metrics are divided into two kinds: those based on the decoded pixel domain and those based on the bitstream. Traditional full-/reduced-reference methods are difficult to deploy in realtime video transmission because of problems with additional data, complexity, and assessment accuracy. This paper presents a simple and highly accurate no-reference video-quality assessment method for realtime video transmission. Our proposed method uses the quantization parameter, motion vectors, and transmission error information. To evaluate the performance of the proposed algorithm, we perform a subjective test of video quality with the ITU-T P.910 Absolute Category Rating (ACR) method and compare our proposed algorithm with the subjective quality assessment. Experimental results show that the proposed quality metric has a high correlation (85%) with the subjective quality assessment.
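
The abstract does not say which correlation coefficient was used. As a sketch, assuming the Pearson coefficient, agreement between an objective metric and subjective ACR scores could be computed as follows. The score lists are toy values for illustration only, and `pearson` is a hypothetical helper name.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy example: objective metric output vs. mean opinion scores on the
# 5-point ACR scale for five hypothetical clips.
metric_scores = [1.2, 2.1, 2.9, 4.2, 4.8]
subjective_mos = [1.0, 2.5, 3.0, 4.0, 4.5]

print(round(pearson(metric_scores, subjective_mos), 3))
```

A value near 1 means the metric ranks and spaces the clips much as human viewers do.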