• Title/Abstract/Keywords: Data quality analysis

Search results: 8,987 items (processing time: 0.037 s)

Saliency Score-Based Visualization for Data Quality Evaluation

  • Kim, Yong Ki;Lee, Keon Myung
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 15, No. 4 / pp.289-294 / 2015
  • Data analysts explore collections of data to search for valuable information using various techniques and tricks. "Garbage in, garbage out" is a well-recognized idiom that emphasizes the importance of data quality in data analysis. It is therefore crucial to validate data quality in the early stage of data analysis, and an effective method of evaluating the quality of data is hence required. In this paper, a method to visually characterize the quality of data using the notion of a saliency score is introduced. The saliency score is a measure comprising five indexes that capture certain aspects of data quality. Some experimental results are presented to show the applicability of the proposed method.

An Analysis Method of Superlarge Manufacturing Process Data Using Data Cleaning and Graphical Analysis

  • 박재홍;변재현
    • 품질경영학회지 / Vol. 30, No. 2 / pp.72-85 / 2002
  • Advances in computer and sensor technology have made it possible to obtain superlarge manufacturing process data in real time, letting us extract meaningful information from these superlarge data sets. We propose a systematic data analysis procedure that field engineers can easily apply to manufacture quality products. The procedure consists of a data cleaning stage and a data analysis stage. The data cleaning stage constructs a database suitable for statistical analysis from the original superlarge manufacturing process data. In the data analysis stage, we suggest an easy-to-implement graphical approach to extract practical information from the cleaned database. This study will help manufacturing companies achieve six sigma quality.

IMPROVING SOCIAL MEDIA DATA QUALITY FOR EFFECTIVE ANALYTICS: AN EMPIRICAL INVESTIGATION BASED ON E-BDMS

  • B. KARTHICK;T. MEYYAPPAN
    • Journal of applied mathematics & informatics / Vol. 41, No. 5 / pp.1129-1143 / 2023
  • Social media platforms have become an integral part of our daily lives, and they generate vast amounts of data that can be analyzed for various purposes. However, the quality of the data obtained from social media is often questionable due to factors such as noise, bias, and incompleteness. Enhancing data quality is crucial to ensure the reliability and validity of the results obtained from such data. This paper proposes an enhanced decision-making framework based on Business Decision Management Systems (BDMS) that addresses these challenges by incorporating a data quality enhancement component. The framework includes a backtracking method to improve plan failures and risk-taking abilities and a steep optimized strategy to enhance training plan and resource management, all of which contribute to improving the quality of the data. We examine the efficacy of the proposed framework through research data, which provides evidence of its ability to increase the level of effectiveness and performance by enhancing data quality. Additionally, we demonstrate the reliability of the proposed framework through simulation analysis, which includes true positive analysis, performance analysis, error analysis, and accuracy analysis. This research contributes to the field of business intelligence by providing a framework that addresses critical data quality challenges faced by organizations in decision-making environments.

A Study on the Big Data Analysis and Predictive Models for Quality Issues in Defense C5ISR

  • 허형조;고수진;백승현
    • 품질경영학회지 / Vol. 51, No. 4 / pp.551-571 / 2023
  • Purpose: The purpose of this study is to propose useful suggestions by analyzing the cause-and-effect relationship between the quality failure rate and the process variables in the C5ISR domain of the defense industry. Methods: The data collected through the in-house systems were analyzed using big data analysis. Data analysis between quality data and A/S history data was conducted using the CRISP-DM (Cross-Industry Standard Process for Data Mining) analysis process. Results: The results of this study are as follows: After evaluating the performance of candidate models for the influence of inspection data and A/S history data, logistic regression was selected as the final model because it performed relatively well compared to the decision tree with an accuracy of 82%/67% and an AUC of 0.66/0.57. Based on this model, we estimated the coefficients using 'R', a data analysis tool, and found that a specific variable (continuous maximum discharge current time) had a statistically significant effect on the A/S quality failure rate, and the analysis indicated that 82% of the failure rate could be predicted. Conclusion: As the first case of applying big data analysis to quality issues in the defense industry, this study confirms that it is possible to improve the market failure rates of defense products by focusing on the measured values of the main causes of failures derived through the big data analysis process, and identifies improvements, such as the number of data samples and data collection limitations, to be addressed in subsequent studies for a more reliable analysis model.
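The abstract above compares candidate models by accuracy and AUC. As a rough illustration of how these two metrics are computed, here is a minimal Python sketch. The labels, scores, and the 0.5 threshold are hypothetical and are not taken from the study, which reports using R.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases where the thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = failure reported, 0 = no failure,
# with made-up predicted failure probabilities from some model.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC      = {auc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, while AUC depends only on how the cases are ranked, which is one reason the two figures can diverge for the same model.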

Study on Correlation-based Feature Selection in an Automatic Quality Inspection System using Support Vector Machine (SVM)

  • 송동환;오영광;김남훈
    • 대한산업공학회지 / Vol. 42, No. 6 / pp.370-376 / 2016
  • Manufacturing data analysis and its applications are gaining great popularity in various industries. Despite the fast advancement of big data analysis technology, however, the manufacturing quality data monitored by an automated inspection system is sometimes not reliable enough because of the complex patterns of product quality. In this study, we therefore aim to define the level of trust of an automated quality inspection system and to improve the reliability of the quality inspection data. Using correlation analysis and feature selection, this paper presents a method of improving the inspection accuracy and efficiency of an SVM-based automatic product quality inspection system using thermal image data in an auto part manufacturing case. The proposed method is implemented in the sealer dispensing process of automobile manufacturing and verified by analyzing the optimal feature selection from the quality analysis results.
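To make the idea of correlation-based feature selection concrete, the following Python sketch ranks features by the absolute Pearson correlation with a pass/fail label and keeps those above a threshold. The feature names, values, and the 0.5 cutoff are hypothetical; the paper's actual features and selection criterion are not given in the abstract, and the SVM training step is omitted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)


def select_features(columns, labels, min_abs_corr=0.5):
    """Return names of features whose |r| with the label meets the
    threshold, strongest first."""
    scored = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    kept = [name for name, r in scored.items() if r >= min_abs_corr]
    return sorted(kept, key=lambda name: scored[name], reverse=True)


# Hypothetical features extracted from thermal images (made-up values).
columns = {
    "mean_temp": [30.0, 32.0, 45.0, 47.0, 31.0, 46.0],
    "bead_width": [5.0, 5.1, 4.9, 5.0, 5.1, 4.9],
    "ambient_noise": [1.0, 2.0, 1.0, 2.0, 3.0, 3.0],
}
labels = [0, 0, 1, 1, 0, 1]  # hypothetical: 1 = defective, 0 = good

selected = select_features(columns, labels)
print(selected)
```

The selected columns would then be passed to a classifier such as an SVM in place of the full feature set.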

Pattern Analysis of Nonconforming Farmers in Residual Pesticides using Exploratory Data Analysis and Association Rule Analysis

  • 김상웅;박은수;조현정;홍성희;손병철;홍지화
    • 품질경영학회지 / Vol. 49, No. 1 / pp.81-95 / 2021
  • Purpose: The purpose of this study was to analyze the patterns of nonconforming farmers, who are one of the factors of nonconformity in residual pesticides. Methods: The patterns of nonconforming farmers were analyzed by combining safety data with data from the farmers' DB. Exploratory data analysis and association rule analysis were used to extract factors related to nonconformity. Results: The results of this study are as follows. The exploratory data analysis identified a total of 9 farmer-related factors influencing nonconformity in residual pesticides: sampling time, gender, age, cultivation region, farming career, agricultural start form, type of agriculture, cultivation area, and classification of agricultural products. The association rule analysis found nonconformity association rules over the past three years. The pattern of nonconforming farmers differed depending on the cultivation period. Conclusion: Exploratory data analysis and association rule analysis will be useful tools for establishing a more efficient and economical safety management plan for agricultural products.
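Association rule analysis, as mentioned in the abstract above, scores rules of the form "attribute → nonconforming" by support, confidence, and lift. The Python sketch below shows these three measures for single-antecedent rules. The records, attribute names, and thresholds are hypothetical and do not come from the study.

```python
def support(transactions, itemset):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rules_for(transactions, consequent, min_support=0.3, min_confidence=0.6):
    """Single-antecedent rules 'item -> consequent' that pass both thresholds.

    Each rule is returned as (item, consequent, support, confidence, lift).
    """
    s_cons = support(transactions, {consequent})
    if s_cons == 0:
        return []  # the consequent never occurs, so no rule can be scored
    items = set().union(*transactions) - {consequent}
    found = []
    for item in sorted(items):
        s_ante = support(transactions, {item})
        s_both = support(transactions, {item, consequent})
        confidence = s_both / s_ante  # s_ante > 0: item occurs at least once
        lift = confidence / s_cons
        if s_both >= min_support and confidence >= min_confidence:
            found.append((item, consequent, s_both, confidence, lift))
    return found


# Hypothetical inspection records, one set of attribute values per farmer.
transactions = [
    {"season=summer", "crop=leafy", "nonconforming"},
    {"season=summer", "crop=leafy", "nonconforming"},
    {"season=summer", "crop=fruit"},
    {"season=winter", "crop=leafy", "nonconforming"},
    {"season=winter", "crop=fruit"},
]

rules = rules_for(transactions, "nonconforming")
for item, cons, s, c, l in rules:
    print(f"{item} -> {cons}: support={s:.2f}, confidence={c:.2f}, lift={l:.2f}")
```

A lift above 1 means the attribute occurs together with nonconformity more often than would be expected if the two were independent.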

Evaluation of Water Quality Using Multivariate Statistic Analysis with Optimal Scaling

  • Kim, Sang-Soo;Jin, Hyun-Guk;Park, Jong-Soo;Cho, Jang-Sik
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 2 / pp.349-357 / 2005
  • Principal component analysis (PCA) was carried out to evaluate water quality using monitoring data collected from 1997 to 2003 along the coastal area of Ulsan, Korea. To enhance the evaluation and to complement the descriptive power of traditional PCA, optimal scaling was applied to transform the original data into optimally scaled data. Cluster analysis was also applied to classify the monitoring stations according to their water quality characteristics.


Trend of Research and Industry-Related Analysis in Data Quality Using Time Series Network Analysis

  • 장경애;이광석;김우제
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 5, No. 6 / pp.295-306 / 2016
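  • This study analyzes research trends using the metadata of prior studies on data quality, with the aim of predicting trends in industry. Attempts to analyze research trends have been made in many fields, but the data quality domain is so broad that analyzing the prior literature has been difficult. This study collected research metadata from the last 10 years indexed in the Web of Science database and performed a time series network analysis using text mining and social network analysis techniques. The analysis of research topics showed that the share of research was decreasing in mathematical and computational biology, chemistry, health care sciences and services, biochemistry and molecular biology, operations research and management science, and medical informatics, and increasing in environment, water resources, geology, and instruments and instrumentation. The social network analysis showed that analysis, algorithm, and network were important topics with high centrality in data quality research, while image, model, sensor, and optimization were emerging as important topics in data quality. The analysis of the relationship between data quality and industry showed that technology, industry, health, utilities, and customer service were the most closely related industries. The results of this study are expected to be useful not only to data quality researchers who analyze patterns in data quality research and look for its relationship with industry, but also to industry.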
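As a rough illustration of the centrality idea mentioned in the abstract above, the Python sketch below builds a keyword co-occurrence network and computes degree centrality. Degree centrality is an assumption made here for simplicity, since the abstract does not say which centrality measure was used, and the keyword lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(keyword_lists):
    """Link two keywords whenever they appear in the same paper."""
    neighbors = defaultdict(set)
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def degree_centrality(neighbors):
    """Share of the other nodes that each node is directly linked to."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(linked) / (n - 1) for node, linked in neighbors.items()}


# Hypothetical keyword lists, one per paper.
papers = [
    ["analysis", "algorithm", "network"],
    ["analysis", "sensor"],
    ["analysis", "image", "model"],
    ["algorithm", "optimization"],
]

centrality = degree_centrality(cooccurrence_graph(papers))
for keyword, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {score:.2f}")
```

Repeating the computation on the papers of each publication year would give one centrality table per year, which is one simple way to look at how topics gain or lose importance over time.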

Analysis of IT Service Quality Elements Using Text Sentiment Analysis

  • 김홍삼;김종수
    • 산업경영시스템학회지 / Vol. 43, No. 4 / pp.33-40 / 2020
  • In order to satisfy customers, it is important to identify the quality elements that affect customers' satisfaction. The Kano model has been widely used to identify multi-dimensional quality attributes for this purpose. However, the model suffers from various shortcomings and limitations, especially those related to survey practices such as the amount of data, reply attitude, and cost. In this research, a model based on text sentiment analysis is proposed, which aims to substitute the survey-based data gathering process of Kano models with sentiment analysis. In this model, quality elements for the research are extracted from the set of opinion texts using morpheme analysis. The opinions' polarity attributes are evaluated using text sentiment analysis, and those polarity text items are transformed into equivalent Kano survey questions. Replies to the transformed survey questions are generated based on the total score of the original data. Then, the question-reply set is analyzed using both the original Kano evaluation method and the satisfaction index method. The proposed research model has been tested using a large amount of data from public IT service project evaluations. The result shows that it can replace the existing practice and that it promises advantages in terms of the quality and cost of data gathering. The authors hope that the proposed model may serve as a new quality analysis model for a wide range of areas.

A study on Convergent & Adaptive Quality Analysis using DQnA model

  • 김용원
    • 한국융합학회논문지 / Vol. 5, No. 4 / pp.21-25 / 2014
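  • Most companies today use data analysis techniques through information systems built on information technology. Such data analysis pays attention to evaluating the quality of the data that influence a company's various decisions, because data quality evaluation plays an important role not only in a company's effective operation but also in many other areas. This study describes the data quality evaluation models currently being studied and, on that basis, describes the DQnA model, a convergent and adaptive model used for data quality analysis, and discusses a quality analysis method that uses it.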