• Title/Summary/Keyword: data quality

Estimation of Pollutant Load Using Genetic-algorithm and Regression Model (유전자 알고리즘과 회귀식을 이용한 오염부하량의 예측)

  • Park, Youn Shik
    • Korean Journal of Environmental Agriculture / v.33 no.1 / pp.37-43 / 2014
  • BACKGROUND: Water quality data are collected less frequently than flow data because of the cost of collection and analysis, yet water quality data corresponding to flow data are required to compute pollutant loads or to calibrate other hydrologic models. Regression models can be used to interpolate water quality data corresponding to flow data. METHODS AND RESULTS: A regression model capable of accounting for flow and time variance was proposed, and its coefficients were calibrated against measured water quality data using a genetic algorithm. Both LOADEST and the genetic-algorithm regression were evaluated on 19 water quality data sets through calibration and validation. The genetic-algorithm regression behaved similarly to LOADEST. Load estimates from both LOADEST and the genetic-algorithm regression indicated that using a large proportion of the water quality data does not necessarily yield load estimates with smaller error relative to the measured load. CONCLUSION: Regression models need to be calibrated and validated before they are used to interpolate pollutant loads, by separating the water quality data into two data sets, one for calibration and one for validation.
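
The calibrate-then-validate workflow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the model form (log load as a linear function of log flow plus a seasonal sine/cosine pair), the function names, the GA settings, and the synthetic data are all hypothetical and are not taken from the paper or from LOADEST.

```python
import math
import random

def log_load(coeffs, flow, t):
    """Assumed model form: ln(L) = a0 + a1*ln(Q) + a2*sin(2*pi*t) + a3*cos(2*pi*t)."""
    a0, a1, a2, a3 = coeffs
    season = 2 * math.pi * t
    return a0 + a1 * math.log(flow) + a2 * math.sin(season) + a3 * math.cos(season)

def rmse(coeffs, samples):
    """Root-mean-square error in log space over (flow, time, load) samples."""
    sq = [(log_load(coeffs, q, t) - math.log(load)) ** 2 for q, t, load in samples]
    return math.sqrt(sum(sq) / len(sq))

def calibrate(samples, pop_size=40, generations=150, seed=0):
    """Tiny elitist genetic algorithm: keep the best quarter, refill by crossover + mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-3, 3) for _ in range(4)] for _ in range(pop_size)]
    history = []  # best calibration error at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda c: rmse(c, samples))
        history.append(rmse(pop[0], samples))
        elite = pop[: pop_size // 4]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Uniform crossover followed by small Gaussian mutation on every gene.
            child = [rng.choice(pair) + rng.gauss(0, 0.05) for pair in zip(p1, p2)]
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda c: rmse(c, samples))
    history.append(rmse(best, samples))
    return best, history

def synthetic_samples(n=60, seed=1):
    """Made-up (flow, time-of-year, load) samples, for demonstration only."""
    rng = random.Random(seed)
    true_coeffs = (0.5, 1.2, 0.3, -0.2)  # arbitrary illustrative values
    out = []
    for i in range(n):
        t = i / n
        q = rng.uniform(1.0, 50.0)
        load = math.exp(log_load(true_coeffs, q, t) + rng.gauss(0, 0.1))
        out.append((q, t, load))
    return out

samples = synthetic_samples()
# Separate the data into two sets: one for calibration, one for validation.
calibration, validation = samples[0::2], samples[1::2]
best, history = calibrate(calibration)
print("calibration RMSE:", round(rmse(best, calibration), 3))
print("validation RMSE:", round(rmse(best, validation), 3))
```

Comparing the two printed errors is the point of the split: a model that fits the calibration set well but the validation set poorly should not be trusted to interpolate loads.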

IMPROVING SOCIAL MEDIA DATA QUALITY FOR EFFECTIVE ANALYTICS: AN EMPIRICAL INVESTIGATION BASED ON E-BDMS

  • B. KARTHICK;T. MEYYAPPAN
    • Journal of applied mathematics & informatics / v.41 no.5 / pp.1129-1143 / 2023
  • Social media platforms have become an integral part of our daily lives, and they generate vast amounts of data that can be analyzed for various purposes. However, the quality of the data obtained from social media is often questionable due to factors such as noise, bias, and incompleteness. Enhancing data quality is crucial to ensure the reliability and validity of the results obtained from such data. This paper proposes an enhanced decision-making framework based on Business Decision Management Systems (BDMS) that addresses these challenges by incorporating a data quality enhancement component. The framework includes a backtracking method to improve plan failures and risk-taking abilities and a steep optimized strategy to enhance training plan and resource management, all of which contribute to improving the quality of the data. We examine the efficacy of the proposed framework through research data, which provides evidence of its ability to increase the level of effectiveness and performance by enhancing data quality. Additionally, we demonstrate the reliability of the proposed framework through simulation analysis, which includes true positive analysis, performance analysis, error analysis, and accuracy analysis. This research contributes to the field of business intelligence by providing a framework that addresses critical data quality challenges faced by organizations in decision-making environments.

Feature Selection Methodology in Quality Data Mining

  • Soo, Nam-Ho;Halim, Yulius
    • Proceedings of the Korean Operations and Management Science Society Conference / 2004.05a / pp.698-701 / 2004
  • In much of the literature, data mining has been applied to data warehouses and data collections. Its largest applications are in marketing and research, mainly because the data available in these fields are usually plentiful. Data mining can also be extended to the production process. Whereas data mining in marketing studies customers and products, data mining in production studies the so-called 4M1E: man, machine, materials, method (recipe), and environment. All of these elements are important to the production process, which determines the quality of the product. Because the final aim of data mining in production is production quality, it is commonly known as quality data mining. As the variables examined in quality data mining can number in the hundreds or more, extracting information from the data warehouse can take a long time. A feature selection methodology is proposed to help the analysis achieve the best performance in a relatively short time. Using readily available, simple statistical tools in this method can speed up the mining.
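
One way to picture feature selection with a simple statistical tool is a correlation filter: score each process variable by the absolute Pearson correlation with a quality measure and keep the top few. The sketch below is an assumption made for illustration; the abstract does not say which statistics the authors use, and the variable names and values are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def rank_features(columns, target, top_k):
    """Return the top_k column names ordered by |correlation| with the target."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Invented example values covering machine, man, and environment variables.
columns = {
    "oven_temp": [200, 210, 220, 230, 240],
    "operator_id": [3, 1, 2, 3, 1],
    "humidity": [40, 42, 41, 45, 44],
}
defect_rate = [1.0, 1.2, 1.4, 1.6, 1.8]
selected = rank_features(columns, defect_rate, top_k=2)
print(selected)
```

A filter like this is cheap enough to run over hundreds of variables before any heavier mining step, which is the speed benefit the abstract points to.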

A Necessity of Measurement Customer Satisfaction to NSO Products for Enhancing Quality

  • Choi, Kyung-Ho
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.781-790 / 2005
  • Nowadays, governments, companies, and research centers need statistical data that are coherent, accurate, and timely for decision making and research. In other words, the importance of statistical data quality is steadily increasing. In this paper, we therefore argue for the necessity of measuring customer satisfaction with NSO products in order to enhance quality. We construct a measurement scale for customer satisfaction based on statistical quality indicators. We also recommend using a structural equation model in the relational analysis aimed at improving statistical quality.

Enhancement and Application of SWAT Auto-Calibration using Korean Ministry of Environment 8-Day Interval Flow/Water Quality data (환경부 8일 유량.수질 자료를 이용한 SWAT 자동보정 모듈 개선 및 적용 평가)

  • Kang, Hyunwoo;Ryu, Jichul;Kang, Hyungsik;Choi, Jaewan;Moon, Jongpil;Choi, Joongdae;Lim, Kyoung Jae
    • Journal of Korean Society on Water Environment / v.28 no.2 / pp.247-254 / 2012
  • The Soil and Water Assessment Tool (SWAT) model has been widely used to estimate flow and water quality in watersheds worldwide, and it has an auto-calibration tool that calibrates flow and water quality automatically from thousands of simulations. However, the current SWAT auto-calibration tool accepts only continuously measured daily flow and water quality data. Therefore, the 8-day interval flow and water quality data measured nationwide by the Korean Ministry of Environment (MOE) could not be used in SWAT auto-calibration, even though long-term flow and water quality data are available for the Korean Total Maximum Daily Load (TMDL) watersheds. In this study, the current SWAT auto-calibration was modified to calibrate flow and water quality using 8-day interval data. The resulting Nash and Sutcliffe Efficiency (NSE) values for flow estimation with auto-calibration are 0.77 (calibration period) and 0.68 (validation period), and the NSE value for water quality (T-P load) estimation using the 8-day interval water quality data is 0.80. The enhanced SWAT auto-calibration could be used to estimate continuous flow and water quality data at the outlets of TMDL watersheds and at ungauged points within watersheds. In a follow-up study, the enhanced SWAT auto-calibration will be integrated with a Web-based Load Duration Curve (LDC) system, and it could be proposed as a method for appraising TMDLs in South Korea.
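
The Nash and Sutcliffe Efficiency reported in this abstract uses the standard formula NSE = 1 - Σ(O - S)² / Σ(O - Ō)², where O is observed, S is simulated, and Ō is the observed mean. The sketch below shows how it could be evaluated when observations exist only every 8 days, by comparing the daily simulation on the sampled days alone. The function names and the sample values are hypothetical, and this is not the SWAT tool's own code.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((O - S)^2) / sum((O - mean(O))^2); 1.0 is a perfect fit."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1 - residual / variance

def nse_on_sampled_days(simulated_daily, observed_by_day):
    """Score a daily simulation using only the days that have an observation.

    simulated_daily: list indexed by day number (0-based).
    observed_by_day: dict mapping day number -> observed value (e.g. every 8th day).
    """
    days = sorted(observed_by_day)
    observed = [observed_by_day[d] for d in days]
    simulated = [simulated_daily[d] for d in days]
    return nash_sutcliffe(observed, simulated)

# Invented example: 24 simulated days, observations on days 0, 8, and 16.
simulated_daily = [float(d) for d in range(24)]
observed_by_day = {0: 1.0, 8: 8.0, 16: 15.0}
print(nse_on_sampled_days(simulated_daily, observed_by_day))
```

An NSE of 1 means the simulation matches the observations exactly, and 0 means it is no better than always predicting the observed mean.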

A Study on the Domain Discrimination Model of CSV Format Public Open Data

  • Ha-Na Jeong;Jae-Woong Kim;Young-Suk Chung
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.129-136 / 2023
  • The government of the Republic of Korea manages the quality of public open data through a public data quality management level evaluation. Public open data is provided in various open formats such as XML, JSON, and CSV, with CSV accounting for the majority. When diagnosing the quality of public open data in CSV format, the person in charge of quality diagnosis determines the domain of each field from the field name and the data within the field, and then diagnoses it. However, this takes a lot of time because quality diagnosis is performed on a large number of open data files. In addition, for fields whose meaning is difficult to understand, the accuracy of the diagnosis depends on how well that person understands the data. This paper proposes a domain discrimination model for public open data in CSV format that uses field names and data distribution statistics, so that diagnosis results are consistent and accurate regardless of the diagnostician's ability and so that diagnosis time is shortened. When the proposed model was applied, the correct answer rate was about 77%, which is 2.8% higher than that of the file-format open data diagnostic tool provided by the Ministry of Public Administration and Security. We therefore expect the proposed model to improve accuracy when it is applied to diagnosing and evaluating the quality management level of public data.
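
To make the idea of combining field names with data distribution statistics concrete, here is a minimal rule-based sketch. It is not the model proposed in the paper: the domain labels, the keyword list, the thresholds, and the sample CSV are all assumptions chosen for illustration.

```python
import csv
import io
import re

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")   # assumes ISO-style dates only
NUM_RE = re.compile(r"^-?\d+(\.\d+)?$")

def column_stats(values):
    """Share of non-empty values that look like dates or plain numbers."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    n = len(non_empty) or 1  # avoid division by zero on empty columns
    return {
        "date_ratio": sum(bool(DATE_RE.match(v)) for v in non_empty) / n,
        "numeric_ratio": sum(bool(NUM_RE.match(v)) for v in non_empty) / n,
    }

def guess_domain(field_name, values):
    """Combine a field-name hint with distribution statistics (hypothetical rules)."""
    stats = column_stats(values)
    name = field_name.lower()
    if stats["date_ratio"] >= 0.9 or ("date" in name and stats["date_ratio"] >= 0.5):
        return "date"
    if stats["numeric_ratio"] >= 0.9:
        is_money = any(key in name for key in ("amount", "price", "cost"))
        return "amount" if is_money else "number"
    return "text"

def discriminate_csv(text):
    """Guess a domain for every field of a CSV document given as a string."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return {}
    return {name: guess_domain(name, [row[name] or "" for row in rows]) for name in rows[0]}

sample = (
    "reg_date,price,facility_name\n"
    "2023-01-05,1200,Central Library\n"
    "2023-02-11,950,City Pool\n"
)
domains = discriminate_csv(sample)
print(domains)
```

Because the rules are fixed, two people running this on the same file get the same domains, which is the consistency benefit the abstract describes.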

A Study on the Important Factors for Accounting Information Quality Impact on AIS Data Quality Outcomes (회계정보 품질에 영향을 미치는 요인이 회계정보시스템 데이터 품질에 미치는 영향)

  • Kim, Kyung-Ihl
    • Journal of Convergence for Information Technology / v.9 no.12 / pp.24-29 / 2019
  • The accounting information system (AIS) is one of the most critical systems in any organization, and data quality plays a critical role in a knowledge-based economy. The objective of this study is to identify the most important factors for accounting information quality and their impact on AIS data quality outcomes. The study includes an extensive literature review to identify a set of critical success factors (CSFs) for data quality. It uses empirical data to test the research hypothesis, and the results show that the top three factors affecting AIS data quality are top management commitment, the nature of the AIS, and input controls. The study further uses regression analysis to test the effect of those factors on AIS data quality, finding a significant positive relationship between the perceived performance of the three factors and AIS data quality outcomes. To improve AIS data quality further, additional study of control methodologies for the CSFs is necessary.

Data Quality and Firm Financial Performance in the Manufacturing Industry (제조기업의 데이터 품질과 재무적 성과)

  • Kim, Jeong-Cheol;Lee, Choon Yeul;Lee, Sangho
    • Journal of Information Technology Services / v.11 no.sup / pp.153-164 / 2012
  • It is widely believed that timely and precise data are important to decisions and that better decisions lead to better firm performance. However, empirical research on the effect of data quality on firm financial performance has remained scarce. This study empirically explores the effect of data quality on firm accounting performance in the Korean manufacturing industry from 2008 to 2010 using secondary data. The results show that better data quality has no effect on sales or operating profit, but has a positive and significant effect on EVA (Economic Value Added). Raising data quality management maturity by one level can increase EVA by about 34% in manufacturing firms.