• Title/Abstract/Keywords: Data quality

Search results: 20,875 items (processing time: 0.045 s)

Feature Selection Methodology in Quality Data Mining

  • Soo, Nam-Ho;Halim, Yulius
    • 한국경영과학회:학술대회논문집 / 대한산업공학회/한국경영과학회 2004년도 춘계공동학술대회 논문집 / pp. 698-701 / 2004
  • In much of the literature, data mining is described as a way of exploiting data warehouses and data collections. Its largest applications are in marketing and research, mainly because data in these fields are usually available in large amounts. Data mining can also be extended to the production process. Whereas data mining in marketing studies customers and products, data mining in production studies the so-called 4M1E: man, machine, materials, method (recipe), and environment. All of these elements are important to the production process, which determines the quality of the product. Because the final aim of data mining in the production field is production quality, it is commonly called quality data mining. As the variables studied in quality data mining can number in the hundreds or more, extracting information from the data warehouse can take a long time. A feature selection methodology is proposed to help the research achieve the best performance in a relatively short time. Using readily available, simple statistical tools in this method can speed up the mining.


유전자 알고리즘과 회귀식을 이용한 오염부하량의 예측 (Estimation of Pollutant Load Using Genetic-algorithm and Regression Model)

  • 박윤식
    • 한국환경농학회지 / Vol. 33, No. 1 / pp. 37-43 / 2014
  • BACKGROUND: Water quality data are collected less frequently than flow data because of the cost of collection and analysis, yet water quality data corresponding to flow data are required to compute pollutant loads or to calibrate other hydrology models. Regression models can be used to interpolate water quality data corresponding to flow data. METHODS AND RESULTS: A regression model capable of considering flow and time variance was suggested, and its coefficients were calibrated against various measured water quality data using a genetic algorithm. Both LOADEST and the genetic-algorithm regression were evaluated on 19 water quality data sets through calibration and validation. The genetic-algorithm regression model behaved similarly to LOADEST. The load estimates from both LOADEST and the genetic-algorithm regression model indicated that using a larger proportion of the water quality data does not necessarily yield load estimates with smaller error relative to the measured load. CONCLUSION: Regression models need to be calibrated and validated before they are used to interpolate pollutant loads, by separating the water quality data into two data sets for calibration and validation.

IMPROVING SOCIAL MEDIA DATA QUALITY FOR EFFECTIVE ANALYTICS: AN EMPIRICAL INVESTIGATION BASED ON E-BDMS

  • B. KARTHICK;T. MEYYAPPAN
    • Journal of applied mathematics & informatics / Vol. 41, No. 5 / pp. 1129-1143 / 2023
  • Social media platforms have become an integral part of our daily lives, and they generate vast amounts of data that can be analyzed for various purposes. However, the quality of the data obtained from social media is often questionable due to factors such as noise, bias, and incompleteness. Enhancing data quality is crucial to ensure the reliability and validity of the results obtained from such data. This paper proposes an enhanced decision-making framework based on Business Decision Management Systems (BDMS) that addresses these challenges by incorporating a data quality enhancement component. The framework includes a backtracking method to improve plan failures and risk-taking abilities and a steep optimized strategy to enhance training plan and resource management, all of which contribute to improving the quality of the data. We examine the efficacy of the proposed framework through research data, which provides evidence of its ability to increase the level of effectiveness and performance by enhancing data quality. Additionally, we demonstrate the reliability of the proposed framework through simulation analysis, which includes true positive analysis, performance analysis, error analysis, and accuracy analysis. This research contributes to the field of business intelligence by providing a framework that addresses critical data quality challenges faced by organizations in decision-making environments.

A Necessity of Measurement Customer Satisfaction to NSO Products for Enhancing Quality

  • Choi, Kyung-Ho
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 4 / pp. 781-790 / 2005
  • Nowadays, governments, companies, and research centers need coherent, accurate, and timely statistical data for decision making and research. In other words, the importance of statistical data quality is steadily increasing. In this paper, we therefore argue for the necessity of measuring customer satisfaction with NSO products in order to enhance quality. We construct a scale for measuring customer satisfaction based on statistical quality indicators. We also recommend using a structural equation model to analyze the relationships relevant to improving statistical quality.


환경부 8일 유량.수질 자료를 이용한 SWAT 자동보정 모듈 개선 및 적용 평가 (Enhancement and Application of SWAT Auto-Calibration using Korean Ministry of Environment 8-Day Interval Flow/Water Quality data)

  • 강현우;류지철;강형식;최재완;문종필;최중대;임경재
    • 한국물환경학회지 / Vol. 28, No. 2 / pp. 247-254 / 2012
  • The Soil and Water Assessment Tool (SWAT) model has been widely used to estimate flow and water quality at various watersheds worldwide, and it has an auto-calibration tool that can calibrate flow and water quality automatically from thousands of simulations. However, the current SWAT auto-calibration tool can use only continuously measured daily flow and water quality data. Therefore, the 8-day interval flow and water quality data measured nationwide by the Korean Ministry of Environment (MOE) could not be used in SWAT auto-calibration, even though long-term flow and water quality data are available for the Korean Total Maximum Daily Load (TMDL) watersheds. In this study, the current SWAT auto-calibration was modified to calibrate flow and water quality using 8-day interval data. As a result, the Nash-Sutcliffe Efficiency (NSE) values for flow estimation using auto-calibration are 0.77 (calibration period) and 0.68 (validation period), and the NSE value for water quality (T-P load) estimation using the 8-day interval water quality data is 0.80. The enhanced SWAT auto-calibration could be used to estimate continuous flow and water quality data at the outlets of TMDL watersheds and at ungaged points of watersheds. In a future study, the enhanced SWAT auto-calibration will be integrated with a Web-based Load Duration Curve (LDC) system, which could be suggested as a method for appraising TMDL in South Korea.
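
For readers unfamiliar with the NSE values quoted in this abstract, the following is a minimal sketch of the standard Nash-Sutcliffe Efficiency formula, 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². It is not taken from the paper or from SWAT; the function name `nash_sutcliffe_efficiency` and the sample numbers are illustrative assumptions only.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Return the Nash-Sutcliffe Efficiency of simulated vs. observed values.

    1.0 means a perfect match; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between observation and simulation.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their own mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - residual / variance


# Hypothetical flow values, for illustration only (not data from the study).
obs = [1.0, 2.0, 3.0]
perfect_nse = nash_sutcliffe_efficiency(obs, [1.0, 2.0, 3.0])
mean_only_nse = nash_sutcliffe_efficiency(obs, [2.0, 2.0, 2.0])
```

Under this definition, values such as 0.77 or 0.80 mean the simulation explains most of the variance in the observations, while a simulation that only reproduces the observed mean scores 0.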

A Study on the Domain Discrimination Model of CSV Format Public Open Data

  • Ha-Na Jeong;Jae-Woong Kim;Young-Suk Chung
    • 한국컴퓨터정보학회논문지 / Vol. 28, No. 12 / pp. 129-136 / 2023
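  • The government manages the quality of public open data by conducting a public data quality management level evaluation. Public open data is provided in several open formats such as XML, JSON, and CSV, with CSV accounting for the majority. When diagnosing the quality of CSV-format public open data, the person in charge determines the domain of each field by relying on the field names and the data within the fields. However, because quality diagnosis is carried out on a large volume of open data files, it takes a long time. In addition, for fields whose meaning is hard to grasp, the accuracy of the diagnosis depends on how well the diagnostician understands the data. This paper proposes a domain discrimination model for CSV-format public open data that uses field names and data distribution statistics, so that diagnosis results are not swayed by the diagnostician's competence, thereby ensuring consistency and accuracy and helping to shorten diagnosis time. When applied, the model achieved a correct-answer rate of about 77%, 2.8% higher than the file-format open data diagnostic tool provided by the Ministry of the Interior and Safety. We therefore expect that applying the proposed model to public data quality management level diagnosis and evaluation can improve accuracy.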

회계정보 품질에 영향을 미치는 요인이 회계정보시스템 데이터 품질에 미치는 영향 (A Study on the Important Factors for Accounting Information Quality Impact on AIS Data Quality Outcomes)

  • 김경일
    • 융합정보논문지 / Vol. 9, No. 12 / pp. 24-29 / 2019
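  • Because the accounting information system (AIS) is one of the most important systems in any organization, data quality plays an important role in information systems in a knowledge-based industrial society. The purpose of this study is to identify the important factors that affect accounting information quality and to determine whether these factors affect AIS data quality outcomes. Through an extensive literature review, we sought to identify a set of CSFs for data quality, and we pursued the research objective through an empirical study. The results show that the most important factors affecting AIS data quality are top management commitment, the inherent nature of the AIS, and input controls. Regression analysis conducted to verify these factors found a highly significant relationship between the degree to which the three factors are perceived and the perceived level of AIS data quality. This study can contribute to organizational control activities concerning the factors that affect accounting information quality when an AIS is introduced and operated, and control measures should be examined in follow-up research.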

GPS Research Group, Korea Astronomy Observatory

  • Park, Kwan-Dong;Kim, Ki-Nam;Park, Pil-Ho;Lim, Hyung-Chul
    • 한국우주과학회:학술대회논문집(한국우주과학회보) / 한국우주과학회 2002년도 한국우주과학회보 제11권2호 / pp.35.2-35 / 2002

제조기업의 데이터 품질과 재무적 성과 (Data Quality and Firm Financial Performance in the Manufacturing Industry)

  • 김정철;이춘열;이상호
    • 한국IT서비스학회지 / Vol. 11, Supplement / pp. 153-164 / 2012
  • There is a belief that timely and precise data are important to decisions and that better decisions are related to better firm performance. However, empirical research on the effect of data quality on firm financial performance has remained scarce until recently. The current study empirically explores the effect of data quality on firm accounting performance in the Korean manufacturing industry during 2008-2010 using secondary data. The results show that better data quality does not affect sales or operating profit, but has a positive and significant effect on EVA (Economic Value Added). Raising the level of data quality management maturity by one level can increase EVA by about 34% in manufacturing firms.