A Data Quality Measuring Tool

  • 양자영 (Department of Computer Science, Ewha Womans University)
  • 최병주 (Department of Computer Science, Ewha Womans University)
  • Published: 2003.06.01

Abstract

The quality of the data required to run a software product affects the quality of the software itself. In particular, in a knowledge-engineering system that extracts meaningful knowledge from large volumes of data, assuring the quality of the raw data is very important. In this paper, we design and implement DAQUM, a tool for measuring data quality. We describe the main points of its design and implementation and show, through a case study, that DAQUM detects dirty data and makes it possible to measure data quality quantitatively from the data user's point of view. By enabling both measurement and control of data quality, DAQUM can contribute to improving the quality of software products that mainly process data.
