More Than 40 Percent of Data Unnecessarily Redundant in Corporate Databases

  • Received : 2021.11.02
  • Accepted : 2021.12.03
  • Published : 2021.12.31

Abstract

This paper analyzes data quality issues in information systems, with a focus on conceptual data modeling. An extensive investigation, based on triangulation of case studies, examines the extent to which inappropriate data modeling practices are exercised in real workplace environments. It reveals that more than 40 percent of data contributes to unnecessary data redundancy; that is, the level of data obesity exceeds 40 percent. A further contribution of this paper is the identification of all categories of inappropriate data modeling practices, which had previously been only partially uncovered in the literature. The new findings show that the extent of inappropriate modeling is more serious than previously reported.
