Building a computing infrastructure in the era of data science

  • Sookhee Choi (최숙희; Department of Psychology, Woosuk University)
  • Kyungsoo Han (한경수; Department of Statistics, Jeonbuk National University)
  • Zhe Wang (왕철; Department of Statistics, Jeonbuk National University)
  • Received : 2023.08.20
  • Accepted : 2023.10.14
  • Published : 2024.02.29

Abstract

The popularity of data science, which took off in the United States around 2010, has strongly influenced the curricula of many statistics departments at Korean universities. However, few papers in Korean academic journals address how to build and use a computing environment for teaching data science efficiently. This article discusses the problems involved in building and using a computing infrastructure suited to education and research in statistics and data-science-related departments at Korean universities, and proposes solutions.
