Building Korean Science Textbook Corpus (K-STeC) for research of Scientific Language in Education

Construction of the Science Textbook Corpus K-STeC (Korean Science Textbook Corpus) as a General-Purpose Resource for Research on Scientific Language in Education

  • Received : 2018.06.08
  • Accepted : 2018.07.20
  • Published : 2018.08.31

Abstract

This study built a corpus of science textbooks from the past 20 years to support systematic research on scientific language and scientific terms, topics that have received little attention in science education, and to provide a language resource that can be analyzed from multiple angles. We collected all science textbooks from elementary school through high school under the 6th curriculum, the 7th curriculum, and the 2009 revised curriculum, and built the corpus from 132 of them, covering two publishers. Construction proceeded in three stages: a raw corpus, a morphologically annotated corpus, and a corpus semantically annotated for science terms. The resulting corpus was named K-STeC (Korean Science Textbook Corpus). K-STeC is a semantically annotated corpus in which each science term is marked with its sense and its field. As metadata it carries bibliographic information (curriculum, subject, grade, publisher), location information (chapter, section, lesson, page, sentence number), and text-structure information (main text, inquiry activities, reference materials, titles). Over a study period of about three years, expertise from linguistic informatics, computer science, and science education was combined into a new research method, and a large number of specialists were needed to carry out the labor-intensive work. This paper surveys the overall research procedure and methods to introduce the new methodology and its outcomes, and discusses prospects for scientific language research and ways to use the results.
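The metadata layers listed in the abstract can be pictured as one record per sentence. The following is a minimal sketch under stated assumptions, not the actual K-STeC file format: the class names `Token` and `SentenceRecord`, the helper `select`, every field name, the tag values, and the sample sentence are all hypothetical and chosen only to mirror the categories the abstract names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Token:
    """One morphologically annotated unit (hypothetical layout)."""
    surface: str                      # word form as it appears in the text
    pos: str                          # part-of-speech tag (tag set is an assumption)
    term_sense: Optional[str] = None  # sense label, set only for science terms
    term_field: Optional[str] = None  # field of the science term, e.g. "physics"


@dataclass
class SentenceRecord:
    """One corpus sentence with the three metadata groups from the abstract."""
    # Bibliographic information
    curriculum: str
    subject: str
    grade: str
    publisher: str
    # Location information
    chapter: str
    section: str
    lesson: str
    page: int
    sentence_no: int
    # Structure information: "main", "inquiry", "reference", or "title"
    structure: str
    # Annotation layers: raw text plus annotated tokens
    raw: str
    tokens: List[Token] = field(default_factory=list)

    def science_terms(self) -> List[Token]:
        """Return only the tokens that carry a science-term sense label."""
        return [t for t in self.tokens if t.term_sense is not None]


def select(records: List[SentenceRecord], **criteria: object) -> List[SentenceRecord]:
    """Keep the records whose metadata fields equal all given criteria."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in criteria.items())]


# Invented sample record for illustration only; not taken from K-STeC.
sample = SentenceRecord(
    curriculum="2009 revised", subject="Science", grade="middle school 1",
    publisher="Publisher A", chapter="1", section="2", lesson="3",
    page=42, sentence_no=7, structure="main",
    raw="Force changes the motion of an object.",
    tokens=[
        Token("Force", "NOUN", term_sense="force_01", term_field="physics"),
        Token("changes", "VERB"),
        Token("motion", "NOUN", term_sense="motion_01", term_field="physics"),
    ],
)

main_text = select([sample], curriculum="2009 revised", structure="main")
print([t.surface for t in main_text[0].science_terms()])
```

Keeping the bibliographic, location, and structure fields on every sentence lets a query combine them freely, for example restricting a term-frequency count to the main text of one curriculum.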
