AI-based system for automatically detecting food risk information from news data

Artificial intelligence technology for automatically extracting food risk information from news data

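  • 백유진 (Graduate School of AI, Korea Advanced Institute of Science and Technology)
  • 이지현 (Graduate School of AI, Korea Advanced Institute of Science and Technology)
  • 김남희 (김남희연구소)
  • 이헌주 (켐아이넷)
  • 주재걸 (Graduate School of AI, Korea Advanced Institute of Science and Technology)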
  • Received : 2021.07.19
  • Accepted : 2021.08.11
  • Published : 2021.09.30

Abstract

Recent advances in communication technologies have accelerated the spread of food safety issues once they are reported by the news media. To respond to such issues in a timely manner, it is important to detect the related information in news data automatically. This work presents an AI-based system that detects risk information within food-related news articles. Food safety experts labeled risk information in the articles, yielding 43,527 articles annotated with food names and risk information. Given a news document, the system detects food names and risk information by analyzing similarities between words in the text using learned word embedding vectors. The AI-based system achieves higher detection scores than a non-AI rule-based system, with absolute F1 gains of +32.94% for the food name category and +41.53% for the risk information category.
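The idea of detecting terms by word-embedding similarity can be illustrated with a minimal sketch. This is not the system described in the abstract: the embedding values, the seed words, the category names `food_name` and `risk_info`, and the `threshold` parameter are all assumptions made up for illustration. The sketch only shows the general technique of labeling a token when its vector is close, by cosine similarity, to a known example of a category.

```python
import math

# Toy 3-dimensional embeddings. The values are invented for illustration;
# a real system would learn them from a large news corpus.
EMBEDDINGS = {
    "salmonella": [0.9, 0.1, 0.0],
    "listeria":   [0.8, 0.2, 0.1],
    "egg":        [0.1, 0.9, 0.0],
    "milk":       [0.2, 0.8, 0.1],
    "recall":     [0.1, 0.1, 0.9],
    "announced":  [0.1, 0.2, 0.8],
}

# Hypothetical seed words for each category (an assumption of this sketch).
SEEDS = {"food_name": ["egg"], "risk_info": ["salmonella"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def detect(tokens, threshold=0.9):
    """Label each known token with the category of its most similar seed.

    Tokens without an embedding, or whose best similarity falls below
    `threshold`, are left unlabeled.
    """
    labels = []
    for tok in tokens:
        vec = EMBEDDINGS.get(tok)
        if vec is None:
            continue  # out-of-vocabulary token
        best_cat, best_sim = None, threshold
        for cat, seeds in SEEDS.items():
            for seed in seeds:
                sim = cosine(vec, EMBEDDINGS[seed])
                if sim >= best_sim:
                    best_cat, best_sim = cat, sim
        if best_cat is not None:
            labels.append((tok, best_cat))
    return labels


if __name__ == "__main__":
    print(detect(["listeria", "milk", "recall", "announced", "unknown"]))
```

With these toy vectors, "listeria" lands close to the risk seed and "milk" close to the food seed, while "recall" and "announced" stay unlabeled. In practice the threshold and the seed lists would have to be tuned against labeled data such as the expert annotations mentioned above.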

Keywords

Acknowledgement

This research was supported by a 2020 research and development grant (20162미래기374) from the Ministry of Food and Drug Safety, for which we are grateful.
