Two-way Interactive Algorithms Based on Speech and Motion Recognition with Generative AI Technology


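  • 장대성 (Department of Computer Engineering, Sunchon National University);
  • 김종찬 (Department of Computer Engineering, Sunchon National University)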
  • Received : 2024.02.29
  • Accepted : 2024.04.12
  • Published : 2024.04.30

Abstract

Speech recognition and motion recognition are built into a wide range of smart devices, but they are typically limited to recognizing simple commands and triggering simple functions. Going beyond such basic handling of recognition data requires the ability to carry out specialized commands based on data learned across various fields. Research is under way on system platforms that use generative AI, now the subject of worldwide competition, to provide users with optimal data and that let users interact through speech and motion recognition. The main technical processes in this study were designed around speech and motion recognition, the application of AI technology, and two-way communication. In this paper, speech and motion recognition combined with AI technology enable two-way communication between a device and its user through a variety of input methods.

