• Title/Summary/Keyword: voice recognition


Multi-view learning review: understanding methods and their application (멀티 뷰 기법 리뷰: 이해와 응용)

  • Bae, Kang Il;Lee, Yung Seop;Lim, Changwon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.1
    • /
    • pp.41-68
    • /
    • 2019
  • Multi-view learning considers data from multiple viewpoints and attempts to integrate the information they provide. It has been studied actively in recent years and has shown superior performance to models learned from only a single view. With the introduction of deep learning techniques, multi-view learning has shown good results in fields such as image, text, voice, and video. In this study, we describe how multi-view learning methods address problems faced in human behavior recognition, medical applications, information retrieval, and facial expression recognition. In addition, we review the data integration principles of traditional multi-view learning methods by classifying them into data integration, classifier integration, and representation integration. Finally, we examine how CNN, RNN, RBM, Autoencoder, and GAN, which are commonly used deep learning methods, are applied to multi-view learning algorithms. We categorize CNN- and RNN-based learning methods as supervised learning, and RBM-, Autoencoder-, and GAN-based learning methods as unsupervised learning.

Design and Implementation of Real Time Device Monitoring and History Management System based on Multiple devices in Smart Factory (스마트팩토리에서 다중장치기반 실시간 장비 모니터링 및 이력관리 시스템 설계 및 구현)

  • Kim, Dong-Hyun;Lee, Jae-min;Kim, Jong-Deok
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.1
    • /
    • pp.124-133
    • /
    • 2021
  • A smart factory is a future factory that collects, analyzes, and monitors various data in real time by attaching sensors to the equipment in the factory. In a smart factory, it is very important to query and record the status and history of equipment in real time, and the emergence of various smart devices allows this to be done more efficiently. This paper proposes a multi-device-based system that can create, search, and delete equipment status and history in real time. The proposed system uses an Android system and a smart glass system together, in consideration of the special environment of the factory. The smart glass system uses a QR code for equipment recognition and provides a more efficient work environment through a voice recognition function. We designed a system structure for real-time equipment monitoring based on multiple devices, and we demonstrate its practicality by implementing an Android system, a smart glass system, and a web application server.

Optimal Algorithm and Number of Neurons in Deep Learning (딥러닝 학습에서 최적의 알고리즘과 뉴론수 탐색)

  • Jang, Ha-Young;You, Eun-Kyung;Kim, Hyeock-Jin
    • Journal of Digital Convergence
    • /
    • v.20 no.4
    • /
    • pp.389-396
    • /
    • 2022
  • Deep learning is based on the perceptron and is currently used in various fields such as image recognition, voice recognition, object detection, and drug development. Accordingly, a variety of learning algorithms have been proposed, and the number of neurons constituting a neural network varies greatly among researchers. This study analyzed the learning characteristics of the currently used SGD, momentum, AdaGrad, RMSProp, and Adam methods according to the number of neurons. To this end, a neural network was constructed with one input layer, three hidden layers, and one output layer. ReLU was used as the activation function, cross entropy error (CEE) as the loss function, and MNIST as the experimental dataset. The study concluded that 100-300 neurons, the Adam algorithm, and 200 training iterations would be the most efficient combination for deep learning training. This study is expected to provide implications for algorithms developed in the future and a reference value for the number of neurons when new training data are given.
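
To make the difference between the compared update rules concrete, the following minimal sketch contrasts a plain SGD step with an Adam step. It is not the study's experiment: the one-dimensional objective f(w) = (w - 3)^2, the learning rate, and the helper names (`sgd_step`, `Adam`, `minimize`) are assumptions chosen only so the example runs without MNIST or a deep learning library.

```python
import math


def sgd_step(w, grad, lr=0.1):
    """Plain SGD: move against the gradient by a fixed learning rate."""
    return w - lr * grad


class Adam:
    """Adam update for a single scalar parameter (illustrative only)."""

    def __init__(self, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0  # running mean of gradients (first moment)
        self.v = 0.0  # running mean of squared gradients (second moment)
        self.t = 0    # step counter used for bias correction

    def step(self, w, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
        m_hat = self.m / (1 - self.beta1 ** self.t)  # bias-corrected mean
        v_hat = self.v / (1 - self.beta2 ** self.t)  # bias-corrected variance
        return w - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


def grad_f(w):
    """Gradient of the assumed toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def minimize(step_fn, w0=0.0, iterations=200):
    """Apply an update rule repeatedly and return the final parameter."""
    w = w0
    for _ in range(iterations):
        w = step_fn(w, grad_f(w))
    return w


w_sgd = minimize(sgd_step)
w_adam = minimize(Adam().step)
print("SGD :", w_sgd)
print("Adam:", w_adam)
```

Both rules approach the minimizer w = 3 on this toy problem. The comparison says nothing about which rule is better on MNIST; it only shows how Adam rescales each step using running gradient statistics.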

Study on Development for Smart Door Lock and App. using Arduino and Infrared Sensor (아두이노와 적외선 센서를 이용한 스마트 도어락과 앱 개발에 대한 연구)

  • Hyeomg-Jun, Jeon;Yoon-Soo, Na;Yeo-Gyun, Youn;Kyeong-Ho, Kim;Hee-Woon, Ahn;Jae-Wook, Kim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.6
    • /
    • pp.1199-1206
    • /
    • 2022
  • In modern society, electronic devices can be easily operated through smartphone apps. In this paper, unlike existing door locks that are controlled only by a keypad, an app was created using App Inventor so that the door lock can be controlled from a smartphone. In the Bluetooth module experiment, communication with the smartphone was possible at distances of up to 10 m when there were no obstacles. In the voice recognition experiment, the recognition rate was 85% and 90% at 500~1000 Hz and 1000~1500 Hz, respectively, and 70% and 80% under 80 dB noise. The experimental evaluation confirmed that convenience and security could be improved.

Improved Transformer Model for Multimodal Fashion Recommendation Conversation System (멀티모달 패션 추천 대화 시스템을 위한 개선된 트랜스포머 모델)

  • Park, Yeong Joon;Jo, Byeong Cheol;Lee, Kyoung Uk;Kim, Kyung Sun
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.1
    • /
    • pp.138-147
    • /
    • 2022
  • Recently, chatbots have been applied in various fields with good results, and e-commerce platforms are making many attempts to use chatbots in shopping mall product recommendation services. This paper addresses a conversation system that recommends the fashion a user wants based on the conversation between the user and the system and on fashion image information. The system builds on the transformer model, which currently performs well in various AI fields such as natural language processing, voice recognition, and image recognition. We propose an improved multimodal transformer model that increases recommendation accuracy by using dialogue (text) and fashion (image) information together in data preprocessing and data representation. We also propose a method to improve accuracy by analyzing and improving the data. The proposed system achieves a recommendation accuracy of 0.6563 WKT (Weighted Kendall's tau), an improvement of 0.3191 WKT over the existing system's 0.3372 WKT.

Pattern recognition and AI education system design for improving achievement of non-face-to-face (e-learning) education (비대면(이러닝) 교육 성취도 향상을 위한 패턴인식 및 AI교육 시스템 설계)

  • Lee, Hae-in;Kim, Eui-Jeong;Chung, Jong-In;Kim, Chang Suk;Kang, Shin-Cheon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.329-332
    • /
    • 2022
  • This study aims to identify problems with existing e-learning content and non-face-to-face class methods, improve students' concentration, class achievement, and educational effectiveness, and propose an artificial intelligence class system design using a web server. The system uses face and eye tracking with OpenCV to check attendance and concentration, and prompts learners to respond by voice or message to questions asked by the instructor during class, which relieves the boredom caused by online classes. If a learner's test score falls short of the required level, the proposed artificial intelligence education program provides educational materials and videos for the problems answered incorrectly, which can bridge the academic gap and improve academic achievement.


Pattern Recognition and AI Education System Design Proposal for Improving the Achievement of Non-face-to-face (E-Learning) Education (비대면(이러닝) 교육 성취도 향상을 위한 패턴인식 및 AI교육 시스템 설계 구축)

  • Lee, Hae-in;Kim, Eui-Jeong;Chung, Jong-In;Kim, Chang Suk;Kang, Shin-Cheon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.280-283
    • /
    • 2022
  • This study aims to identify problems with existing e-learning content and non-face-to-face class methods, improve students' concentration, class achievement, and educational effectiveness, and propose an artificial intelligence class system design using a web server. The system uses face and eye tracking with OpenCV to check attendance and concentration, and prompts learners to respond by voice or message to questions asked by the instructor during class, which relieves the boredom caused by online classes. If a learner's test score falls short of the required level, the proposed artificial intelligence education program provides educational materials and videos for the problems answered incorrectly, which can bridge the academic gap and improve academic achievement.


Method of Automatically Generating Metadata through Audio Analysis of Video Content (영상 콘텐츠의 오디오 분석을 통한 메타데이터 자동 생성 방법)

  • Sung-Jung Young;Hyo-Gyeong Park;Yeon-Hwi You;Il-Young Moon
    • Journal of Advanced Navigation Technology
    • /
    • v.25 no.6
    • /
    • pp.557-561
    • /
    • 2021
  • Metadata has become an essential element for recommending video content to users. However, it is currently entered manually by video content providers. This paper studies a method for generating metadata automatically in place of the existing manual input method. In addition to the method of extracting emotion tags from a previous study, we studied a method for automatically generating metadata for genre and country of production from movie audio. The genre was extracted from the audio spectrogram using ResNet34, an artificial neural network model used here for transfer learning, and the language of the speakers in the movie was detected through speech recognition. The results confirm the possibility of automatically generating metadata through artificial intelligence.

NUI/NUX of the Virtual Monitor Concept using the Concentration Indicator and the User's Physical Features (사용자의 신체적 특징과 뇌파 집중 지수를 이용한 가상 모니터 개념의 NUI/NUX)

  • Jeon, Chang-hyun;Ahn, So-young;Shin, Dong-il;Shin, Dong-kyoo
    • Journal of Internet Computing and Services
    • /
    • v.16 no.6
    • /
    • pp.11-21
    • /
    • 2015
  • As interest in Human-Computer Interaction (HCI) grows, research on HCI has been actively conducted, along with research on Natural User Interface/Natural User eXperience (NUI/NUX) that uses the user's gestures and voice. NUI/NUX requires recognition algorithms such as gesture recognition or voice recognition. However, these algorithms have a weakness: their implementation is complex and training takes a long time, because they must go through steps including preprocessing, normalization, and feature extraction. Recently, Microsoft launched Kinect as an NUI/NUX development tool, which has attracted attention, and studies using Kinect have been conducted. In a previous study, the authors implemented a highly intuitive hand-mouse interface using the physical features of a user. However, it had weaknesses such as unnatural mouse movement and low accuracy of mouse functions. In this study, we designed and implemented a hand-mouse interface that introduces a new concept called the 'virtual monitor', extracting the user's physical features through Kinect in real time. The virtual monitor is a virtual space that can be controlled by the hand mouse, and coordinates on the virtual monitor can be accurately mapped onto coordinates on the real monitor. The hand-mouse interface based on the virtual monitor concept maintains the outstanding intuitiveness that was the strength of the previous study and enhances the accuracy of mouse functions. Further, we increased the accuracy of the interface by recognizing the user's unnecessary actions using a concentration indicator derived from electroencephalogram (EEG) data. To evaluate the intuitiveness and accuracy of the interface, we tested it with 50 people ranging in age from their teens to their fifties. In the intuitiveness experiment, 84% of subjects learned how to use it within 1 minute. In the accuracy experiment, the accuracy of mouse functions was 80.4% for drag, 80% for click, and 76.7% for double-click. The intuitiveness and accuracy of the proposed hand-mouse interface were verified through experiments, and it is expected to serve as a good example of an interface for controlling a system by hand in the future.
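
The mapping from virtual-monitor coordinates to real-monitor coordinates mentioned in the abstract above can be illustrated with a simple linear rescaling. The abstract does not say how the authors derive the virtual monitor from Kinect data, so the rectangle `(left, top, width, height)`, the function name `map_to_screen`, and the clamping behavior below are hypothetical choices for illustration only.

```python
def map_to_screen(hand_x, hand_y, virtual, screen_w, screen_h):
    """Map a hand position inside a virtual rectangle to a screen pixel.

    virtual is a hypothetical (left, top, width, height) rectangle in
    sensor coordinates; width and height must be non-zero.
    """
    left, top, width, height = virtual
    # Normalize the hand position to the range [0, 1] within the rectangle.
    u = (hand_x - left) / width
    v = (hand_y - top) / height
    # Clamp so that hands outside the rectangle stick to the screen edge.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    # Scale to pixel indices 0 .. screen_w - 1 and 0 .. screen_h - 1.
    return round(u * (screen_w - 1)), round(v * (screen_h - 1))


# Assumed example: a 100 x 50 virtual rectangle mapped to a 101 x 51 screen.
virtual_monitor = (0.0, 0.0, 100.0, 50.0)
print(map_to_screen(50.0, 25.0, virtual_monitor, 101, 51))
```

A real system would presumably size and place the rectangle from the user's measured body proportions, which is the part this sketch leaves out.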

KANO-TOPSIS Model for AI Based New Product Development: Focusing on the Case of Developing Voice Assistant System for Vehicles (KANO-TOPSIS 모델을 이용한 지능형 신제품 개발: 차량용 음성비서 시스템 개발 사례)

  • Yang, Sungmin;Tak, Junhyuk;Kwon, Donghwan;Chung, Doohee
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.287-310
    • /
    • 2022
  • Companies' interest in developing AI-based intelligent new products is increasing. Recently, the main concern of companies has been to innovate customer experience and create new value by developing new products through the effective use of artificial intelligence technology. However, because products based on radical technologies such as artificial intelligence differ from existing products in both their nature and their development methods, there are clear limits to applying existing development methodologies as they are. This study proposes a new research method based on KANO-TOPSIS for the successful development of AI-based intelligent new products, using car voice assistants as an example. The KANO model is used to select and evaluate the functions that customers consider necessary for a new product, and the TOPSIS method is used to derive priorities by determining the importance of those functions. For the analysis, major categories such as vehicle condition check and function control elements, driving-related elements, characteristics of the voice assistant itself, infotainment elements, and daily life support elements were selected, and customer demand attributes were subdivided. The analysis shows that high recognition accuracy should be considered the top priority in the development of car voice assistants. Infotainment elements that provide customized content based on the driver's biometric information and usage habits showed lower priorities than expected, while functions related to driver safety, such as vehicle condition notification, driving assistance, and security, emerged as functions that should be developed first. This study is meaningful in that it presents a new product development methodology suited to the characteristics of AI-based intelligent new products through a model combining KANO and TOPSIS.
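
As a rough illustration of the TOPSIS step named in the abstract above, the sketch below ranks alternatives by their closeness to an ideal solution. It follows the textbook form of TOPSIS (vector normalization, weighting, distances to the ideal and anti-ideal points). The abstract does not describe how the authors combine it with KANO results, and the feature names, scores, and weights here are made-up placeholders, not data from the study.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness coefficient in [0, 1] for each alternative.

    matrix  : rows are alternatives, columns are criteria scores
    weights : importance of each criterion
    benefit : True if larger is better for that criterion, else False
    Assumes no criterion column is all zeros and alternatives are not
    all identical (otherwise a division by zero would occur).
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    # The ideal point takes the best value per criterion, the anti-ideal the worst.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Placeholder data: three hypothetical functions scored on two criteria.
features = ["function A", "function B", "function C"]
scores = topsis([[9, 8], [5, 5], [1, 2]], weights=[0.5, 0.5],
                benefit=[True, True])
for name, s in sorted(zip(features, scores), key=lambda p: -p[1]):
    print(name, round(s, 3))
```

A higher closeness coefficient means the alternative lies nearer the ideal point, so sorting by it yields a development priority order under the chosen weights.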