• Title/Summary/Keyword: Voice and Image Recognition

A Study on the Recognition of Face Based on CNN Algorithms (CNN 알고리즘을 기반한 얼굴인식에 관한 연구)

  • Son, Da-Yeon;Lee, Kwang-Keun
    • Korean Journal of Artificial Intelligence
    • /
    • v.5 no.2
    • /
    • pp.15-25
    • /
    • 2017
  • Technologies that recognize and authenticate users from biometric information are being developed to address information security issues. Biometric information includes the face, fingerprints, iris, voice, and veins, and among these, face recognition accounts for a large share. Face recognition is applied in various fields. For example, it can be used for identity verification with personal identification cards, passports, credit cards, security systems, and personnel data. It can also be used for security purposes, including searching for crime suspects, monitoring unsafe zones, and tracking vehicles involved in crimes. In this thesis, we studied face recognition by detecting face regions through a computer webcam. The purpose of the study was to help improve the accuracy of face recognition based on CNN algorithms. To this end, we used data files provided on GitHub to build a face recognition model, and we also created data using CNN algorithms, which are widely used for image recognition. The CNN was trained on a variety of photos. The study found that the accuracy of face recognition based on CNN algorithms was 77%. Based on these results, we also carried out face recognition at varying distances. The findings may be useful where face recognition is required in a variety of situations, and further research building on this study is expected to improve the accuracy of face recognition.

Recognition of the Korean Character Using Phase Synchronization Neural Oscillator

  • Lee, Joon-Tark;Kwon, Yang-Bum
    • Journal of Advanced Marine Engineering and Technology
    • /
    • v.28 no.2
    • /
    • pp.347-353
    • /
    • 2004
  • Neural oscillators can be applied to oscillatory systems such as image information analysis and voice recognition. Conventional learning algorithms (neural networks or EBPA, the Error Back Propagation Algorithm) are not well suited to oscillatory systems with complicated input patterns because their structure is too complex. However, these problems can be easily solved by using the synchrony characteristic of a neural oscillator with a PLL (phase-locked loop) function and a simple Hebbian learning rule. This paper therefore introduces a technique for recognizing Korean characters using a phase synchronization neural oscillator and presents simulation results.

Recognition of the Korean Alphabet using Phase Synchronization of Neural Oscillator

  • Lee, Joon-Tark;Bum, Kwon-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.1
    • /
    • pp.93-99
    • /
    • 2004
  • Neural oscillators can be applied to oscillatory systems such as image information analysis and voice recognition. The conventional EBPA (Error Back Propagation Algorithm) is not well suited to oscillatory systems with complicated input patterns because of its tedious training procedure and slow convergence. However, these problems can be easily solved by using the synchrony characteristic of a neural oscillator with a PLL (phase-locked loop) function together with a simple Hebbian learning rule. This paper therefore introduces a technique for recognizing the Korean alphabet using a phase-synchronized neural oscillator.
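
The synchrony property that the abstract above relies on can be illustrated with a generic toy model. The sketch below is not the authors' oscillator or PLL design: it assumes two Kuramoto-style phase oscillators with hypothetical natural frequencies and coupling strength, and it does not cover the Hebbian learning step. For this model the phase difference locks at asin((omega2 - omega1) / (2 * coupling)) whenever the frequency gap is at most twice the coupling.

```python
import math


def simulate_two_oscillators(omega1, omega2, coupling,
                             theta1=0.0, theta2=2.0, dt=0.01, steps=5000):
    """Euler-integrate two mutually coupled phase oscillators.

    Model (an assumption for illustration, not taken from the paper):
        d(theta_i)/dt = omega_i + coupling * sin(theta_j - theta_i)
    Returns the final phase difference theta2 - theta1 (not wrapped).
    """
    for _ in range(steps):
        d1 = omega1 + coupling * math.sin(theta2 - theta1)
        d2 = omega2 + coupling * math.sin(theta1 - theta2)
        theta1 += dt * d1
        theta2 += dt * d2
    return theta2 - theta1


# With coupling, the phase difference settles to a constant (phase lock).
locked = simulate_two_oscillators(omega1=1.0, omega2=1.2, coupling=0.5)
# Without coupling, the phases drift apart at the frequency gap.
drifting = simulate_two_oscillators(omega1=1.0, omega2=1.2, coupling=0.0)
```

In a recognition setting, oscillators driven by matching inputs would lock while mismatched ones drift, which is the kind of behavior a synchrony-based classifier could read out.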

Monosyllable Speech Recognition through Facial Movement Analysis (안면 움직임 분석을 통한 단음절 음성인식)

  • Kang, Dong-Won;Seo, Jeong-Woo;Choi, Jin-Seung;Choi, Jae-Bong;Tack, Gye-Rae
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.63 no.6
    • /
    • pp.813-819
    • /
    • 2014
  • The purpose of this study was to extract accurate parameters of facial movement features using a 3-D motion capture system for speech recognition through lip-reading. Instead of features obtained from conventional camera images, the 3-D motion system was used to obtain quantitative data on actual facial movements and to analyze 11 variables that exhibit particular patterns, such as nose, lip, jaw, and cheek movements, in monosyllable vocalizations. Fourteen subjects, all in their 20s, were asked to vocalize 11 Korean vowel monosyllables three times each with 36 reflective markers on their faces. The facial movement data were then reduced to 11 parameters and presented as patterns for each monosyllable vocalization. These parameter patterns were used to train and test a recognizer for each monosyllable based on a Hidden Markov Model (HMM) and the Viterbi algorithm. The recognition accuracy for the 11 monosyllables was 97.2%, which suggests that Korean speech recognition through quantitative facial movement analysis is feasible.
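
For readers unfamiliar with the decoding step named in the abstract above, here is a minimal sketch of the Viterbi algorithm on a discrete HMM. The state names, observation symbols, and probabilities in the usage example are hypothetical placeholders, not values from the study, which worked with 11 continuous facial movement parameters.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, probability) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor r that maximizes the path probability into s.
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical two-state toy model: mouth "open" or "closed",
# observed through a coarse lip-width symbol.
states = ("open", "closed")
start_p = {"open": 0.5, "closed": 0.5}
trans_p = {"open": {"open": 0.7, "closed": 0.3},
           "closed": {"open": 0.4, "closed": 0.6}}
emit_p = {"open": {"wide": 0.9, "narrow": 0.1},
          "closed": {"wide": 0.2, "narrow": 0.8}}
path, prob = viterbi(["wide", "wide", "narrow"], states, start_p, trans_p, emit_p)
```

A per-class recognizer would typically train one HMM per monosyllable and pick the model that scores the observed sequence highest.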

Design of a User authentication Protocol Using Face Information (얼굴정보를 이용한 사용자 인증 프로토콜 설계)

  • 지은미
    • Journal of the Korea Computer Industry Society
    • /
    • v.5 no.1
    • /
    • pp.157-166
    • /
    • 2004
  • Substantial research has been done on the development of biometric recognition methods as well as on authentication technology. Biometric recognition uses personal and unique information such as fingerprints, voice, face, iris, hand geometry, and vein patterns. Because a face image system is contactless, it reduces user resistance; it also operates through a PC camera attached to a computer, which makes the system economical as well as user-friendly. On the other hand, a face image system is very sensitive to illumination, hairstyle, and appearance, and therefore produces recognition errors easily, so a stable authentication system that is not overly sensitive to changes in appearance and lighting is needed. In this study, I propose a user authentication protocol that provides confidentiality and integrity and that minimizes the Equal Error Rate, so as to reduce false authentications when authenticating a user.
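
The Equal Error Rate mentioned in the abstract above is the operating point at which the false acceptance rate equals the false rejection rate. The following sketch approximates it by scanning observed match scores as thresholds. The function name, the "accept when score >= threshold" convention, and the sample scores are assumptions made for illustration and do not come from the paper.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by trying every observed score as a threshold.

    A sample is accepted when its score >= threshold.
    Returns (eer, threshold) at the threshold where FAR and FRR are closest.
    """
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        # False acceptance rate: impostors that would be accepted.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine users that would be rejected.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Hypothetical similarity scores for genuine and impostor attempts.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
```

With finite samples the two rates rarely cross exactly, so the sketch reports the midpoint of the two rates at the closest threshold.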

Untact-based elevator operating system design using deep learning of private buildings (프라이빗 건물의 딥러닝을 활용한 언택트 기반 엘리베이터 운영시스템 설계)

  • Lee, Min-hye;Kang, Sun-kyoung;Shin, Seong-yoon;Mun, Hyung-jin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.161-163
    • /
    • 2021
  • In an apartment or private building, it is difficult for a user to press the elevator button in situations such as carrying luggage in both hands. In an environment where human contact must be minimized because of a highly infectious virus such as COVID-19, contactless elevator operation becomes necessary. This paper proposes an operating system that can operate the elevator using the user's voice and image processing of the user's face, without pressing the elevator button. The elevator can be sent to a designated floor without a button press by detecting the face of a person entering the elevator with a camera installed in the elevator and matching it against information registered in advance. When it is difficult to recognize a person's face, the system controls the elevator floor using the user's voice through a microphone and automatically records access information, which is intended to make elevator use more convenient in a contactless environment.

Emergency situations Recognition System Using Multimodal Information (멀티모달 정보를 이용한 응급상황 인식 시스템)

  • Kim, Young-Un;Kang, Sun-Kyung;So, In-Mi;Han, Dae-Kyung;Kim, Yoon-Jin;Jung, Sung-Tae
    • Proceedings of the IEEK Conference
    • /
    • 2008.06a
    • /
    • pp.757-758
    • /
    • 2008
  • This paper proposes an emergency recognition system using multimodal information extracted by an image processing module, a voice processing module, and a gravity sensor processing module. Each processing module detects predefined events such as moving, stopping, and fainting, and transfers them to the multimodal integration module. The multimodal integration module recognizes an emergency situation from the transferred events and rechecks it by asking the user questions and recognizing the answers. The experiment was conducted for fainting motions in the living room and bathroom. The results show that the proposed system is more robust than previous methods and effectively recognizes emergency situations in various settings.

A Smart Closet Using Deep Learning and Image Recognition for the Blind (시각장애인을 위한 딥러닝과 이미지인식을 이용한 스마트 옷장)

  • Choi, So-Hee;Kim, Ju-Ha;Oh, Jae-Dong;Kong, Ki-Sok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.6
    • /
    • pp.51-58
    • /
    • 2020
  • Blind people have difficulty managing their clothing independently. With the recent growth of the smart appliance market, furniture and home appliances are increasingly incorporating AI or IoT. To support independent clothing management for the blind, this paper proposes a smart wardrobe with a closet control function, a voice recognition function, and clothing information recognition using a CNN algorithm. To increase accuracy in recognizing clothes, the number of layers in the model was changed and max pooling was adjusted. An early stopping callback was applied during training to ensure learning accuracy, and dropout was added to prevent overfitting. The final model achieved 80 percent accuracy in clothing recognition.
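
The early stopping mentioned in the abstract above can be sketched independently of any deep learning framework. The function below is a hypothetical helper, not the callback the authors used: it assumes training halts once the validation loss has failed to improve for `patience` consecutive epochs, and the loss values in the usage example are made up.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. If that never
    happens, the index of the last epoch is returned.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss   # new best loss: reset the patience counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return len(val_losses) - 1


# Made-up validation losses: the best value (0.70) occurs at epoch 2,
# and three epochs without improvement follow it.
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6], patience=3)
```

Note that the later improvement to 0.6 is never seen, which is the usual trade-off of a small patience value.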

Product Nutrition Information System for Visually Impaired People (시각 장애인을 위한 상품 영양 정보 안내 시스템)

  • Jonguk Jung;Je-Kyung Lee;Hyori Kim;Yoosoo Oh
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.5
    • /
    • pp.233-240
    • /
    • 2023
  • Nutrition information about food is printed on the product label, which makes it very difficult for visually impaired people to read. To address this difficulty, this paper proposes a product nutrition information guide system for visually impaired people. In the proposed system, the user's image data is input through the UI, and object recognition is carried out with YOLO v5. The system then provides voice guidance on the names and nutrition information of the recognized products. This paper constructs a new dataset by augmenting 319 classes of canned and late-night snack product image data using rotation matrix, pepper noise, and salt noise techniques. The performance of the YOLO v5n, YOLO v5m, and YOLO v5l models was compared and analyzed through hyperparameter tuning, and the YOLO v5n model was trained on the constructed dataset. This paper also compares and analyzes the performance of the proposed system against that of previous studies.
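
The augmentation techniques named in the abstract above (rotation matrix, salt noise, pepper noise) can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' pipeline: images are modeled as rows of 0-255 grayscale integers, and the function names, noise probabilities, and seed parameter are hypothetical.

```python
import math
import random


def rotate_point(x, y, degrees, cx=0.0, cy=0.0):
    """Rotate (x, y) about (cx, cy) with a 2-D rotation matrix.

    The rotation is counter-clockwise in standard mathematical axes.
    """
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    dx, dy = x - cx, y - cy
    return (cx + cos_t * dx - sin_t * dy, cy + sin_t * dx + cos_t * dy)


def add_salt_pepper(image, salt=0.02, pepper=0.02, seed=None):
    """Return a noisy copy of a grayscale image (rows of 0-255 ints).

    Each pixel independently becomes 255 with probability `salt`,
    0 with probability `pepper`, and is left unchanged otherwise.
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            r = rng.random()
            if r < salt:
                new_row.append(255)   # salt: white speck
            elif r < salt + pepper:
                new_row.append(0)     # pepper: black speck
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


# Rotate a bounding-box corner by 90 degrees and add noise to a flat gray image.
corner = rotate_point(1.0, 0.0, 90)
noisy = add_salt_pepper([[128] * 8 for _ in range(8)], seed=0)
```

When rotating a labeled detection image, the same rotation would also need to be applied to the bounding-box corners so that the labels stay aligned with the pixels.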

Efficient Iris Recognition using Deep-Learning Convolution Neural Network (딥러닝 합성곱 신경망을 이용한 효율적인 홍채인식)

  • Choi, Gwang-Mi;Jeong, Yu-Jeong
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.15 no.3
    • /
    • pp.521-526
    • /
    • 2020
  • This paper presents an improved HOLP neural network that adds 25 average values to a typical HOLP neural network, which takes as input 25 feature vector values obtained by applying the high-order local autocorrelation function, a function well suited to extracting invariant feature values from iris images. To compare deep learning structures of different types, we compared the iris recognition rates of a back-propagation neural network, which shows excellent performance in the voice and image fields, and a convolutional neural network, which integrates the feature extractor and classifier.