• Title/Summary/Keyword: speech recognition algorithm (음성인식알고리즘)

Search Result 449, Processing Time 0.024 seconds

A Smart Refrigerator System based on Internet of Things (IoT 기반 스마트 냉장고 시스템)

  • Kim, Hanjin;Lee, Seunggi;Kim, Won-Tae
    • Journal of IKEEE / v.22 no.1 / pp.156-161 / 2018
  • Recently, as the population rapidly increases, food shortages and waste are emerging as serious problems. To address them, various countries and enterprises are pursuing research and product development, such as studies of consumers' food purchasing patterns and the development of smart refrigerators using IoT technology. However, the smart refrigerators currently on sale are expensive, and their complicated configurations lead to malfunction and breakage, which creates further waste. In this paper, we propose a low-cost IoT-based smart refrigerator system that addresses these problems and supports efficient management of ingredients. The system recognizes and registers ingredients through QR codes, image recognition, and speech recognition, and provides various smart refrigerator services. To improve the accuracy of image recognition, we used a deep learning model and showed that ingredients can be registered accurately.

A Study on a Feedback-Centric Piano Education System Using Kinect Sensors (키넥트를 활용한 피드백 중심의 피아노 교육 방안 연구)

  • Park, So Hyun;Ihm, Sun Young;Park, Eun Young;Son, Jong Seo;Park, Young Ho
    • KIPS Transactions on Software and Data Engineering / v.4 no.9 / pp.403-408 / 2015
  • Kinect sensors can recognize the user's behavior and voice. Owing to their low cost and high accessibility, Kinect sensors have been used in various fields, including healthcare and education. In this paper, we propose using Kinect in piano education. Specifically, the proposed method first recognizes the coordinate values of the user's posture, compares them with the coordinate values of the teacher's posture, and provides real-time feedback to the user. This enables the user to maintain the correct posture even when learning piano without a teacher. However, since piano education is a long process, it is difficult to immediately achieve the same correct posture as a teacher. Thus, we propose a user-oriented method to measure the error tolerance rate. The proposed method is the first feedback-based piano education system that uses Kinect sensors.
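
The comparison step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the joint names, the coordinate units, and the `tolerance` parameter (standing in for a per-user error tolerance) are all assumptions made for the example.

```python
from math import dist


def posture_feedback(user_joints, teacher_joints, tolerance):
    """Return the joints whose distance from the teacher's pose exceeds tolerance.

    Both arguments map a (hypothetical) joint name to an (x, y, z) coordinate.
    """
    feedback = {}
    for name, teacher_xyz in teacher_joints.items():
        error = dist(user_joints[name], teacher_xyz)  # Euclidean distance
        if error > tolerance:
            feedback[name] = round(error, 3)
    return feedback


# Illustrative coordinates only; real Kinect skeleton data would replace these.
teacher = {"wrist_r": (0.30, 0.90, 1.20), "elbow_r": (0.35, 1.00, 1.30)}
user = {"wrist_r": (0.30, 0.80, 1.20), "elbow_r": (0.36, 1.00, 1.30)}

needs_correction = posture_feedback(user, teacher, tolerance=0.05)
```

A looser `tolerance` for beginners would flag fewer joints, which is one plausible way to read the user-oriented tolerance mentioned in the abstract.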

Design of new CNN structure with internal FC layer (내부 FC층을 갖는 새로운 CNN 구조의 설계)

  • Park, Hee-mun;Park, Sung-chan;Hwang, Kwang-bok;Choi, Young-kiu;Park, Jin-hyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.466-467 / 2018
  • Recently, artificial intelligence has been applied to various fields such as image recognition, speech recognition, and natural language processing, and interest in deep learning technology is increasing. The convolutional neural network (CNN), one of the most representative deep learning algorithms, has strong advantages in image recognition and classification, is the subject of much research, and is widely used in various fields. In this paper, we propose a new network structure that modifies the general CNN structure. A typical CNN structure consists of a convolution layer, a ReLU layer, and a pooling layer. We construct a new network by adding a fully connected layer inside a general CNN structure. This modification is intended to improve learning and accuracy on the convolved image by incorporating generalization, which is an advantage of neural networks.
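
To make the structural idea concrete, here is a toy one-dimensional forward pass with a fully connected layer placed between two convolution stages. The abstract does not say where the layer is inserted or how it is sized, so the placement, the 1-D simplification, and every function name below are assumptions for illustration only.

```python
def conv1d(x, kernel):
    """Valid (no padding) 1-D convolution, written as a sliding dot product."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(x):
    return [max(0.0, v) for v in x]


def max_pool(x, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def fully_connected(x, weights):
    """weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def forward(x, kernel1, fc_weights, kernel2):
    h = max_pool(relu(conv1d(x, kernel1)))      # usual conv -> ReLU -> pool
    h = relu(fully_connected(h, fc_weights))    # internal FC layer (assumed placement)
    return max_pool(relu(conv1d(h, kernel2)))   # second conv stage


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
output = forward(signal, [0.5, 0.5], identity, [1.0, 1.0])
```

With identity weights the internal layer passes features through unchanged; in a trained network those weights would be learned along with the kernels.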

The Character Recognition System of Mobile Camera Based Image (모바일 이미지 기반의 문자인식 시스템)

  • Park, Young-Hyun;Lee, Hyung-Jin;Baek, Joong-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.5 / pp.1677-1684 / 2010
  • Recently, with the development of mobile phones and the spread of smartphones, a great deal of content has been developed. In particular, since small cameras are built into mobile devices, interest in image-based content development has grown, and such content has become an important part of practical use. Among these, character recognition systems can be widely used in applications such as guidance systems for blind people, automatic robot navigation systems, automatic video retrieval and indexing systems, and automatic text translation systems. This paper therefore proposes a system that extracts text areas from natural images captured by a smartphone camera. The individual characters are recognized and the result is output as voice. Text areas are extracted using the AdaBoost algorithm, and individual characters are recognized using an error back-propagation neural network.

Performance Comparison of Out-Of-Vocabulary Word Rejection Algorithms in Variable Vocabulary Word Recognition (가변어휘 단어 인식에서의 미등록어 거절 알고리즘 성능 비교)

  • 김기태;문광식;김회린;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.27-34 / 2001
  • Utterance verification is used in variable vocabulary word recognition to reject words that are out of vocabulary or that were recognized incorrectly. It is an important technology for designing a user-friendly speech recognition system. We propose a new utterance verification algorithm for a no-training utterance verification system based on the minimum verification error. First, using the PBW (Phonetically Balanced Words) DB (445 words), we create no-training anti-phoneme models that include many PLUs (Phoneme-Like Units), so that the anti-phoneme models have the minimum verification error. Then, for OOV (Out-Of-Vocabulary) rejection, the phoneme-based confidence measure, which uses the likelihoods of the phoneme model (null hypothesis) and the anti-phoneme model (alternative hypothesis), is normalized by the null hypothesis, which tends to make it more robust for OOV rejection. The word-based confidence measure, which builds on the phoneme-based confidence measure, has been shown to improve detection of near-misses in speech recognition and to discriminate better between in-vocabulary words and OOVs. Using the proposed anti-model and confidence measure, we achieve a significant performance improvement: CA (Correctly Accept for In-Vocabulary) is about 89% and CR (Correctly Reject for OOV) is about 90%, an improvement of about 15-21% in ERR (Error Reduction Rate).
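
A rough sketch of how such a confidence measure might be computed is given below. The abstract does not state the exact formula, so the log-likelihood difference, the normalization by the magnitude of the phoneme-model score, the averaging into a word score, and the zero threshold are all assumptions chosen for illustration.

```python
def phoneme_confidence(log_l_phone, log_l_anti):
    """Log-likelihood difference, normalized by the null-hypothesis score.

    Assumes log_l_phone is nonzero (log-likelihoods of real segments are
    typically large negative numbers).
    """
    return (log_l_phone - log_l_anti) / abs(log_l_phone)


def word_confidence(phoneme_scores):
    """Average the phoneme-level confidences over the word (assumed pooling)."""
    confs = [phoneme_confidence(p, a) for p, a in phoneme_scores]
    return sum(confs) / len(confs)


def accept_word(phoneme_scores, threshold=0.0):
    """Accept the recognized word if its confidence reaches the threshold."""
    return word_confidence(phoneme_scores) >= threshold


# Hypothetical (phoneme model, anti-phoneme model) log-likelihood pairs.
in_vocabulary = [(-40.0, -50.0), (-35.0, -42.0)]
out_of_vocabulary = [(-60.0, -55.0), (-58.0, -50.0)]

accepted = accept_word(in_vocabulary)
rejected = not accept_word(out_of_vocabulary)
```

In practice the threshold would be tuned on held-out data to trade off correct acceptance against correct rejection.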

Study on Forearm Muscles and Electrode Placements for CNN based Korean Finger Number Gesture Recognition using sEMG Signals (표면근전도 신호를 활용한 CNN 기반 한국 지화숫자 인식을 위한 아래팔 근육과 전극 위치에 관한 연구)

  • Park, Jong-Jun;Kwon, Chun-Ki
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.8 / pp.260-267 / 2018
  • Surface electromyography (sEMG) was mainly used as an on/off switch in the early stages of research and was later extended to navigational control of powered wheelchairs and recognition of sign language or finger gestures. Communication is difficult between people who know sign language and those who do not; therefore, many efforts have been made to recognize sign language or finger gestures. Recently, the use of sEMG signals to recognize sign language has been investigated; however, most studies conducted to date have focused on Chinese finger number gestures. Since sign language and finger gestures vary among regions, Korean and Chinese finger number gestures differ from each other. Accordingly, the recognition performance of Korean finger number gestures based on sEMG signals can be severely degraded if the same muscles are specified as for Chinese finger number gestures. However, few studies of Korean finger number gestures based on sEMG signals have been conducted. Thus, this study was conducted to identify potential forearm muscles from which to collect sEMG signals for Korean finger number gestures. To this end, six Korean finger number gestures, from zero to five, were investigated, and the usefulness of the proposed muscles and electrode placements was demonstrated by showing that a CNN based on sEMG signals, after sufficient training, recognizes the six Korean finger number gestures with 100% accuracy.

Design and implementation of a 3-axis Motion Sensor based SWAT Hand-signal Motion-recognition System (3축 모션 센서 기반 SWAT 수신호 모션 인식 시스템 설계 및 구현)

  • Yun, June;Pyun, Kihyun
    • Journal of Internet Computing and Services / v.15 no.4 / pp.33-42 / 2014
  • Hand signals are an effective means of communication in situations where voice cannot be used, especially for soldiers. Vision-based approaches using cameras as input devices are widely suggested in the literature. However, these approaches are not suitable for soldiers, who often lack a clear line of sight. In addition, existing special-glove approaches use only finger information. Thus, they fall short for soldiers' hand-signal recognition, which involves not only finger motions but also additional information such as hand rotation. In this paper, we design and implement a new recognition system for six military hand-signal motions: 'ready', 'move', 'quick move', 'crawl', 'stop', and 'lying-down'. For this purpose, we propose a finger-recognition method and motion-recognition methods. The finger-recognition method discriminates how much each finger is bent: 'completely flattened', 'slightly flattened', 'slightly bent', or 'completely bent'. The motion-recognition algorithms are based on characterizing each hand-signal motion in terms of the three axes. In repeated experiments, our system achieved a correct recognition rate of 91.2%.
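
The four-level finger discrimination could be sketched as a simple thresholding step like the one below. The normalized sensor range (0.0 for flat, 1.0 for fully bent) and the threshold values are hypothetical; the abstract does not describe how the levels are actually separated.

```python
from bisect import bisect_right

# The four bend levels named in the abstract, ordered from flat to bent.
LEVELS = (
    "completely flattened",
    "slightly flattened",
    "slightly bent",
    "completely bent",
)

# Hypothetical cut points on a normalized bend reading in [0.0, 1.0].
THRESHOLDS = (0.25, 0.50, 0.75)


def bend_level(reading):
    """Map a normalized bend reading to one of the four levels."""
    return LEVELS[bisect_right(THRESHOLDS, reading)]


# One reading per finger, thumb to little finger (illustrative values).
hand = [bend_level(r) for r in (0.05, 0.40, 0.60, 0.95, 0.80)]
```

A full recognizer would combine these per-finger levels with the three-axis motion features to decide which of the six hand signals was made.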

Continuous Speech Recognition based on Parametric Trajectory Segmental HMM (모수적 궤적 기반의 분절 HMM을 이용한 연속 음성 인식)

  • 윤영선;오영환
    • The Journal of the Acoustical Society of Korea / v.19 no.3 / pp.35-44 / 2000
  • In this paper, we propose a new trajectory model for characterizing segmental features and their interaction based upon a general framework of hidden Markov models. Each segment, a sequence of vectors, is represented by a trajectory of observed sequences. This trajectory is obtained by applying a new design matrix that includes transitional information on contiguous frames, and is characterized as a polynomial regression function. To apply the trajectory to the segmental HMM, the frame features are replaced with the trajectory of a given segment. We also propose the likelihood of a given segment and the estimation of trajectory parameters. The observation probability of a given segment is represented as the relation between the segment likelihood and the estimation error of the trajectories. The estimation error of a trajectory is treated as the weight of the likelihood of a given segment in a state. This weight represents the probability of how well the corresponding trajectory characterizes the segment. The proposed model can be regarded as a generalization of a conventional HMM and a parametric trajectory model. Experimental results are reported on the TIMIT corpus, and performance is shown to improve significantly over that of the conventional HMM.
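
As a simplified illustration of a polynomial regression trajectory, the sketch below fits a polynomial to one scalar feature across the frames of a segment and reports the mean squared estimation error. It uses a plain polynomial design matrix over normalized time, not the paper's design matrix with transitional information, and the scalar features and function name are assumptions.

```python
def fit_trajectory(frames, order=2):
    """Least-squares polynomial fit of scalar frame features over time.

    Returns (coefficients, mean squared estimation error).
    Requires len(frames) > order and at least two frames.
    """
    n = len(frames)
    times = [i / (n - 1) for i in range(n)]  # normalize time to [0, 1]
    # Design matrix: the row for time t is [1, t, t^2, ...].
    design = [[t ** p for p in range(order + 1)] for t in times]
    size = order + 1
    # Augmented normal equations (Z^T Z | Z^T y).
    a = [[sum(row[i] * row[j] for row in design) for j in range(size)]
         + [sum(row[i] * y for row, y in zip(design, frames))]
         for i in range(size)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(size):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    coeffs = [a[i][size] / a[i][i] for i in range(size)]
    fitted = [sum(c * z for c, z in zip(coeffs, row)) for row in design]
    error = sum((y - f) ** 2 for y, f in zip(frames, fitted)) / n
    return coeffs, error


# Frames generated from 1 + 2t + 3t^2, so the fit should recover (1, 2, 3).
segment = [1 + 2 * t + 3 * t * t for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
coefficients, estimation_error = fit_trajectory(segment)
```

In the model described above, an error of this kind acts as a weight on the segment likelihood; how that weighting is defined is not specified in the abstract.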

Comparison of Deep Learning Algorithm in Bus Boarding Assistance System for the Visually Impaired using Deep Learning and Traffic Information Open API (딥러닝과 교통정보 Open API를 이용한 시각장애인 버스 탑승 보조 시스템에서 딥러닝 알고리즘 성능 비교)

  • Kim, Tae hong;Yeo, Gil Su;Jeong, Se Jun;Yu, Yun Seop
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.388-390 / 2021
  • This paper introduces a system that helps visually impaired people board a bus using an embedded board with a keypad, dot matrix, lidar sensor, and NFC reader, a public data portal Open API system, and a deep learning algorithm (YOLOv5). The user inputs the desired bus number through the NFC reader and keypad, and then receives, by voice output, the location and expected arrival time of the bus obtained from the Open API real-time data. In addition, displaying the bus number on the dot matrix helps the bus driver wait for the visually impaired passenger; at the same time, the deep learning algorithm (YOLOv5) recognizes the number of the bus that stops in real time, and a distance sensor such as the lidar sensor detects the distance to the bus.

A Study on the Weight Allocation Method of Humanist Input Value and Multiplex Modality using Tacit Data (암묵 데이터를 활용한 인문학 인풋값과 다중 모달리티의 가중치 할당 방법에 관한 연구)

  • Lee, Won-Tae;Kang, Jang-Mook
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.14 no.4 / pp.157-163 / 2014
  • User sensitivity is recognized as a very important parameter for communication among companies, governments, and individuals. In many studies, researchers use voice tone, voice speed, facial expression, direction and speed of body movement, and gestures to recognize this sensitivity. Multiplex modality is more precise than single modality; however, its recognition rate is limited, multi-sensing overloads data processing, and an excellent algorithm is needed to infer the sensed value. That is, because each modality has a different concept and properties, errors may occur when converting human sensibility to standard values. To deal with this, the sensibility expression modality needs to be extracted from the multiplex modality using techniques such as relational network analysis, context understanding, and digital filtering. If, in a specific situation, the priority modality and other surrounding modalities are processed into implicit values to recognize the sensibility, a system that is robust relative to its computing resource consumption can be built. This paper proposes how to assign the weights of multiplex modality using implicit data.