Search | Korea Science

Maximum Entropy-based Emotion Recognition Model using Individual Average Difference (개인별 평균차를 이용한 최대 엔트로피 기반 감성 인식 모델)

Park, So-Young;Kim, Dong-Keun;Whang, Min-Cheol
- Journal of the Korea Institute of Information and Communication Engineering
- v.14 no.7
- pp.1557-1564
- 2010
In this paper, we propose a maximum entropy-based emotion recognition model that uses the individual average difference of emotional signals, because emotional signal patterns vary from person to person. To recognize a user's emotion accurately, the proposed model uses the difference between the average of the input emotional signals and the average of the signals for each emotional state (such as positive and negative emotional signals), rather than the input signal alone. To make the model easy to construct without expert knowledge of emotion recognition, it uses a maximum entropy model, a well-known machine learning technique with strong performance. Because it is difficult to obtain enough training data when learning from the raw numerical values of emotional signals, the proposed model replaces every average difference value with one of two simple symbols, + (positive number) or - (negative number), and computes the average of the emotional signals per second rather than over the total emotion response time (10 seconds).
https://doi.org/10.6109/jkiice.2010.14.7.1557
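The symbol substitution described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sampling rate, the reference averages, and the choice to map a difference of exactly zero to + are all assumptions made for illustration.

```python
from statistics import mean

def per_second_means(samples, rate_hz):
    """Average a 1-D signal over consecutive one-second windows."""
    return [mean(samples[i:i + rate_hz])
            for i in range(0, len(samples) - rate_hz + 1, rate_hz)]

def sign_symbol(value):
    # Assumption: zero maps to '+'; the abstract only names positive and negative values.
    return '+' if value >= 0 else '-'

def difference_features(samples, rate_hz, state_means):
    """For each second, emit one +/- symbol per emotional-state reference average."""
    features = []
    for second_avg in per_second_means(samples, rate_hz):
        features.append({state: sign_symbol(second_avg - ref)
                         for state, ref in state_means.items()})
    return features

# Hypothetical data: 3 seconds of signal sampled at 2 Hz, with made-up reference averages.
signal = [0.2, 0.4, 0.9, 1.1, 0.5, 0.7]
refs = {"positive": 0.8, "negative": 0.4}
for second, symbols in enumerate(difference_features(signal, 2, refs)):
    print(second, symbols)
```

The resulting symbol sequences, rather than the raw values, would then serve as features for a classifier such as a maximum entropy model.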

Deep Learning-based Speech Voice Separation Training To Enhance STT Performance (STT 성능 향상을 위한 딥러닝 기반 발화 음성 분리학습)

Kim, Bokyoung;Yang, Youngjun;Hwang, Yonghae;Kim, Kyuheon
- Proceedings of the Korean Society of Broadcast Engineers Conference
- 2022.06a
- pp.851-853
- 2022
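With the spread and commercialization of various AI-based deep learning technologies, a range of studies aimed at improving recognition accuracy are also under way in the field of audio speech recognition. Recent speech recognition engines for STT (speech-to-text), built on deep learning, are more accurate than earlier ones. However, for audio in which non-speech and speech signals are recorded together, such as entertainment shows, dramas, and sports broadcasts, speech recognition accuracy drops sharply. This study therefore presents a method that uses a deep learning model for separating speech from music to split audio of various genres into speech and non-speech signals, and analyzes the STT results to suggest research directions for improving speech recognition accuracy.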

A Study on Multipath Effect Mitigation using Trigger Signal in the 3D TDOA Positioning System (3차원 TDOA 위치인식 시스템에서 트리거 신호를 이용한 다중경로 영향 감소에 관한 연구)

Oh, Jongtaek
- The Journal of the Institute of Internet, Broadcasting and Communication
- v.14 no.4
- pp.149-155
- 2014
Indoor positioning systems have been studied actively in recent years, and the TDOA technique using acoustic signals is in general use. The drawback of TDOA is that it is highly vulnerable to signal distortion caused by multipath effects. When estimating the position of a smartphone in particular, the sound distortion is severe, and the jitter of the radio signal that arises when WLAN or Bluetooth is used as the time reference makes it difficult for the receiver to estimate the position. In this paper, an acoustic trigger signal that prepares the receiver for reception of the positioning signal is proposed, and mitigation of the multipath effect is demonstrated.
https://doi.org/10.7236/JIIBC.2014.14.4.149
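As background for the TDOA technique named in this abstract, the following Python sketch shows how a candidate position can be scored against measured arrival-time differences. It does not model the trigger signal or multipath handling from the paper. The beacon layout, the speed of sound, the assumption that fixed beacons transmit to one receiver, and all function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def tdoa(position, beacon_a, beacon_b, speed=SPEED_OF_SOUND):
    """Expected arrival-time difference (s) between signals from two beacons."""
    return (math.dist(position, beacon_a) - math.dist(position, beacon_b)) / speed

def squared_error(position, beacons, measured):
    """Sum of squared gaps between measured and expected TDOAs.

    measured[i] is the TDOA of beacons[i + 1] relative to beacons[0].
    """
    ref = beacons[0]
    return sum((tdoa(position, b, ref) - m) ** 2
               for b, m in zip(beacons[1:], measured))

def best_candidate(candidates, beacons, measured):
    """Pick the candidate position whose expected TDOAs best match the measurements."""
    return min(candidates, key=lambda p: squared_error(p, beacons, measured))

# Hypothetical room with five beacons (coordinates in metres).
beacons = [(0, 0, 0), (5, 0, 0), (0, 4, 0), (5, 4, 3), (0, 0, 3)]
true_position = (1, 2, 1)
measured = [tdoa(true_position, b, beacons[0]) for b in beacons[1:]]
candidates = [(0, 0, 0), (1, 2, 1), (4, 3, 2), (2, 2, 2)]
print(best_candidate(candidates, beacons, measured))
```

A practical solver would replace the candidate list with an iterative least-squares search, and multipath distortion would appear as errors in `measured`.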

Sound recognition and tracking system design using robust sound extraction section (주변 배경음에 강인한 구간 검출을 통한 음원 인식 및 위치 추적 시스템 설계)

Kim, Woo-Jun;Kim, Young-Sub;Lee, Gwang-Seok
- The Journal of the Korea institute of electronic communication sciences
- v.11 no.8
- pp.759-766
- 2016
This paper presents the design of a system that, for sound sources occurring in abnormal situations, detects the sound-source section in a way that is robust to surrounding environmental sounds, and then recognizes the source and tracks its location using the signals within that section. To detect the section, the weighted average delta energy of short segments is computed from the received audio signal and passed through a low-pass filter, and a section robust to background sound is defined by comparing the filter output values. For recognition, the data of the detected section are used to build the information needed to recognize sound sources, with training and recognition carried out using an HMM (Hidden Markov Model), a traditional recognition method. For sound-source signals that include surrounding background sound, the recognition rate is 3.94% higher than that of the conventional approach, in which the section is detected from signal energy before HMM recognition. In addition, based on the recognition result, location estimation using the TDOA (Time Difference of Arrival) between the signals in the section shows 97.44% agreement with the angle of the actual source location.
https://doi.org/10.13067/JKIECS.2016.11.8.759
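The section-detection idea in this abstract (delta energy of short segments, low-pass filtering, then comparison of the output) can be sketched roughly in Python. This is a simplified illustration, not the paper's algorithm: it uses a plain absolute frame-to-frame energy change instead of the paper's weighted average, a one-pole smoother as the low-pass filter, and a fixed threshold. The frame length, `alpha`, and `threshold` are assumed values.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def smoothed_delta_energy(energies, alpha=0.5):
    """Absolute frame-to-frame energy change, passed through a one-pole low-pass filter."""
    smoothed = []
    state = 0.0
    previous = energies[0] if energies else 0.0
    for energy in energies:
        delta = abs(energy - previous)
        state = alpha * delta + (1 - alpha) * state  # simple low-pass smoothing
        smoothed.append(state)
        previous = energy
    return smoothed

def active_frames(samples, frame_len, threshold, alpha=0.5):
    """Indices of frames whose smoothed delta energy exceeds the threshold."""
    values = smoothed_delta_energy(frame_energies(samples, frame_len), alpha)
    return [i for i, v in enumerate(values) if v > threshold]

# Hypothetical signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
print(active_frames(signal, frame_len=4, threshold=0.4))
```

Because the detector responds to changes in energy rather than to its absolute level, a steady background sound contributes little to the output, which is the intuition behind robustness to ambient noise.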

Design and Implementation of A Location Positioning System based on ZigBee Tags in Apartment (ZigBee 태그기반 아파트 위치인식시스템 설계 및 구현)

So, Sun-Sup;Eun, Seong-Bae
- Journal of the Institute of Electronics Engineers of Korea TC
- v.44 no.10
- pp.13-19
- 2007
Location awareness is one of the key functions for building a U-city. Recently, many location-aware systems have been emerging for commercial application in large-scale apartment complexes under construction. As residents or cars carrying active tags move around the U-city complex, the active tags periodically broadcast their own identifiers, and routers fixed along the street or in buildings use that information to calculate their locations. Several issues must be considered in such an environment. In this paper we propose i) a new architecture for a location-aware system that takes these issues into account, ii) the technical issues in implementing it with active tags, and iii) a mathematical analytic model for investigating overall performance, which we verify by comparison with actual experimental results. Through mathematical analysis, we show that it is more efficient for the routers to send location signals than for the tags to do so. We also show that several additional services are available in the apartment complex. We conducted several experiments in a real parking lot to show that our system can locate residents or cars.

Speech Recognition Performance Improvement using Gamma-tone Feature Extraction Acoustic Model (감마톤 특징 추출 음향 모델을 이용한 음성 인식 성능 향상)

Ahn, Chan-Shik;Choi, Ki-Ho
- Journal of Digital Convergence
- v.11 no.7
- pp.209-214
- 2013
To improve the recognition performance of speech recognition systems, methods that incorporate human auditory abilities into the system have been used. In noisy environments, such methods separate the speech signal from the noise and select the desired speech signal. In practice, however, several factors degrade the performance of speech recognition systems: as the recognition environment changes because of noise, speech detection becomes inaccurate and the trained model no longer matches the input. In this paper, feature extraction using gammatone features and training with an acoustic model are proposed to improve speech recognition. In the proposed method, feature extraction based on auditory scene analysis, which reflects human auditory perception, is applied in the process of training the models for recognition. For performance evaluation in noisy environments, noise was removed from signals with -10 dB and -5 dB noise, and SNR improvements of 3.12 dB and 2.04 dB, respectively, were confirmed.
https://doi.org/10.14400/JDPM.2013.11.7.209

An Emotion Recognition and Expression Method using Facial Image and Speech Signal (음성 신호와 얼굴 표정을 이용한 감정인식 및 표현 기법)

Ju, Jong-Tae;Mun, Byeong-Hyeon;Seo, Sang-Uk;Jang, In-Hun;Sim, Gwi-Bo
- Proceedings of the Korean Institute of Intelligent Systems Conference
- 2007.04a
- pp.333-336
- 2007
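In this paper, speech signals and facial images, the modalities most widely used in emotion recognition, are used to recognize four emotions (joy, sadness, anger, and surprise), and the separately obtained recognition results are fused using a multimodal technique. For emotion recognition from facial images, feature vectors are extracted using Principal Component Analysis; for speech signals, acoustic features that exclude linguistic characteristics are used. The extracted features are each fed to a neural network to classify patterns by emotion, and the recognized results are applied to an emotion expression system to express the emotion.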

Noise Reduction for Korean Connected Digit Recognition through Telephone Channel (전화망 환경에서 한국어 숫자음 인식을 위한 잡음처리)

Kim Kyuhong;Kim Hoirin
- Proceedings of the KSPS conference
- 2003.05a
- pp.211-214
- 2003
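In general, speech recognition performance degrades under the influence of noise. Recognition of Korean connected digits over the telephone network is a difficult area of speech recognition, because coarticulation lowers the recognition rate and because the telephone channel distorts the spectral envelope and limits the bandwidth of the speech signal. In this paper, 2WF (2-stage Wiener Filter), SWP (SNR-dependent Waveform Processing), and CMN (Cepstrum Mean Normalization) are used to reduce the influence of noise. 2WF reduces not only overall additive noise but also dynamic additive noise while causing little distortion to the formant structure of the speech signal. SWP improves the overall SNR by emphasizing the parts of the speech waveform where the SNR is relatively high. CMN improves speech recognition performance by normalizing the influence of channel noise out of the feature vectors. Experiments with these methods on a telephone-network Korean connected-digit database showed that they reduce the influence of noise while minimizing distortion of the speech signal, improving digit recognition performance over the telephone network.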
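Of the three techniques named in this abstract, CMN is simple enough to illustrate with a short Python sketch. A stationary channel appears as an approximately constant offset in the cepstral domain, so subtracting the per-coefficient mean over an utterance removes much of it. The function name and the list-of-lists frame layout are assumptions for illustration, not details from the paper.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance.

    frames: list of equal-length lists, one cepstral feature vector per frame.
    """
    if not frames:
        return []
    n_frames = len(frames)
    means = [sum(column) / n_frames for column in zip(*frames)]
    return [[value - m for value, m in zip(frame, means)] for frame in frames]

# Hypothetical two-frame utterance with two cepstral coefficients per frame.
utterance = [[1.0, 4.0], [3.0, 8.0]]
print(cepstral_mean_normalize(utterance))
```

After normalization each coefficient has zero mean across the utterance, so a constant channel offset added to every frame leaves the output unchanged.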

Recognition Algorithm using MFCC Feature Parameter (MFCC 특징 파라미터를 이용한 인식 알고리즘)

Choi, Jae-seung
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- 2016.10a
- pp.773-774
- 2016
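Background noise distorts the features of the speech signal and therefore hinders improvement of the recognition rate of speech recognition systems. To perform speech recognition in environments with background noise, this paper proposes a continuous speech identification algorithm that uses a neural network and Mel-frequency cepstral coefficients. In the experiments, the algorithm is applied to speech signals mixed with background noise with the aim of improving the identification rate of speech recognition, and the experiments demonstrate that the algorithm is effective.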
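The Mel-frequency cepstral coefficients mentioned in this abstract rest on a warped frequency axis. The Python sketch below uses one widely used Hz-to-mel convention, mel = 2595 log10(1 + f/700), and shows how filter band edges can be spaced evenly in mel. The abstract does not state which convention or how many filters the author used, so the formula choice, the function names, and the example parameters are assumptions.

```python
import math

def hz_to_mel(hz):
    """Convert frequency in Hz to mel using a common log-based convention."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(low_hz, high_hz, n_filters):
    """Edge frequencies (Hz) for n_filters triangular bands equally spaced in mel."""
    low, high = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high - low) / (n_filters + 1)
    return [mel_to_hz(low + i * step) for i in range(n_filters + 2)]

# Hypothetical setup: 10 filters covering 0 to 4000 Hz.
for edge in mel_band_edges(0.0, 4000.0, 10):
    print(round(edge, 1))
```

The band edges are dense at low frequencies and sparse at high frequencies, which mirrors the resolution of human hearing that MFCC features aim to capture.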

Personal Recognition Method using Coupling Image of ECG Signal (심전도 신호의 커플링 이미지를 이용한 개인 인식 방법)

Kim, Jin Su;Kim, Sung Huck;Pan, Sung Bum
- Smart Media Journal
- v.8 no.3
- pp.62-69
- 2019
Electrocardiogram (ECG) signals cannot be counterfeited and can easily be acquired from both wrists. In this paper, we propose a method of generating a coupling image using the direction information of ECG signals, and its use in a personal recognition method. The proposed coupling image is generated from the forward ECG signal and the rotated inverse ECG signal based on the R-peak, and the generated coupling image shows a unique pattern and brightness. In addition, the amount of R-peak data is increased through computation on ECG signals of the same beat, which makes it possible to improve individual recognition performance. The proposed convolutional neural network extracts pattern and brightness characteristics from the generated coupling image and reduces data size with multiple pooling layers to improve network speed. The experiment uses public ECG data from 47 people and compares the five best-performing networks among public networks and the proposed network. Experimental results show that the proposed network achieves the highest recognition performance, 99.28%, confirming the potential of the method for personal recognition.
https://doi.org/10.30693/SMJ.2019.8.3.62

Search Result 1,784, Processing Time 0.029 seconds
