Integrated Search | Korea Science

Robust Audio Fingerprinting Method Using Prominent Peak Pair Based on Modulated Complex Lapped Transform

Kim, Hyoung-Gook;Kim, Jin Young
- ETRI Journal
- /
- Vol. 36, No. 6
- /
- pp.999-1007
- /
- 2014
The robustness of an audio fingerprinting system in an actual noisy environment is a major challenge for audio-based content identification. This paper proposes a high-performance audio fingerprint extraction method for use in portable consumer devices. In the proposed method, a salient audio peak-pair fingerprint, based on a modulated complex lapped transform, improves the accuracy of the audio fingerprinting system in actual noisy environments with low computational complexity. Experimental results confirm that the proposed method is quite robust in different noise conditions and achieves promising preliminary accuracy results.
https://doi.org/10.4218/etrij.14.0113.1405

멀티밴드 스펙트럼 차감법과 엔트로피 하모닉을 이용한 잡음환경에 강인한 분산음성인식 (Robust Distributed Speech Recognition under noise environment using MESS and EH-VAD)

최갑근;김순협
- 전자공학회논문지CI
- /
- Vol. 48, No. 1
- /
- pp.101-107
- /
- 2011
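The factors that most hinder the practical use of speech recognition are background noise and channel distortion. Noise generally degrades the performance of a speech recognition system and thereby severely restricts where it can be used. Speech recognition based on DSR (Distributed Speech Recognition) has difficulty improving performance because of the same problem. To improve the DSR-based recognition rate in noisy environments, this paper detects speech segments accurately and removes noise so that noise-robust features can be extracted. The proposed method detects speech segments using entropy and the harmonics of speech, and removes noise using multi-band spectral subtraction. Speech detection based on the entropy of the spectral energy performs well at relatively high SNR (15 dB), but the threshold between speech and non-speech shifts as the noise environment changes, which makes accurate detection difficult at low SNR (0 dB). To detect speech accurately even at low SNR (0 dB), this paper uses both the spectral entropy and the harmonic components of speech, removes noise according to the detected speech segments, and extracts noise-robust features. Experiments showed improved recognition performance under the recognition conditions of the tested noise environments.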
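As a rough illustration of the spectral-entropy cue described in this abstract, the sketch below computes the entropy of one frame's normalized power spectrum. The function names, the naive DFT (used only to keep the example free of dependencies), and the test signals are assumptions for illustration. The paper's harmonic measure, thresholds, and multi-band subtraction are not reproduced here.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum over bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec

def spectral_entropy(frame):
    """Shannon entropy (bits) of the power spectrum treated as a probability mass."""
    ps = power_spectrum(frame)
    total = sum(ps)
    if total == 0.0:
        return 0.0
    probs = [p / total for p in ps]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

# A pure tone concentrates its energy in one bin (low entropy);
# an impulse spreads its energy evenly over all bins (high entropy).
N = 64
tone = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]
impulse = [1.0] + [0.0] * (N - 1)
tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

A detector built on this cue would compare the per-frame entropy against a threshold. The abstract notes that such a threshold drifts with the noise conditions, which is why the authors combine entropy with harmonic information.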

차량에서의 음성인식율 향상을 위한 전처리 기법 (Preprocessing Technique for Improvement of Speech Recognition in a Car)

김현태;박장식
- 한국콘텐츠학회논문지
- /
- Vol. 9, No. 1
- /
- pp.139-146
- /
- 2009
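This paper proposes a modified spectral subtraction method suited to speech recognition in noisy environments with a low signal-to-noise ratio (SNR), such as automatic speech recognition systems in cars. Conventional spectral subtraction depends on the SNR: parts of the spectrum with low SNR are attenuated and parts with high SNR are emphasized. This design is adequate in high-SNR environments but very inadequate in low-SNR environments such as a car. For low-SNR noisy environments, the proposed method emphasizes speech-dominant regions to prevent the speech region from being over-subtracted unnecessarily. Experiments on a vocabulary of in-car voice commands confirmed that the proposed method outperforms the conventional method.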
https://doi.org/10.5392/JKCA.2009.9.1.139
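For readers unfamiliar with the baseline that this abstract criticizes, here is a minimal sketch of conventional magnitude spectral subtraction for a single frame. It is not the modified method proposed in the paper. The parameter names `alpha` (over-subtraction factor) and `beta` (spectral floor), and their default values, are hypothetical.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, beta=0.02):
    """Conventional magnitude spectral subtraction for one frame.

    noisy_mag : magnitude spectrum of the noisy frame
    noise_mag : estimated noise magnitude spectrum (e.g. averaged over non-speech frames)
    alpha     : over-subtraction factor (hypothetical default)
    beta      : spectral floor as a fraction of the noisy magnitude (hypothetical default)
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - alpha * n
        # Flooring avoids negative magnitudes when the noise estimate exceeds the observation.
        cleaned.append(max(diff, beta * y))
    return cleaned

# Illustrative magnitudes only; the third bin has the noise estimate above the observation.
noisy = [1.0, 5.0, 0.5, 3.0]
noise = [0.8, 1.0, 0.9, 1.0]
enhanced = spectral_subtract(noisy, noise)
```

In the third bin the subtraction collapses to the spectral floor. If that bin carried weak speech, the speech would be lost as well, which is the over-subtraction behavior at low SNR that the abstract sets out to avoid.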

초음파 센서를 이용한 AGV의 주행 환경 인식과 간단한 벽면 따르기 알고리즘 (Driving Environment Recognition and a Simple Wall-Following Algorithm for AGV Using Sonar Sensor)

김성중;이정웅;이창구
- 대한전기학회:학술대회논문집
- /
- Korean Institute of Electrical Engineers 2002 Summer Conference Proceedings D
- /
- pp.2337-2340
- /
- 2002
This paper presents a method for recognizing the moving environment (plane, corner, edge) of an AGV (automatic guided vehicle) using a SONAR sensor configuration. For SONAR sensors, the crosstalk effect has generally been considered an inevitable source of noise in indoor environments. However, this effect can serve as a clue for classifying and localizing targets indoors if it is controlled and used well. EERUF (error eliminating rapid ultrasonic firing) is a method for firing multiple ultrasonic sensors in mobile robot applications, and the multi-echo mode of the POLAROID device can reduce the crosstalk effect. Here, the crosstalk effect was reduced using EERUF, and the approach was applied to an AGV with a simple wall-following algorithm in an indoor environment. The method was tested on a typical AGV with multiple SONAR sensors in a laboratory environment.

반향음과 잡음 환경을 고려한 실시간 소리 추적 시스템 (Real-Time Sound Localization System For Reverberant And Noisy Environment)

기창돈;김강호;이택진
- 한국항공우주학회지
- /
- Vol. 38, No. 3
- /
- pp.258-263
- /
- 2010
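Sound-based localization collects signals with microphones, estimates the time difference of arrival between microphones from the collected signals, and then uses the estimated time differences to estimate the position of the sound source. Using this indoors requires robustness to noise and reverberation. Real-time implementation additionally requires computational efficiency. This paper uses four low-cost condenser microphones to pursue efficiency in both cost and computation. Using four microphones reduced the computation needed to obtain the inter-microphone time differences of arrival, the GCC-PHAT (Generalized Cross Correlation-Phase Transform) algorithm improved robustness, and an iterative least squares method yielded highly accurate position data.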
https://doi.org/10.5139/JKSAS.2010.38.3.258
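The pairwise time-difference step named in this abstract can be sketched with a textbook GCC-PHAT estimator. This is a generic illustration, not the authors' real-time system: the function names, the naive O(N^2) DFT (chosen so the example needs no third-party library), and the synthetic test signal are all assumptions. The four-microphone geometry and the iterative least squares position solver are not covered.

```python
import cmath
import math
import random

def dft(x):
    """Naive forward DFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def gcc_phat_delay(sig, ref, max_lag):
    """Estimate how many samples `sig` lags behind `ref` (positive = sig arrives later)."""
    n = len(sig) + len(ref)  # zero-pad to avoid circular wrap-around
    S = dft(list(sig) + [0.0] * (n - len(sig)))
    R = dft(list(ref) + [0.0] * (n - len(ref)))
    cross = [s * r.conjugate() for s, r in zip(S, R)]
    # PHAT weighting: discard the magnitude and keep only the phase of the cross-spectrum.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0.0 for c in cross]
    cc = [v.real for v in idft(weighted)]
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cc[lag % n])

# Synthetic check: the second channel is the first one delayed by three samples.
rng = random.Random(0)
burst = [rng.uniform(-1.0, 1.0) for _ in range(29)]
ref = burst + [0.0] * 3
sig = [0.0] * 3 + burst
estimated = gcc_phat_delay(sig, ref, max_lag=8)
```

Dividing the estimated lag by the sampling rate gives the time difference of arrival for one microphone pair. A localization system would feed several such differences into a position solver.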

LTE 시스템 채널 추정치의 후처리 기법 연구 (A Study on the Postprocessing of Channel Estimates in LTE System)

유경렬
- 전기학회논문지
- /
- Vol. 60, No. 1
- /
- pp.205-213
- /
- 2011
The Long Term Evolution (LTE) system is designed to provide a high-quality data service for fast-moving mobile users. It is based on Orthogonal Frequency Division Multiplexing (OFDM) and bases its channel estimation on training samples that are systematically embedded in the transmitted data. The training samples are distributed in either a preamble or a lattice pattern; the latter is better suited to multipath fading channels whose channel frequency response (CFR) fluctuates rapidly with time. In the lattice-type structure, the CFR is estimated with a least squares estimate (LSE) for each pilot sample, followed by interpolation in both the time and frequency domains to fill in the channel estimates for the subcarriers that carry data samples. All interpolation schemes rely on the pilot estimates only, so their performance is bounded by the quality of those estimates. However, additive noise gives rise to large fluctuations in the pilot estimates, especially in communication environments with a low signal-to-noise ratio. These fluctuations can be observed as alternating high values of the first forward differences (FFD) between pilot estimates. In this paper, we statistically analyze those FFD values and propose a postprocessing algorithm to suppress large fluctuations in the noisy pilot estimates. The proposed method is based on localized adaptive moving-average filtering. The performance of the proposed technique is verified in a multipath environment suggested in a 3GPP LTE specification. The mean-squared error (MSE) between the actual CFR and the pilot estimates is shown to be reduced by up to 68% relative to the noisy pilot estimates.
https://doi.org/10.5370/KIEE.2011.60.1.205
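The abstract gives no details of its localized adaptive moving-average filter, so the following is only a loose sketch of the general idea: compute first forward differences (FFD) between neighboring pilot estimates, flag the samples next to a large FFD, and replace only those with a local average. The fixed `threshold`, the window `half_width`, and the real-valued sample data are hypothetical simplifications; actual pilot estimates are complex-valued.

```python
def first_forward_differences(values):
    """FFD[i] = values[i + 1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]

def smooth_fluctuations(pilots, threshold, half_width=1):
    """Replace samples adjacent to a large FFD with a local moving average.

    `threshold` and `half_width` are hypothetical tuning parameters.
    """
    ffd = first_forward_differences(pilots)
    flagged = set()
    for i, d in enumerate(ffd):
        if abs(d) > threshold:
            flagged.update((i, i + 1))  # both ends of the large jump
    out = list(pilots)
    for i in flagged:
        lo = max(0, i - half_width)
        hi = min(len(pilots), i + half_width + 1)
        window = pilots[lo:hi]  # average over the original, unsmoothed samples
        out[i] = sum(window) / len(window)
    return out

# Illustrative data: a smooth response with one noisy spike at index 3.
pilots = [1.0, 1.1, 1.0, 2.0, 0.9, 1.0, 1.1]
ffd = first_forward_differences(pilots)
smoothed = smooth_fluctuations(pilots, threshold=0.5)
```

The spike produces two consecutive large FFD values of opposite sign, the alternating pattern the abstract mentions. Samples away from the spike pass through unchanged, which is what makes the filtering localized.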

DAECNN 기반의 병원처방전 이미지잡음제거 (Image Denoising Methods based on DAECNN for Medication Prescriptions)

홍고르출;이상무;김용기;김미혜
- 한국융합학회논문지
- /
- Vol. 10, No. 5
- /
- pp.17-26
- /
- 2019
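This study focuses on an ROI extraction method for denoising images of prescriptions stored with a smartphone, with the aim of building an allergy prevention system for patients. Current ROI extraction performs well in restricted experimental settings but poorly in real environments because of noise. This study therefore proposes a method for removing the noise that arises in smartphone images so that the ROI can be extracted with high accuracy. The SMF, DIN, DAE, DAECNN (Denoising Autoencoder with Convolution Neural Network), and median filter with DAECNN (MF+DAECNN) methods were tested, and the results showed that DAECNN and MF+DAECNN remove noise from smartphone images effectively. SSIM, PSNR, and MSE were used to verify the performance improvement. The system was implemented and tested with OpenCV, C++, and Python. In performance tests on real images, the proposed DAECNN and MF+DAECNN each reached 69% in removing natural noise, a relatively higher result than the 55% of the existing DAE method.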
https://doi.org/10.15207/JKCS.2019.10.5.017
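Two of the evaluation metrics named in this abstract, MSE and PSNR, have standard definitions that can be sketched in a few lines. The helper names, the 8-bit peak value of 255, and the tiny sample images are assumptions for illustration. The paper's own evaluation code and its SSIM computation are not shown.

```python
import math

def mse(img_a, img_b):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    total, count = 0.0, 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(img_a, img_b)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

# Illustrative 2x2 images: two pixels are off by 10 gray levels.
clean = [[100, 100], [100, 100]]
noisy = [[110, 90], [100, 100]]
error = mse(clean, noisy)
quality_db = psnr(clean, noisy)
```

A lower MSE and a higher PSNR against a clean reference both indicate better denoising, so the two metrics move in opposite directions.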

A Personal Sound Amplification Product Compared to a Basic Hearing Aid for Speech Intelligibility in Adults with Mild-to-Moderate Sensorineural Hearing Loss

Choi, Ji Eun;Kim, Jinryoul;Yoon, Sung Hoon;Hong, Sung Hwa;Moon, Il Joon
- Journal of Audiology & Otology
- /
- Vol. 24, No. 2
- /
- pp.91-98
- /
- 2020
Background and Objectives: This study aimed to compare functional hearing with the use of a personal sound amplification product (PSAP) or a basic hearing aid (HA) among sensorineural hearing impaired listeners. Subjects and Methods: Nineteen participants with mild-to-moderate sensorineural hearing loss (SNHL) (26-55 dB HL; pure-tone average, 0.5-4 kHz) were prospectively included. No participants had prior experience with HAs or PSAPs. Audiograms, speech intelligibility in both quiet and noisy environments, speech quality, and preference were assessed in three different listening conditions: unaided, with the HA, and with the PSAP. Results: The use of PSAP was associated with significant improvement in pure-tone thresholds at 1, 2, and 4 kHz compared to the unaided condition (all p<0.01). In the quiet environment, speech intelligibility was significantly improved after wearing a PSAP compared to the unaided condition (p<0.001), and this improvement was better than the result obtained with the HA. The PSAP also demonstrated similar improvement in the most comfortable levels compared to those obtained with the HA (p<0.05). However, there was no significant improvement of speech intelligibility in a noisy environment when wearing the PSAP (p=0.160). There was no significant difference in the reported speech quality produced by either device or in participant preference for the PSAP or HA. Conclusions: The current result suggests that PSAPs provide considerable benefits to speech intelligibility in a quiet environment and can be a good alternative to compensate for mild-to-moderate SNHL.
https://doi.org/10.7874/jao.2019.00367

Spectral Subtraction Using Spectral Harmonics for Robust Speech Recognition in Car Environments

Beh, Jounghoon;Ko, Hanseok
- The Journal of the Acoustical Society of Korea
- /
- Vol. 22, No. 2E
- /
- pp.62-68
- /
- 2003
This paper presents a novel noise-compensation scheme that addresses the mismatch between training and testing conditions for automatic speech recognition (ASR) systems, specifically in car environments. Conventional spectral subtraction schemes rely on the signal-to-noise ratio (SNR): attenuation is imposed on the parts of the spectrum that appear to have low SNR, and the parts with high SNR are accentuated. However, these schemes postulate that the power spectrum of noise is generally lower in magnitude than that of speech. While this postulate is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as a car environment. This paper proposes an efficient spectral subtraction scheme aimed specifically at low-SNR noisy environments, which extracts the harmonics distinctively in the speech spectrum. Representative experiments confirm the superior performance of the proposed method over conventional methods. The experiments are conducted using car-noise-corrupted utterances from the Aurora2 corpus.

389 search results; processing time 0.022 s
