Search | Korea Science

Implementation of the automatic switching device for the voice communications between heterogeneous devices (이종 기기 간 음성통신을 위한 자동전환장치의 구현)

Lew, Chang-Guk;Lee, Bae-Ho
- The Journal of the Korea institute of electronic communication sciences
- /
- v.10 no.12
- /
- pp.1321-1328
- /
- 2015
A radio is a half-duplex voice communication device that uses PTT (Push To Talk) and occupies a single channel during transmission. To interface a telephone with UHF and VHF radios for voice communication between heterogeneous devices, a device that switches automatically between the two is required. The performance of the voice switching apparatus, which detects the voice to be transmitted from the input signal, therefore has a significant influence on how much of the transmitted audio signal is lost. Conventional methods set a threshold simply on the amplitude of the input signal, in other words its energy level, and so have the problem of responding to noise. This paper uses audio signal processing techniques to discriminate voice within the input signal and implements a device for automatic voice transmission between heterogeneous devices. The proposed approach was confirmed to improve the performance of the automatic voice switching device and to enable lossless transmission of voice between heterogeneous devices.
https://doi.org/10.13067/JKIECS.2015.10.12.1321
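
The conventional energy-level switching that the abstract above criticizes can be sketched as follows. This is only an illustrative baseline, not the authors' device: the frame length, the threshold, and the names `frame_energies` and `energy_gate` are assumptions chosen for the example.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def energy_gate(samples, frame_len=80, threshold=0.01):
    """True for every frame whose energy exceeds a fixed threshold.

    frame_len and threshold are hypothetical values for illustration.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]

# Three 80-sample frames at 8 kHz: silence, a 440 Hz tone, a short loud click.
rate = 8000
silence = [0.0] * 80
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(80)]
click = [0.0] * 80
click[10:14] = [0.9, -0.9, 0.9, -0.9]

decisions = energy_gate(silence + tone + click)
print(decisions)  # [False, True, True]
```

The click frame opens the gate just as the tone does, which is the weakness of a pure energy threshold: it cannot tell voice from loud noise.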

Auditory Representations for Robust Speech Recognition in Noisy Environments (잡음 환경에서의 음성 인식을 위한 청각 표현)

Kim, Doh-Suk;Lee, Soo-Young;Kil, Rhee-M.
- The Journal of the Acoustical Society of Korea
- /
- v.15 no.5
- /
- pp.90-98
- /
- 1996
An auditory model is proposed for robust speech recognition in noisy environments. The model consists of cochlear bandpass filters and nonlinear stages, and represents frequency and intensity information efficiently even in noisy environments. Frequency information of the signal is obtained by zero-crossing intervals, and intensity information is also incorporated by peak detectors and saturating nonlinearities. Also, the robustness of the zero-crossings in estimating frequency is verified by the developed analytic relationship of the variance of the level-crossing interval perturbations as a function of the crossing level values. The proposed auditory model is computationally efficient and free from many unknown parameters compared with other auditory models. Speaker-independent speech recognition experiments demonstrate the robustness of the proposed method.
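
As a rough illustration of obtaining frequency from zero-crossing intervals, the sketch below estimates the frequency of a single channel from the mean spacing of its upward zero crossings. It is a simplified single-channel example, not the proposed auditory model, which applies the idea to cochlear bandpass filter outputs; the function names and the test tone are assumptions.

```python
import math

def upward_zero_crossings(samples, rate):
    """Times in seconds of upward zero crossings, linearly interpolated."""
    times = []
    for n in range(1, len(samples)):
        a, b = samples[n - 1], samples[n]
        if a < 0.0 <= b:
            frac = -a / (b - a)  # position of the crossing between n-1 and n
            times.append((n - 1 + frac) / rate)
    return times

def frequency_from_intervals(samples, rate):
    """Estimate frequency as the inverse of the mean crossing interval."""
    times = upward_zero_crossings(samples, rate)
    if len(times) < 2:
        return None  # not enough crossings to form an interval
    mean_interval = (times[-1] - times[0]) / (len(times) - 1)
    return 1.0 / mean_interval

rate = 16000
tone = [math.sin(2 * math.pi * 500 * n / rate + 0.3) for n in range(1600)]
estimate = frequency_from_intervals(tone, rate)
print(round(estimate, 1))
```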

A study of the estimation for sound property in the classroom (강의실에서의 음향특성 평가에 관한 연구)

Lee, Chai-Bong
- Journal of the Institute of Convergence Signal Processing
- /
- v.8 no.1
- /
- pp.32-38
- /
- 2007
To characterize the acoustic environment of a classroom, we measured the impulse response with and without a PA (public address) system. We calculated physical acoustic indices and examined the differences between the two cases. The improvement in listening provided by the PA was also studied through speech articulation tests using the measured impulse responses. We found that clarity is enhanced by increasing the sound pressure level when the reverberation is short, but not when the reverberation time is long.

A Study on the Automatic Howling Signal Detection Algorithm for Speech Sound Reinforcement (음성 확성을 위한 하울링 신호 자동 검출기법 연구)

Kim, Kyung-Taek;Kim, Dong-Gyu;Roh, Yong-Wan;Hong, Kwang-Seok
- Proceedings of the Korea Institute of Convergence Signal Processing
- /
- 2005.11a
- /
- pp.246-249
- /
- 2005
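In a sound system, howling is a major factor that degrades speech intelligibility because it limits the achievable speech level. Since the usual remedy is to lower the gain in the howling frequency band and thereby minimize acoustic feedback, finding the howling frequency is the key element of howling control. This paper therefore presents a technique for detecting the howling frequency automatically. The degree to which an incoming audio signal satisfies the characteristics of a howling signal is defined as a parameter called the 'howling index'. This index is used to decide whether howling has occurred, and the frequency with the maximum amplitude in a signal judged to be howling is output as the howling frequency. To verify the technique, an automatic howling detection program was built and tested, and it detected more than 95% of all howling signals.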
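
The decision flow described above (score the frame, decide whether it is howling, report the frequency of maximum amplitude) might be sketched as below. The abstract does not define the howling index, so the peak-to-mean spectral ratio used here is a stand-in assumption, as are `detect_howling`, `min_ratio`, and the test signals.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def detect_howling(frame, rate, min_ratio=10.0):
    """Return (is_howling, peak_frequency_hz, ratio).

    ratio = peak magnitude / mean magnitude is a hypothetical stand-in
    for the paper's howling index; min_ratio is an arbitrary threshold.
    """
    mags = magnitude_spectrum(frame)
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    mean_mag = sum(mags) / len(mags)
    ratio = mags[peak_bin] / mean_mag if mean_mag > 0 else 0.0
    return ratio >= min_ratio, peak_bin * rate / len(frame), ratio

rate, n = 8000, 256
# A sustained 1 kHz tone (exactly on DFT bin 32) versus a single click.
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(n)]
click = [1.0] + [0.0] * (n - 1)

is_howling, freq, ratio = detect_howling(tone, rate)
print(is_howling, freq)                # True 1000.0
print(detect_howling(click, rate)[0])  # False
```

A narrowband, sustained tone concentrates its energy in one bin and scores high, while the click has a flat spectrum and scores 1.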

Time domain Filtering of Image for Lip-reading Enhancement (시간영역 이미지 필터링에 의한 립리딩 성능 향상)

Lee Jeeeun;Kim Jinyoung;Lee Joohun
- Proceedings of the Acoustical Society of Korea Conference
- /
- autumn
- /
- pp.45-48
- /
- 2001
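Lip-reading has been studied as a form of bimodal speech recognition that uses visual information to improve speech recognition performance in noisy environments [1][2]. As part of that work, lip-reading based on visual information has already been implemented. However, the systems built so far are not robust to changes in the environment. In this paper, an image-based lip-reading method is applied to locate the lip region more reliably and improve performance. Because this method must process a large amount of data, preprocessing is required. The preprocessing consists of converting the input image to gray level, folding the lip image in half, and principal component analysis (PCA). In addition, to improve recognition performance, a RASTA (Relative Spectral) filter, which is effective for noise removal and analysis-synthesis in speech, is applied to remove components that vary slowly or change abruptly in the time domain, as well as other noise. As a result, a high recognition rate of $72.7\%$ was obtained.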

On a Research of Improving the Performance of Voice Activity Detector in G.723.1 (G.723.1 음성 활동 검출 장치 성능 향상에 관한 연구)

JANG KyungA;KIM JeongJin;Chang YoungOh;HONG SeongHoon;BAE MyungJin
- Proceedings of the Acoustical Society of Korea Conference
- /
- autumn
- /
- pp.53-56
- /
- 1999
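The G.723.1 speech coder, developed by the ITU-T international standardization body for Internet telephony and videoconferencing, uses a VAD (Voice Activity Detector) and a CNG (Comfort Noise Generator) to lower the bit rate during noise intervals. Of these, the VAD ultimately decides whether speech is active by comparing the energy level of the current frame. However, to make its decisions more stable, the G.723.1 VAD classifies almost all silence intervals inserted between active speech segments as active speech. This paper therefore proposes a method that judges silence intervals more accurately and so reduces the bit rate further than the existing method. In experiments using sentences with lengthened silence intervals, the bit rate was reduced by about $46.8\%$ on average, and subjective quality evaluation showed almost no degradation in speech quality.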

Improvement of VAD Performance for the Reduction of the Bit Rate Under the Noise Environment in the G.723.1 (잡음 환경에서의 전송률 감소를 위한 G.723.1 음성활동 검출기 성능 개선에 관한 연구)

김정진;장경아;배명진
- The Journal of the Acoustical Society of Korea
- /
- v.20 no.3
- /
- pp.42-47
- /
- 2001
This paper improves the performance of the VAD (Voice Activity Detector) in the G.723.1 Annex A 6.3 kbps/5.3 kbps dual-rate speech coder, which was developed for Internet telephony and videoconferencing. The VAD decision is based on a three-level energy threshold. We evaluate processing time, speech quality, and bit rate. Processing time is reduced owing to the more accurate VAD decision in silence periods. In a subjective quality test there is almost no difference compared with G.723.1. To measure the bit rate we count the active speech frames (VAD=1), and the bit rate is reduced further the more silence periods appear.

A Study on Numeral Speech Recognition Using Integration of Speech and Visual Parameters under Noisy Environments (잡음환경에서 음성-영상 정보의 통합 처리를 사용한 숫자음 인식에 관한 연구)

Lee, Sang-Won;Park, In-Jung
- Journal of the Institute of Electronics Engineers of Korea CI
- /
- v.38 no.3
- /
- pp.61-67
- /
- 2001
This paper proposes a method that applies the LP algorithm to images for speech recognition, using both speech and image information to recognize Korean numeral speech. The input speech signal is pre-emphasized with a coefficient of 0.95 and analyzed for B th LP coefficients using a Hamming window, autocorrelation, and the Levinson-Durbin algorithm. A gray image signal is likewise analyzed for 2-dimensional LP coefficients using autocorrelation and the Levinson-Durbin algorithm. These parameters are used as inputs to a neural network trained with the back-propagation algorithm. The recognition experiment was carried out at each noise level, and recognition of three numerals, '3', '5', and '9', was improved. Thus, recognizing speech with 2-dimensional LP parameters yields a high recognition rate, a small parameter size, and a simple algorithm that needs no additional feature extraction.
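
The one-dimensional speech analysis chain named above (pre-emphasis with 0.95, Hamming window, autocorrelation, Levinson-Durbin) can be sketched as follows. This is a textbook sketch rather than the authors' code: the LP order is not legible in the abstract, so `ORDER = 2` is a hypothetical value, and the synthetic frame and function names are assumptions. The 2-dimensional image analysis is not covered.

```python
import math

def pre_emphasize(x, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def hamming(n_points):
    """Hamming window coefficients."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for a[1..order] in the predictor x[n] ~ sum(a[k] * x[n-k]).

    Returns (coefficients, final prediction error).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err

# Synthetic 30 ms frame at 8 kHz (two sinusoids), standing in for speech.
rate = 8000
frame = [math.sin(2 * math.pi * 300 * n / rate)
         + 0.5 * math.sin(2 * math.pi * 1200 * n / rate) for n in range(240)]

ORDER = 2  # hypothetical: the order in the abstract is not legible
emphasized = pre_emphasize(frame)
windowed = [s * w for s, w in zip(emphasized, hamming(len(emphasized)))]
coeffs, error = levinson_durbin(autocorrelation(windowed, ORDER), ORDER)
print(coeffs, error)
```

As a sanity check, feeding the recursion the exact autocorrelation of a first-order process, `levinson_durbin([1.0, 0.9, 0.81], 2)`, gives coefficients close to `[0.9, 0.0]`.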

Comparison of acoustics performance measurement and evaluation standard of office space and office acoustics criteria of European countries (사무공간의 음향성능 측정, 평가 방법의 표준화와 유럽 국가들의 음향성능 기준 비교)

Jeong-Ho Jeong
- The Journal of the Acoustical Society of Korea
- /
- v.42 no.2
- /
- pp.133-142
- /
- 2023
The office environment is changing with work types, advances in information technology (IT), and the coronavirus disease 2019 (COVID-19) situation. For office users to work comfortably and efficiently, both individual privacy and easy communication among members must be secured. In Korea, demand for better acoustic performance in office spaces is also growing, but the related performance criteria and guidelines have not been established. This study compares and reviews the standardization of measurement and evaluation methods for office acoustic performance and the acoustic performance criteria of European countries. It proposes comprehensively reviewing international standardization trends and each country's acoustic performance standards, and establishing and using criteria for evaluating the acoustic performance and satisfaction of office spaces in Korea through our survey. Considering the direction of international standardization and compatibility with communication and public address (PA) systems, it is appropriate to establish criteria using the Speech Transmission Index (STI) or an index that applies the STI. Such criteria would be highly usable and compatible. In addition, since the office furniture industry is interested in improving the acoustic performance of office space, a labelling system for the speech level reduction of office furniture needs to be established.
https://doi.org/10.7776/ASK.2023.42.2.133

Korean Digit Recognition Under Noise Environment Using Spectral Mapping Training (스펙트럼사상학습을 이용한 잡음환경에서의 한국어숫자음인식)

Lee, Ki-Young
- The Journal of the Acoustical Society of Korea
- /
- v.13 no.3
- /
- pp.25-32
- /
- 1994
This paper presents a Korean digit recognition method for noisy environments that uses spectral mapping training based on a static supervised adaptation algorithm. In the presented method, spectra are mapped from the space of noisy speech to the space of noise-free speech, which reduces the spectral distortion of noisy speech. The recognition rate is higher than that of the conventional method using VQ (vector quantization) and DTW (dynamic time warping) without noise processing, and even at an SNR of 0 dB it is 10 times that of the conventional method. This confirms that spectral mapping training can improve recognition performance for speech in noisy environments.
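
For readers unfamiliar with the DTW component of the conventional baseline mentioned above, a minimal sketch is given here. It shows generic template matching with dynamic time warping only; the spectral mapping training itself is not reproduced, and the toy one-dimensional features, the template labels, and the function names are assumptions.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    inf = float("inf")
    rows, cols = len(a), len(b)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean distance
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # step both
    return cost[rows][cols]

def classify(utterance, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))

# Toy one-dimensional "features"; real systems would use spectral vectors.
templates = {
    "rising": [(1.0,), (2.0,), (3.0,)],
    "falling": [(3.0,), (2.0,), (1.0,)],
}
slow_rising = [(1.0,), (2.0,), (2.0,), (3.0,)]

print(dtw_distance(slow_rising, templates["rising"]))  # 0.0
print(classify(slow_rising, templates))                # rising
```

The slower utterance still matches its template at zero cost because DTW allows a template frame to be repeated along the alignment path.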

