Search | Korea Science

Development of Automative Loudness Control Technique based on Audio Contents Analysis using Deep Learning (딥러닝을 이용한 오디오 콘텐츠 분석 기반의 자동 음량 제어 기술 개발)

Lee, Young Han;Cho, Choongsang;Kim, Je Woo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.42-43 / 2018
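Since the 2016 revision of the Broadcasting Act, digital broadcast programs in Korea have been delivered with loudness matched across channels and programs, using the measurement methods proposed by ITU-R / EBU. In general, except for content such as news or live relays, where loudness must be adjusted in real time, programs are transmitted with their average loudness adjusted to the regulation. This paper proposes a technique for improving the intelligibility of low-loudness passages, a problem that arises when the average loudness is adjusted uniformly. Specifically, as one broadcast loudness control technique, the audio content is analyzed and the amount of loudness adjustment is varied by segment, so that speech in low-loudness passages is given relatively higher loudness and background music and similar content relatively lower loudness, thereby improving intelligibility. To verify the performance of the proposed method, we measured the accuracy of the audio content analysis and analyzed audio waveforms, and confirmed that, compared with existing loudness control techniques, the proposed method amplifies the loudness of speech segments.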

A Study on the to Shorten of Early Decay Time in the Reverberation Curve Using MINT (MINT법을 이용한 실내 잔향곡선의 초기감쇠시간 단축에 관한 연구)

차경환
- The Journal of the Acoustical Society of Korea / v.21 no.1 / pp.37-41 / 2002
In this paper, we shorten the early decay time (EDT) of a room reverberation curve using multiple channels. The speech signal was inverse filtered in full-band and sub-band configurations based on MINT, and the multiple-channel adaptive filters used the LMS (Least Mean Square) and NLMS (Normalized Least Mean Square) algorithms. In the experiments, we obtained a time reduction of 1/3 at the 20 dB level of the reverberation curve using full-band NLMS with two microphones. In addition, subjective assessments using a real room impulse response showed that the test listeners reported an 80% improvement in speech articulation for speech whose EDT had been shortened by MINT.
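As a rough illustration of the NLMS update named in this abstract, the following is a minimal single-channel sketch in plain Python. It is not the paper's multiple-channel MINT inverse-filtering setup: the function name `nlms`, the step size `mu`, the regularizer `eps`, and the toy 3-tap system are all assumptions made for illustration.

```python
import random

def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt FIR weights w so that the filtered input x tracks d (NLMS rule).

    mu and eps are illustrative defaults, not values from the paper.
    Returns the final weights and the per-sample a-priori error.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                    # recent inputs, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = dn - y                                    # a-priori error
        norm = sum(b * b for b in buf) + eps          # input energy in window
        step = mu * e / norm                          # normalized step
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors

# Toy check (hypothetical data): identify a known 3-tap system from noise.
random.seed(0)
true_h = [0.6, -0.3, 0.1]
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(true_h[k] * x[n - k] for k in range(3) if n - k >= 0)
     for n in range(len(x))]
w, errors = nlms(x, d, num_taps=3)
```

Dividing the step by the input energy is what distinguishes NLMS from plain LMS: the effective step size stays stable when the input level changes, which matters for speech.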

Comparison of the Korean and Chinese Speech Intelligibility with Increasing Sound Absorption in a Classroom (강의실의 실내흡음력 증가에 따른 한국어 및 중국어의 음성요해도 비교)

Ding, Wei;Park, Chan-Jae;Haan, Chan-Hoon
- The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.129-141 / 2012
The present study investigates how physical sound clarity measures (D50, STI) affect the subjective speech intelligibility of both Korean and Chinese when sound absorption in a classroom is increased. To this end, sound measurements were undertaken in a classroom with and without absorption materials. Speech intelligibility tests were also conducted with Korean and Chinese students using their native languages. The results show that both sound clarity and speech intelligibility improved with increasing sound absorption. Chinese speech intelligibility improved more than Korean for the same added sound absorption, which is attributed to differences in the phonetic characteristics of the two languages. Correlation analysis between physical sound clarity and subjective speech intelligibility showed that D50 is highly correlated with both Korean (0.696) and Chinese (0.707) intelligibility. STI was also highly correlated with Korean (0.651) and Chinese (0.665) intelligibility. Thus, it can be concluded that D50 and STI have significant correlations with speech intelligibility.
https://doi.org/10.7776/ASK.2012.31.3.129
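For readers unfamiliar with D50 (Definition), the sketch below computes it as the ratio of the energy in the first 50 ms of a room impulse response to the total energy. The exponentially decaying impulse response, the helper name `synthetic_ir`, the reverberation times, and the 8 kHz sample rate are hypothetical choices for illustration and are not taken from the study's classroom measurements; the response is assumed to start at the arrival of the direct sound.

```python
import math

def d50(impulse_response, sample_rate):
    """D50: energy in the first 50 ms divided by the total energy."""
    n_early = int(round(0.050 * sample_rate))
    early = sum(s * s for s in impulse_response[:n_early])
    total = sum(s * s for s in impulse_response)
    return early / total

def synthetic_ir(rt60, sample_rate=8000, duration=2.0):
    """Toy impulse response: amplitude envelope falls 60 dB after rt60 seconds."""
    decay = 3.0 * math.log(10.0) / rt60       # amplitude decay rate in 1/s
    n = int(duration * sample_rate)
    return [math.exp(-decay * i / sample_rate) for i in range(n)]

# Hypothetical rooms: a longer and a shorter reverberation time.
d50_reverberant = d50(synthetic_ir(rt60=1.0), 8000)
d50_absorptive = d50(synthetic_ir(rt60=0.5), 8000)
```

In this toy model the shorter reverberation time yields the larger D50, which is consistent with the abstract's finding that sound clarity improves as absorption increases.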

Low-cost implementation of text to speech(TTS) system for car navigation (Car Navigation용 음성합성시스템 최저가 구현)

Na Ji Hoon;Sung Jung Mo;Yang Yoon Gi
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.141-144 / 2000
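As data services over wireless communication networks have recently become widely available, supplementary information services such as location information and traffic conditions are being offered for mobile stations (MS). When the mobile station is a vehicle such as a car, having the user read displayed text reduces driving safety and is therefore impractical. This calls for a text-to-speech (TTS) converter that turns text into speech. In this paper, we built the hardware for an "unlimited-vocabulary Korean speech synthesizer" for car navigation as a low-cost system, using a low-cost DSP chip (ADSP-2185) and a small 4 Mbit ROM. The real-time Korean speech synthesizer developed in this study can be used as a low-cost communication terminal, but the joins between demisyllables were often imperfect. Intelligibility was nevertheless relatively good for syllables without a final consonant.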

The Literature Review of Speech Intelligibility in Congenitally Deafened Children with Cochlear Implantation (선천성 청각장애 아동의 와우이식 후 말 명료도에 관한 문헌 고찰)

Yoon Misun
- MALSORI / no.47 / pp.141-151 / 2003
The speech intelligibility of congenitally deafened children changes after cochlear implantation. Predictors of this change include the age at implantation, the duration of implant use, and the communication mode. Among these factors, the age at implantation seems to be one of the most important predictors. However, these factors, including the age at implantation, explain only part of the variance. Therefore, further study is needed to identify the factors that affect speech intelligibility.

Development of an Electrolarynx Controlled by EMG (근전위 제어형 전기 인공후두의 시작)

민혜정;봉정표;최홍식;윤형로
- Proceedings of the KSLP Conference / 1996.11a / pp.91-91 / 1996
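With the electrolarynges currently on the market, the user must hold the device against the neck by hand and operate switches with the fingers to change the intensity and pitch of the sound. Controlling these well during actual conversation is nearly impossible, so sound quality and intelligibility are poor, and it is also difficult to phonate freely when intending to speak. In addition, one hand must always be devoted to the electrolarynx during conversation. To overcome these drawbacks, this study built an electrolarynx controlled by the electromyographic (EMG) signal of the sternohyoid muscle and evaluated its performance. (abridged)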

On the Development of Monosyllable Lists for Articulation Tests (명료도 평가용 단음절 목록의 개발)

Kim, Jeong-Hwan;Kang, Seong-Hoon;Jang, Dae-Young;Kim, Cheon-Duck
- The Journal of the Acoustical Society of Korea / v.13 no.4 / pp.69-76 / 1994
In this study, we developed monosyllable lists for Korean articulation tests. We sampled 103,581 colloquial monosyllables, applied five selection rules based on Korean linguistic characteristics, and finally constructed five different lists of fifty monosyllables each. A validity test using monaural impairment factors such as S/N ratio and cut-off frequency showed that articulation scores changed systematically with the level of the impairment factors. In addition, we investigated the effect of the azimuth of a single competing sound source on articulation scores. The syllables were always reproduced by a loudspeaker in front of the subject, while Hoth noise (-5 dB/oct) was reproduced by a loudspeaker at varying azimuths around the subject. The results indicated that articulation depended on the azimuth of the competing sound source and that no significant differences among the lists were found in any experimental condition.

Effects of breathing training in melodic intonation therapy on articulation intelligibility of aphasics: pilot study (멜로디 억양 치료에서 실어증 환자의 조음 명료도에 대한 호흡 훈련 효과: 초기 실험)

Kim, Seon Sik;Hong, Geum Na;Choi, Min Joo
- The Journal of the Acoustical Society of Korea / v.35 no.4 / pp.319-329 / 2016
The present study tested whether breathing training in melodic intonation therapy (MIT) improved the articulation intelligibility of Broca's aphasics. The experimental group underwent breathing training (2 stages) preceding the MIT. To evaluate the efficacy of the MIT intervention, the VOT (Voice Onset Time), the TD (Total Delay), the voice sound intensity, and the expiratory volume of the subjects, all closely associated with articulation intelligibility, were measured before and after the intervention. In the experimental group after the MIT intervention, the VOT and TD increased for the bilabial /p/, the alveolar /t/, and the velar /k/ (p < 0.05), but no significant differences were found for the affricate /c/ and the fricative /s/ (p > 0.05). In the control group, no significant increases in the VOT and TD were observed at any articulation point (p > 0.05). The voice sound intensity, which influences verbal articulation, increased in the experimental group after the intervention (p < 0.05), whereas no significant changes were observed in the control group. In conclusion, breathing training in the MIT was found to improve the articulation intelligibility of Broca's aphasics.
https://doi.org/10.7776/ASK.2016.35.4.319

A Post-processing for Binary Mask Estimation Toward Improving Speech Intelligibility in Noise (잡음환경 음성명료도 향상을 위한 이진 마스크 추정 후처리 알고리즘)

Kim, Gibak
- Journal of Broadcast Engineering / v.18 no.2 / pp.311-318 / 2013
This paper deals with a noise reduction algorithm that uses binary masking in the time-frequency domain. To improve speech intelligibility in noise, noise-masked speech is decomposed into time-frequency units, and mask "0" is assigned to masker-dominant regions, removing the time-frequency units where noise dominates speech. In previous research, Gaussian mixture models were used to classify speech-dominant and noise-dominant regions, which correspond to mask "1" and mask "0", respectively. In each frequency band, data were collected and used to train the Gaussian mixture models, and a detection procedure was applied to the test data to decide whether each time-frequency unit belongs to the speech-dominant or the noise-dominant region. In this paper, we consider the correlation of masks in the frequency domain and propose a post-processing method that exploits the Viterbi algorithm.
https://doi.org/10.5909/JBE.2013.18.2.311
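To make the idea of Viterbi post-processing across frequency bands concrete, here is a minimal two-state sketch. The abstract does not give the paper's model, so everything specific here is an assumption: the per-band scores `p_speech` (standing in for classifier outputs such as those from the GMMs), the symmetric transition probability `p_stay`, and the function name `viterbi_smooth` are hypothetical.

```python
import math

def viterbi_smooth(p_speech, p_stay=0.8):
    """Most likely 0/1 mask sequence across the frequency bands of one frame.

    p_speech[k] scores mask "1" in band k and 1 - p_speech[k] scores mask "0".
    p_stay is a hypothetical probability that adjacent bands share a mask.
    """
    def log(v):
        return math.log(max(v, 1e-12))        # guard against log(0)

    trans = [[log(p_stay), log(1 - p_stay)],
             [log(1 - p_stay), log(p_stay)]]
    emit = [[log(1 - p), log(p)] for p in p_speech]
    score = [emit[0][s] + log(0.5) for s in (0, 1)]   # uniform prior
    back = []
    for k in range(1, len(p_speech)):
        new_score, ptr = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda r: score[r] + trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + trans[best_prev][s] + emit[k][s])
        score = new_score
        back.append(ptr)
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):                # trace the best path backwards
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical per-band scores for one time frame.
p = [0.9, 0.8, 0.45, 0.85, 0.9, 0.1, 0.2, 0.15]
independent = [int(v > 0.5) for v in p]       # band-by-band decision
smoothed = viterbi_smooth(p)                  # decision using neighbor bands
```

With these toy scores, the band-by-band decision produces an isolated "0" at the third band, while the Viterbi path keeps that band at "1" because its neighbors are confidently speech-dominant.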

Comparison of Speech Intelligibility & Performance of Speech Recognition in Real Driving Environments (자동차 주행 환경에서의 음성 전달 명료도와 음성 인식 성능 비교)

Lee Kwang-Hyun;Choi Dae-Lim;Kim Young-Il;Kim Bong-Wan;Lee Yong-Ju
- MALSORI / no.50 / pp.99-110 / 2004
The normal transmission characteristics of sound are hard to obtain in a running car because of various noises and structural factors. The resulting channel distortion of the source sound recorded by microphones seriously degrades speech recognition performance in real driving environments. In this paper, we analyze the degree of intelligibility under various channel distortion conditions at different driving speeds in terms of the speech transmission index (STI) and compare the STI with speech recognition rates. We examine the correlation between intelligibility measures, which depend on sound pick-up patterns, and speech recognition performance, and thereby consider the optimal location of a microphone in a single-channel environment. In the experiments, we find a high correlation between the STI and speech recognition rates.
