Search | Korea Science

A Study on Searching proof of character in voice (목소리에 의한 성격규명에 관한 연구)

서지호;배명진
- Proceedings of the Korean Society for Emotion and Sensibility Conference / 2003.11a / pp.131-132 / 2003
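Before a person's speech is produced, the thought the speaker wants to convey is converted into a linguistic structure, and in this process appropriate words or phrases are selected to express that thought. The words are then ordered according to the grammatical rules of the particular language, and features such as pitch, intonation, and stress are added to emphasize the important parts of the overall meaning. Speech can basically be divided into an excitation component and a vocal tract component. The vocal tract refers collectively to the pharyngeal cavity and the oral cavity. The shape of the mouth therefore affects the clarity of the same utterance, and this characteristic comes across as a confident, extroverted impression. In this paper, we examine the speech characteristics associated with mouth shape and the speaker's personality as revealed through vocalization habits.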

Evaluation of the Speech Quality of a Wireless Underwater Telephone (무선수중전화기의 통화품질 평가)

박문갑;김석재;윤종락
- Proceedings of the Korean Society of Fisheries Technology Conference / 2001.10a / pp.69-70 / 2001
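Speech quality is the objective expression, based on the listener's auditory response (subjective evaluation), of how good or bad a speech signal transmitted through a voice communication system is. As basic data for designing a prototype wireless underwater telephone for underwater activities such as diver fisheries, and for selecting an optimal speech quality evaluation method, articulation and word intelligibility in underwater voice communication were measured using existing wireless underwater telephones that are already in partial use in leisure sports and similar settings. (abridged)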

CycleGAN for Enhancement of Degraded Speech by Face Mask (마스크 착용에 의해 왜곡된 음성의 품질 향상을 위한 CycleGAN 기술)

Lim, Yujin;Yu, Jeongchan;Seo, Eunmi;Park, Hochong
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.63-64 / 2022
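Wearing a face mask hinders communication such as conversation and phone calls and degrades speech quality and intelligibility. Speech enhancement technology is needed to address this, and various machine-learning-based speech enhancement methods have been developed. For supervised learning, it is very difficult to obtain speech data paired one-to-one with and without a mask, so unsupervised learning, which does not require paired data, is called for. In this paper, we propose a technique that restores mask-induced speech distortion using CycleGAN, which uses unsupervised learning and can change features while preserving context. Working on spectrograms, speech distorted by wearing a mask was converted into unmasked speech, which improved speech quality. A listening test confirmed that the enhanced audio was preferred, and spectrograms confirmed that high-band energy above 3 kHz increased. These results confirm that unsupervised learning with CycleGAN can improve the quality of speech distorted by wearing a mask.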
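
The abstract above relies on CycleGAN learning from unpaired data. A minimal sketch of the standard cycle-consistency term that makes this possible is given here, assuming spectrogram batches shaped (batch, frequency, time). The generators `G` and `F` are hypothetical stand-ins (fixed per-band gains), not the networks trained in the paper, and the adversarial losses are omitted.

```python
import numpy as np

def cycle_consistency_loss(G, F, x_batch, y_batch):
    """Mean L1 error after a round trip through both mappings."""
    forward = np.mean(np.abs(F(G(x_batch)) - x_batch))   # X -> Y -> X
    backward = np.mean(np.abs(G(F(y_batch)) - y_batch))  # Y -> X -> Y
    return forward + backward

# Hypothetical per-band gain that boosts higher frequency bins.
HIGH_BAND_GAIN = np.linspace(1.0, 2.0, 8)[:, None]

def G(spec):
    """Stand-in for the masked -> unmasked generator (assumption)."""
    return spec * HIGH_BAND_GAIN

def F(spec):
    """Stand-in for the unmasked -> masked generator (assumption)."""
    return spec / HIGH_BAND_GAIN

rng = np.random.default_rng(0)
masked = rng.random((2, 8, 8))    # unpaired toy "masked" spectrograms
unmasked = rng.random((2, 8, 8))  # unpaired toy "unmasked" spectrograms

loss = cycle_consistency_loss(G, F, masked, unmasked)
print(loss)
```

Because `F` exactly inverts `G` here, the loss is essentially zero. With real networks the term is minimized during training so that converting a spectrogram to the other domain and back preserves its content.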

A Post-processing for Binary Mask Estimation Toward Improving Speech Intelligibility in Noise (잡음환경 음성명료도 향상을 위한 이진 마스크 추정 후처리 알고리즘)

Kim, Gibak
- Journal of Broadcast Engineering / v.18 no.2 / pp.311-318 / 2013
This paper deals with a noise reduction algorithm that uses binary masking in the time-frequency domain. To improve speech intelligibility in noise, noise-masked speech is decomposed into time-frequency units, and mask "0" is assigned to the masker-dominant region, removing the time-frequency units where noise dominates speech. In previous research, Gaussian mixture models were used to classify the speech-dominant region and the noise-dominant region, which correspond to mask "1" and mask "0", respectively. In each frequency band, data were collected and used to train the Gaussian mixture models, and a detection procedure was applied to the test data to decide whether each time-frequency unit belongs to the speech-dominant or the noise-dominant region. In this paper, we consider the correlation of masks in the frequency domain and propose a post-processing method that exploits the Viterbi algorithm.
https://doi.org/10.5909/JBE.2013.18.2.311
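
The masking rule in the abstract above can be sketched as follows. This is an oracle version that assumes the speech and noise powers of each time-frequency unit are known separately. It does not implement the paper's GMM classifier or its Viterbi post-processing, and the function name, threshold, and toy values are illustrative assumptions.

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return 1 where the local SNR exceeds threshold_db, else 0."""
    eps = 1e-12  # guards against division by zero and log of zero
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (local_snr_db > threshold_db).astype(np.int8)

# Toy power grids: 3 frequency bands x 4 time frames (made-up values).
speech = np.array([[4.0, 0.1, 3.0, 0.2],
                   [0.5, 2.0, 0.1, 5.0],
                   [1.0, 1.0, 0.0, 9.0]])
noise = np.ones_like(speech)

mask = ideal_binary_mask(speech, noise)

# Approximate mixture power, assuming speech and noise are uncorrelated.
mixture = speech + noise
masked_mixture = mask * mixture  # noise-dominant units are removed
print(mask)
```

In practice the mask has to be estimated from the noisy signal alone, which is where a classifier and a smoothing step over neighbouring units come in.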

On a pitch alteration of speech technique using the asymmetry weighting (비대칭 weighting을 사용한 음성 피치변경법)

함명규;나덕수;정찬중;배명진
- Proceedings of the IEEK Conference / 1998.06a / pp.615-618 / 1998
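The main objectives of speech coding vary with factors such as speech compression for transmission over a band-limited channel, high-quality speech synthesis that preserves intelligibility and naturalness, and processing speed. Speech coding methods are generally divided into waveform coding, source coding, and hybrid coding. Speech transmitted by these methods is then resynthesized, and here the PSOLA method, which can maintain high quality, was used. Observing that the Hanning window used in conventional PSOLA synthesis does not suit the glottal wave shape of speech, this paper proposes a new asymmetric window (asymmetry window) in place of the Hanning window. The asymmetric window has a steep slope to the left of its center and a gentle slope to the right, giving a weighting suited to the slope of the speech waveform. PSOLA synthesis with the proposed asymmetric window improved the SNR by about 2 to 3 dB.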
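
The abstract above describes the window shape only qualitatively (steep on the left, gentle on the right). One way to build such a window, as a sketch, is to join two half-Hann segments of different lengths. This construction and the `peak_index` parameter are assumptions for illustration, not the exact window defined in the paper.

```python
import numpy as np

def asymmetric_window(length, peak_index):
    """Steep half-Hann rise up to peak_index, gentler half-Hann fall after it."""
    if not 0 < peak_index < length - 1:
        raise ValueError("peak_index must lie strictly inside the window")
    rise_len = peak_index + 1        # samples from the start up to the peak
    fall_len = length - peak_index   # samples from the peak to the end
    rise = np.hanning(2 * rise_len - 1)[:rise_len]      # goes 0 -> 1
    fall = np.hanning(2 * fall_len - 1)[fall_len - 1:]  # goes 1 -> 0
    return np.concatenate([rise, fall[1:]])  # drop the duplicated peak sample

window = asymmetric_window(length=64, peak_index=16)
print(window.argmax(), window.size)
```

Placing the peak in the first quarter of the window gives the fast rise and slow decay described in the abstract. In a PSOLA setting the window would be applied to pitch-synchronous segments before overlap-add.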

Effects of breathing training in melodic intonation therapy on articulation intelligibility of aphasics: pilot study (멜로디 억양 치료에서 실어증 환자의 조음 명료도에 대한 호흡 훈련 효과: 초기 실험)

Kim, Seon Sik;Hong, Geum Na;Choi, Min Joo
- The Journal of the Acoustical Society of Korea / v.35 no.4 / pp.319-329 / 2016
The present study tested whether breathing training in melodic intonation therapy (MIT) improves the articulation intelligibility of Broca's aphasics. The experimental group performed two-stage breathing training before the MIT. To evaluate the efficacy of the MIT intervention, the voice onset time (VOT), total delay (TD), voice sound intensity, and expiratory volume of the subjects, all closely associated with articulation intelligibility, were measured before and after the intervention. In the experimental group, the VOT and TD increased after the intervention for the bilabial /p/, the alveolar /t/, and the velar /k/ (p < 0.05), but no significant differences were found for the affricate /c/ and the fricative /s/ (p > 0.05). In the control group, no significant increases in the VOT and TD were observed at any articulation point (p > 0.05). Voice sound intensity, which influences verbal articulation, increased in the experimental group after the intervention (p < 0.05), whereas no significant change was observed in the control group. In conclusion, breathing training in the MIT was found to improve the articulation intelligibility of Broca's aphasics.
https://doi.org/10.7776/ASK.2016.35.4.319

A Study on the to Shorten of Early Decay Time in the Reverberation Curve Using MINT (MINT법을 이용한 실내 잔향곡선의 초기감쇠시간 단축에 관한 연구)

차경환
- The Journal of the Acoustical Society of Korea / v.21 no.1 / pp.37-41 / 2002
In this paper, we shorten the early decay time (EDT) of the room reverberation curve using multiple channels. The speech signal was inverse filtered in full-band and sub-band form on the basis of MINT, with multiple-channel adaptive filters using the LMS (Least Mean Square) and NLMS (Normalized Least Mean Square) algorithms. In the experiments, full-band NLMS with two microphones gave a time reduction of 1/3 at the 20 dB level of the reverberation curve. In addition, in subjective assessments using a real room impulse response, a speech articulation improvement of 80% was reported by the test listeners for speech whose EDT had been shortened by MINT.
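
The NLMS update named in the abstract above can be sketched in a single-channel system identification setting. This shows only the normalized coefficient update. It is not the multi-channel MINT inverse filtering used in the paper, and the step size, tap count, and the "unknown" system `h_true` are made-up assumptions.

```python
import numpy as np

def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt FIR weights w so that w . [x[n], x[n-1], ...] tracks d[n]."""
    w = np.zeros(num_taps)
    errors = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]  # input vector, newest sample first
        e = d[n] - w @ u                     # a priori estimation error
        w = w + mu * e * u / (eps + u @ u)   # step normalized by input power
        errors[n] = e
    return w, errors

rng = np.random.default_rng(0)
x = rng.standard_normal(4000)               # white excitation signal
h_true = np.array([0.8, -0.4, 0.2, 0.1])    # hypothetical system to identify
d = np.convolve(x, h_true)[:len(x)]         # noise-free desired signal

w, errors = nlms(x, d, num_taps=4)
print(np.round(w, 3))
```

Plain LMS uses the same loop without the division by `u @ u`. The normalization makes the effective step size independent of the input level, which is the usual reason for choosing NLMS with speech signals.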

A survey on noise generation and conversation interruption in cafes (카페 공간의 소음과 대화 방해에 대한 설문조사)

Jeong, Jeong-Ho
- The Journal of the Acoustical Society of Korea / v.40 no.6 / pp.660-670 / 2021
Because many kinds of people use cafes for many different purposes, noise from surrounding people and background music can make it difficult to hear one's companions. There is also a need for acoustic improvement inside cafes, for example because the conversations of nearby users are easily overheard. A total of 212 adult men and women took part in a questionnaire survey on cafe acoustics and noise conditions. About two-thirds of the respondents said that they did not prefer noisy cafes and that cafe noise had a negative effect. The major source of noise in cafes is the sound of people around users; more than 40 % of the respondents said that they could not hear conversations with their companions well because of the sounds of those around them, or that they were concerned about their own conversations being heard by those around them. The survey showed that improvements are needed to secure both the speech privacy and the speech intelligibility of cafe users.
https://doi.org/10.7776/ASK.2021.40.6.660

A Study on Dialect Expression in Korean-Based Speech Recognition (한국어 기반 음성 인식에서 사투리 표현에 관한 연구)

Lee, Sin-hyup
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.333-335 / 2022
Advances in speech recognition technology, together with STT and TTS technologies, have been applied in various video and streaming services. However, when recognizing actual conversational speech, the use of dialects and the overlap of stop words, exclamations, and similar words pose high barriers to clear written transcription. In this study, to handle ambiguous dialect in speech recognition, we propose a speech recognition technique that applies a category-based dialect keyword dictionary and dialect prosody as properties of the speech recognition network model.

Analysis of the Acoustic Performance of Classrooms in Korea (국내 학교 교실의 실내음향성능 실태조사)

Park, Chan-Jae;Ryu, Da-Jung;Kyoung, Ju-Young;Haan, Chan-Hoon
- The Journal of the Acoustical Society of Korea / v.33 no.5 / pp.316-325 / 2014
The basic unit of a school is the classroom, and the aural environment of classrooms is an essential factor in education. Many countries have therefore made efforts to enhance the acoustical performance of classrooms. As a result, acoustic criteria including reverberation time and background noise level have been established in the US and UK for school classrooms depending on the usage and size of the rooms. In Korea, however, there has been little research on the room acoustics of classrooms. The present study investigates the current aural environment of 15 classrooms in Korea, including elementary, middle, and high schools. The acoustic parameters measured include RT, $D_{50}$, STI, SNR, and background noise level. The results show that the background noise levels of schools adjacent to roads exceed the US and UK standard of 35 dB(A). Most schools also have such a low SNR that speech transmission may be disturbed by noise. Some schools have an RT longer than the US standard of 0.6 s, but all show high speech intelligibility.
https://doi.org/10.7776/ASK.2014.33.5.316
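
Of the parameters listed in the abstract above, $D_{50}$ is conventionally the fraction of impulse-response energy that arrives within the first 50 ms. A minimal sketch of that calculation follows, assuming the impulse response starts at the direct sound. The impulse response here is a synthetic exponential decay with a 0.6 s reverberation time, not measured classroom data.

```python
import numpy as np

def d50(impulse_response, sample_rate):
    """Ratio of energy in the first 50 ms to the total energy."""
    energy = np.asarray(impulse_response, dtype=float) ** 2
    split = int(round(0.050 * sample_rate))  # sample index at 50 ms
    return energy[:split].sum() / energy.sum()

sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # 1 s time axis
rt60 = 0.6                                # synthetic reverberation time in s

# Amplitude envelope that falls by 60 dB (a factor of 1000) after rt60 seconds.
impulse_response = np.exp(-3.0 * np.log(10.0) * t / rt60)

value = d50(impulse_response, sample_rate)
print(round(value, 3))
```

For an ideal exponential decay the result depends only on the reverberation time, which is why a shorter RT generally goes with a higher $D_{50}$. Measured rooms deviate from this because of early reflections.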

Search Result 187, Processing Time 0.031 seconds
