Search | Korea Science

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

Yoon, Ki-mu;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.38 no.1 / pp.47-50 / 2019
This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated using a DNN (Deep Neural Network), and the estimated function is applied to model parameter estimation. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a relative improvement of 15.87 % in average WER (Word Error Rate).
https://doi.org/10.7776/ASK.2019.38.1.047
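
For readers unfamiliar with the metric, the following is a minimal sketch of how WER and a relative WER improvement are commonly computed. It assumes the usual definition of WER as word-level edit distance divided by reference length. The function names and the numbers in the usage note are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(reference)


def relative_improvement(baseline_wer: float, proposed_wer: float) -> float:
    """Return the relative WER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer
```

With hypothetical values, a baseline WER of 20.0 % reduced to 16.0 % corresponds to `relative_improvement(20.0, 16.0)`, a 20 % relative improvement, even though the absolute difference is only 4 points.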

A Korean menu-ordering sentence text-to-speech system using conformer-based FastSpeech2 (콘포머 기반 FastSpeech2를 이용한 한국어 음식 주문 문장 음성합성기)

Choi, Yerin;Jang, JaeHoo;Koo, Myoung-Wan
- The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.359-366 / 2022
In this paper, we present a Korean menu-ordering sentence Text-to-Speech (TTS) system using Conformer-based FastSpeech2. The Conformer is a convolution-augmented Transformer originally proposed for speech recognition. By combining the two structures, the Conformer extracts both local and global features better. It comprises two half feed-forward modules at the front and the end, sandwiching the multi-head self-attention module and the convolution module. We introduce the Conformer to Korean TTS because it is known to work well in Korean speech recognition. To compare the Transformer-based TTS model with the Conformer-based one, we train both FastSpeech2 and Conformer-based FastSpeech2. We collected a phoneme-balanced data set and used it to train our models. The corpus comprises not only general conversation but also menu-ordering conversation consisting mainly of loanwords, and is intended to address the degradation of current Korean TTS models on loanwords. With speech synthesized using Parallel WaveGAN, the Conformer-based FastSpeech2 achieved a superior MOS of 4.04. We confirm that model performance improved in Korean TTS when the same structure was changed from Transformer to Conformer.
https://doi.org/10.7776/ASK.2022.41.3.359
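
The block layout described in the abstract (two half feed-forward modules sandwiching self-attention and convolution) can be sketched structurally as follows. This is only an illustration under assumptions: the sublayers are placeholder callables supplied by the caller, the final layer normalization follows the commonly described Conformer design rather than anything stated in the abstract, and none of the names come from the authors' implementation.

```python
import math
from typing import Callable, List

Frames = List[List[float]]            # a sequence of feature vectors
Layer = Callable[[Frames], Frames]    # placeholder type for any sublayer


def residual(x: Frames, layer: Layer, scale: float = 1.0) -> Frames:
    """Return x + scale * layer(x), element by element."""
    y = layer(x)
    return [[a + scale * b for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(x, y)]


def layer_norm(x: Frames, eps: float = 1e-5) -> Frames:
    """Normalize each frame to zero mean and unit variance (no learned gain or bias)."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def conformer_block(x: Frames, ffn_in: Layer, self_attention: Layer,
                    convolution: Layer, ffn_out: Layer) -> Frames:
    """Apply the sandwich structure with half-weighted feed-forward residuals."""
    x = residual(x, ffn_in, scale=0.5)    # first half feed-forward module
    x = residual(x, self_attention)       # multi-head self-attention module
    x = residual(x, convolution)          # convolution module
    x = residual(x, ffn_out, scale=0.5)   # second half feed-forward module
    return layer_norm(x)                  # assumed final normalization
```

Passing real feed-forward, attention, and convolution implementations as the four callables would give a working block; the sketch only fixes the order and the 0.5 weighting of the two feed-forward residuals.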

A Study on Spam Protection Technology for Secure VoIP Service in Broadband convergence Network Environment (BcN 환경에서 안전한 VoIP 서비스를 위한 스팸대응 기술 연구)

Sung, Kyung;Kim, Seok-Hun
- Journal of the Korea Institute of Information and Communication Engineering / v.12 no.4 / pp.670-676 / 2008
Because VoIP service is built on Internet technology, it inherits the security threats that occur in Internet networks, and its real-time service characteristics make it difficult to protect without adjusting or changing existing security solutions. In addition, because voice and data are provided as an integrated service over a single network, the security threats possible in IP networks are inherent, and the effort and cost of protection are relatively more complicated than when protecting a data network alone. This paper defines VoIP spam, analyzes VoIP spam technology, and studies various countermeasures for preventing VoIP spam.
https://doi.org/10.6109/jkiice.2008.12.4.670

Trends of Voice Quality Measurement for VoIP Service (VoIP 서비스를 위한 음성 품질 평가 기술 동향)

Jung, O.J.;Park, J.Y.;Kang, S.G.
- Electronics and Telecommunications Trends / v.19 no.3 s.87 / pp.136-144 / 2004
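With the development of the Internet and the spread of VoIP, interest in the quality of VoIP services is increasing. Until now, research on MPLS, DiffServ, RSVP, and similar technologies has been carried out to improve network quality from the network operator's perspective. In practice, however, service quality is affected not only by the network but also by the quality of terminals and other elements, so it is necessary to address end-to-end service quality at a level perceptible to the customer, rather than service quality criteria seen from the network operator's perspective. This article examines what service quality is and reviews service quality research by international standards bodies as well as voice quality measurement techniques for VoIP services.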
https://doi.org/10.22648/ETRI.2004.J.190314

Next-generation Metro Transmission Trend (차세대 메트로 광 전송망의 발전 방향)

Park Seung-Byoung
- 한국정보통신설비학회:학술대회논문집 / 2004.08a / pp.54-58 / 2004
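From the mid-to-late 1990s until a few years ago, backbone network operators and leased-line operators made enormous investments in metro and long-haul (intercity) segments. Investment was especially concentrated in the long-haul segments. The main drivers of this investment were the speed-up of existing voice and leased-line services and the explosion of Internet traffic. By the late 1990s, rapidly growing Internet traffic had far surpassed voice traffic. This traffic was concentrated in metro segments rather than long-haul segments, generating enormous traffic in Seoul and the regional metropolitan cities.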

A Design of IP-PBX System for Integrated Service (효율적인 통신을 위한 통합형 IP-PBX 시스템 설계)

최준원;백승범;최재원
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2003.10a / pp.164-167 / 2003
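As information and communication technologies develop in diverse ways, the need for voice and data communication over the Internet is growing. We designed a switching system that, in addition to delivering voice and data communication over the Internet in a single unified form, can deliver multimedia information over multiple communication channels.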

A Study on the Wireless Intra-net Implementation using an Internet Phone (인터넷폰을 이용한 무선 Intra-net 구축에 관한 연구)

박윤종;곽승욱;이치문;강경인;김현주;이광배;김현욱
- Proceedings of the Safety Management and Science Conference / 1999.11a / pp.157-168 / 1999
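This study aims to achieve high data reliability and to build a network in a wireless environment based on Internet phones. The voice quality of the speech signal was evaluated in order to verify improved transmission rate and reliability. Emphasis was also placed on evaluating and verifying the productivity that can be applied in real life. Accordingly, this study aims to build a network that guarantees optimal reliability and usefulness for wired and wireless network data transmission, which may become the core of the rapidly emerging information and communication field.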

QoS measurements VoIP on HFC Networks (HFC망에서 VoIP의 QoS 측정에 관한 연구)

조성봉;이경근
- Proceedings of the Korean Information Science Society Conference / 2003.04d / pp.280-282 / 2003
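In this study, we analyzed and measured the parameters that affect voice call quality on HFC networks in order to characterize VoIP, in preparation for the full-scale provision of voice services over the Internet. In particular, the VoIP QoS criteria were defined by combining lab tests based on an analysis of QoS parameters with field measurements of HFC VoIP service, which increases the reliability of the results when they are applied to actual services.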

A Method of Scaling Time-Delay Neural Networks for Korean Allophone Recognition (한국어 변이음 인식을 위한 시간지연 신경망의 확장방법)

김수일
- Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.229-234 / 1994
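This paper examines methods of scaling time-delay neural networks (TDNNs) to recognize Korean allophones and compares the recognition performance of each scaling method through experiments on recognizing the allophones of Korean plosives. First, to use allophones as the recognition unit for continuous speech recognition, all allophones of a phoneme are considered and similar allophones are merged, yielding three allophone groups. In the recognition experiments on Korean plosives, the best performance was obtained by training TDNNs partitioned according to acoustic-phonetic characteristics module by module and then integrating them hierarchically into an overall TDNN. We also confirmed that allophone-level recognition can resolve the coarticulation problem that arises in phoneme-level recognition, and we considered how the additional information provided by the resulting allophone sequence can be used in phonological parsing.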

Artificial speech bandwidth extension technique based on opus codec using deep belief network (심층 신뢰 신경망을 이용한 오푸스 코덱 기반 인공 음성 대역 확장 기술)

Choi, Yoonsang;Li, Yaxing;Kang, Sangwon
- The Journal of the Acoustical Society of Korea / v.36 no.1 / pp.70-77 / 2017
Bandwidth extension is a technique for improving speech quality, intelligibility, and naturalness by extending 300 ~ 3,400 Hz narrowband speech to 50 ~ 7,000 Hz wideband speech. In this paper, an Artificial Bandwidth Extension (ABE) module embedded in the Opus audio decoder is designed using narrowband speech information to reduce the computational complexity of LPC (Linear Prediction Coding) and LSF (Line Spectral Frequencies) analysis and the algorithmic delay of the ABE module. We propose a spectral envelope extension method using a DBN (Deep Belief Network), a deep learning technique, and the proposed scheme produces a better extended spectrum than the traditional codebook mapping method.
https://doi.org/10.7776/ASK.2017.36.1.070
