• Title/Summary/Keyword: voice codec (음성코덱)


Study of Error Reconstruction Algorithm for Real-time Voice for Transmissions over the Internet (인터넷상의 실시간 음성 전송을 위한 에러 복원 알고리즘의 연구)

  • 신현숙;최연성
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.05a / pp.388-394 / 2001
  • Many algorithms have been proposed for error concealment and reconstruction in real-time voice transmission over the Internet. The main purpose of these algorithms is to perform error reconstruction using low bandwidth while guaranteeing good voice quality. Error concealment algorithms can be classified into receiver-based schemes and sender- and receiver-based schemes. In this paper, we apply a sender- and receiver-based reconstruction algorithm to a low-bit-rate codec using CELP.

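
The sender- and receiver-based idea described in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a generic scheme in which each packet carries a redundant copy of the previous frame, and the receiver falls back to repeating the last frame when no redundancy is available. The names `packetize` and `reconstruct` are hypothetical, and the redundant copy is kept byte-identical for simplicity, whereas a real system would typically re-encode it at a lower bit rate.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, primary frame, redundant copy of the previous frame)
Packet = Tuple[int, bytes, Optional[bytes]]


def packetize(frames: List[bytes]) -> List[Packet]:
    """Sender side: attach the previous frame to each packet as redundancy."""
    packets: List[Packet] = []
    for seq, frame in enumerate(frames):
        redundant = frames[seq - 1] if seq > 0 else None
        packets.append((seq, frame, redundant))
    return packets


def reconstruct(received: List[Packet], total: int) -> List[bytes]:
    """Receiver side: use redundancy first, then fall back to repetition."""
    primary: Dict[int, bytes] = {seq: frame for seq, frame, _ in received}
    backup: Dict[int, bytes] = {
        seq - 1: red for seq, _, red in received if red is not None
    }
    output: List[bytes] = []
    for seq in range(total):
        if seq in primary:
            output.append(primary[seq])      # frame arrived normally
        elif seq in backup:
            output.append(backup[seq])       # recovered from the next packet
        elif output:
            output.append(output[-1])        # receiver-only concealment: repeat
        else:
            output.append(b"")               # nothing to repeat yet
    return output
```

With four frames and packet 1 lost, `reconstruct` recovers frame 1 from the redundancy carried by packet 2. If packets 1 and 2 are both lost, frame 2 is recovered from packet 3 and frame 1 is concealed by repeating frame 0.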

Call Traffic Control Strategy Using VoIP in 5G Era (5G시대 VoIP를 활용한 통화 트래픽 제어 전략)

  • Ham, Hyung-Bin;Jung, Jun-Kwon;Chung, Tai-Myoung
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.8-10 / 2018
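  • We are on the threshold of the 5G era. 5G will provide data services that are not only faster but also cheaper. However, whereas the 4G mobile network was structured and optimized for communication between mobile terminals, 5G must serve a variety of targets with differing properties. In this environment, smartphone-based mobile communication becomes a minor part of the 5G network, and its overall allocation will shrink. If call failures then occur because of network constraints, they must be managed effectively so that the call itself is not dropped. This paper presents a call traffic control strategy based on VoIP. By combining the high speed of 5G with control of the VoIP voice codec, service can be maintained without interruption even when a momentary surge in call attempts pushes traffic to its limit.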
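
One way to picture voice codec control under a traffic limit is a rate-selection rule. The sketch below is an assumption for illustration only and is not the strategy proposed in the paper: it splits a hypothetical capacity equally among active calls and picks the highest-rate codec that fits. The function name `pick_codec` and the equal-share policy are hypothetical. The bit rates are the nominal payload rates of the named ITU-T codecs (G.726 taken at 32 kbit/s, G.723.1 at 5.3 kbit/s), with packet header overhead ignored.

```python
from typing import Optional, Tuple

# Nominal payload bit rates in kbit/s, ordered from highest to lowest.
CODECS: Tuple[Tuple[str, float], ...] = (
    ("G.711", 64.0),
    ("G.726", 32.0),
    ("G.729", 8.0),
    ("G.723.1", 5.3),
)


def pick_codec(capacity_kbps: float, active_calls: int) -> Optional[str]:
    """Return the highest-rate codec that fits an equal per-call share, or None."""
    if active_calls <= 0:
        return None
    share = capacity_kbps / active_calls
    for name, rate in CODECS:
        if rate <= share:
            return name
    return None  # even the lowest-rate codec does not fit
```

As the number of simultaneous calls grows, the selected codec steps down in bit rate instead of calls being dropped, which is the general effect the abstract describes.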

A Preprocessing Approach to Improving the Quality of the Music Produced by the EVRC (EVRC 코덱으로 재생하는 음악의 품질을 개선하기 위한 전처리 기법)

  • 남영한;하태균;전윤호;김재수;박섭형
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.5C / pp.476-485 / 2003
  • This paper proposes a preprocessing approach to improving the quality of music produced by the EVRC (Enhanced Variable Rate Codec), one of the CDMA (Code Division Multiple Access) voice codecs. Since the EVRC is optimized only for speech signals, it can degrade the quality of music passed through it. One problem with EVRC-coded music is time-clipping, which usually occurs when successive frames are encoded at Rate 1/8. Since the EVRC determines the bit rate for an input frame based on the long-term prediction gain, we increase the long-term prediction gain so that most frames are encoded at Rate 1 or Rate 1/2. Experimental results show that the approach works well on music signals and considerably reduces the number of time-clipped frames.
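
To make the notion of long-term prediction gain concrete, the following sketch measures how well a frame is predicted by a delayed copy of itself. It is a simplified stand-in, not the EVRC rate-determination procedure: the gain is taken here as the peak normalized correlation over candidate pitch lags, and the function name and the default lag range of 20 to 120 samples are assumptions.

```python
import math
from typing import Sequence


def long_term_prediction_gain(
    x: Sequence[float], min_lag: int = 20, max_lag: int = 120
) -> float:
    """Peak normalized correlation between x and a delayed copy of x.

    Values near 1 indicate a strongly periodic signal; values near 0
    indicate that no lag in the range predicts the signal well.
    The input should be longer than max_lag samples.
    """
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        cur, past = x[lag:], x[:-lag]
        num = sum(a * b for a, b in zip(cur, past))
        den = math.sqrt(sum(a * a for a in cur) * sum(b * b for b in past))
        if den > 0.0:
            best = max(best, num / den)
    return best
```

A steady tone whose period falls inside the lag range scores close to 1, while a signal with no repetition in that range scores close to 0. A preprocessing step of the kind the abstract describes would aim to push music frames toward the high end of this measure.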

Implementation of Embedded Speech Recognition System for Supporting Voice Commander to Control an Audio and a Video on Telematics Terminals (텔레메틱스 단말기 내의 오디오/비디오 명령처리를 위한 임베디드용 음성인식 시스템의 구현)

  • Kwon, Oh-Il;Lee, Heung-Kyu
    • Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.11 / pp.93-100 / 2005
  • In this paper, we implement an embedded speech recognition system that supports in-car application services, such as audio and video control, through a speech recognition interface. The embedded speech recognition system is implemented and ported to a DSP board. Because microphone type and speech codecs affect the accuracy of speech recognition, we also optimize the simulation and test environment to effectively remove real in-car noise. We apply noise suppression and a feature compensation algorithm to increase the accuracy of in-car speech recognition, and we use context-dependent tied-mixture acoustic modeling. The performance evaluation shows that the proposed system achieves high accuracy in an office environment and even in a real car environment.

The Reduction Algorithm of Complexity using Adjustment of Resolution and Search Sequence for Vocoder (해상도 조절과 검색순서 조절을 통한 음성부호화기용 복잡도 감소 알고리즘)

  • Min, So-Yeon;Lee, Kwang-Hyoung;Bae, Myung-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.8 no.5 / pp.1122-1127 / 2007
  • We propose a complexity reduction algorithm for the real root method, which is mainly used in vocoders. The real root method finds the real roots of the polynomial equations, when they exist, and transforms them into LSPs (Line Spectrum Pairs). However, this method is slow because the root search proceeds sequentially across the frequency region. An important characteristic of LSPs is that most coefficients occur in specific frequency regions. In this paper, therefore, the search frequency region is ordered and adjusted according to each coefficient's distribution. The proposed algorithm reduces the transformation time compared with sequential search in the frequency region. In the experiments, the proposed method reduced the search time by about 48% on average compared with the conventional real root method.

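
For orientation, the baseline that the abstract above calls the real root method can be sketched as a sequential grid search. This is a minimal illustration under stated assumptions, not the paper's reduced-complexity algorithm: it forms the standard symmetric and antisymmetric polynomials P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z), scans a uniform grid over (0, pi) for sign changes, and refines each one by bisection. The function names, the grid size of 256, and the 40 bisection steps are assumptions.

```python
import math
from typing import Callable, List, Sequence


def _find_roots(f: Callable[[float], float], grid: int) -> List[float]:
    """Scan (0, pi) on a uniform grid and refine each sign change by bisection."""
    roots: List[float] = []
    step = math.pi / grid
    lo, f_lo = step, f(step)
    for i in range(2, grid):
        hi = i * step
        f_hi = f(hi)
        if f_hi == 0.0:
            roots.append(hi)                 # root falls exactly on a grid point
        elif f_lo * f_hi < 0.0:
            a, b, fa = lo, hi, f_lo
            for _ in range(40):
                mid = 0.5 * (a + b)
                fm = f(mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return roots


def lpc_to_lsp(a: Sequence[float], grid: int = 256) -> List[float]:
    """LSP frequencies in radians (ascending) for A(z) = sum a[k] z^-k, a[0] = 1."""
    n = len(a)                               # n = p + 1, degree of P(z) and Q(z)
    ext = list(a) + [0.0]                    # a[p+1] = 0
    p_coef = [ext[k] + ext[n - k] for k in range(n + 1)]   # symmetric
    q_coef = [ext[k] - ext[n - k] for k in range(n + 1)]   # antisymmetric

    # On the unit circle both polynomials reduce to real functions of w.
    def fp(w: float) -> float:
        return sum(c * math.cos(w * (n / 2 - k)) for k, c in enumerate(p_coef))

    def fq(w: float) -> float:
        return sum(c * math.sin(w * (n / 2 - k)) for k, c in enumerate(q_coef))

    return sorted(_find_roots(fp, grid) + _find_roots(fq, grid))
```

Every grid point is evaluated in frequency order here, which is the cost the paper targets. Reordering or narrowing the scanned region per coefficient, as the abstract describes, would replace the plain loop in `_find_roots`.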

Frequency Band Selection Exited Linear Prediction Wideband Speech/Audio Coding Using SBR (SBR을 이용한 주파수 밴드선택 여기 선형예측 광대역 음성/오디오 부호화)

  • Jang, Sunghoon;Lee, Insung
    • The Journal of the Acoustical Society of Korea / v.32 no.6 / pp.556-562 / 2013
  • This paper aims to improve the performance of the band-selection speech/audio coder, which reconstructs the band spectrum that is not transmitted by using comfort noise. To improve performance, we use the Spectral Band Replication (SBR) technique instead of comfort noise substitution. To synthesize the SBR signal, the SBR algorithm takes the selected bands as its reference, and the spectrum synthesized by SBR is injected into the non-selected bands. Each sub-band spectrum is energy-weighted according to the real audio signal. We propose an enhanced band-selection coder that uses an SBR signal synthesized from the selected bands instead of comfort noise.
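
The core operation the abstract describes, filling a non-transmitted band from a transmitted one and weighting it by the real signal's energy, can be sketched in a few lines. This is a simplified illustration rather than the coder's actual SBR procedure: the function name `replicate_band` is hypothetical, and it assumes the target band energy is available at the decoder as side information.

```python
import math
from typing import List, Sequence


def replicate_band(source: Sequence[float], target_energy: float) -> List[float]:
    """Copy a transmitted band's spectrum and rescale it to a target energy."""
    source_energy = sum(v * v for v in source)
    if source_energy == 0.0:
        return [0.0] * len(source)           # nothing to replicate
    gain = math.sqrt(target_energy / source_energy)
    return [gain * v for v in source]
```

Unlike comfort noise, the replicated band keeps the spectral shape of the selected band, while the gain ensures its energy matches that of the original signal in the missing band.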

Voice Synthesis Detection Using Language Model-Based Speech Feature Extraction (언어 모델 기반 음성 특징 추출을 활용한 생성 음성 탐지)

  • Seung-min Kim;So-hee Park;Dae-seon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.439-449 / 2024
  • Recent rapid advancements in voice generation technology have enabled the natural synthesis of voices using text alone. However, this progress has led to an increase in malicious activities, such as voice phishing (vishing), where generated voices are exploited for criminal purposes. Numerous models have been developed to detect the presence of synthesized voices, typically by extracting features from the voice and using these features to determine the likelihood of voice generation. This paper proposes a new model for extracting voice features to address misuse cases arising from generated voices. It utilizes a deep learning-based audio codec model and the pre-trained natural language processing model BERT to extract novel voice features. To assess the suitability of the proposed voice feature extraction model for voice detection, four generated voice detection models were created using the extracted features, and performance evaluations were conducted. For performance comparison, three voice detection models based on Deepfeature proposed in previous studies were evaluated against other models in terms of accuracy and EER. The model proposed in this paper achieved an accuracy of 88.08% and a low EER of 11.79%, outperforming the existing models. These results confirm that the voice feature extraction method introduced in this paper can be an effective tool for distinguishing between generated and real voices.
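
Since the abstract reports results as an equal error rate (EER), a short sketch of how that metric is commonly computed may help. It is a generic threshold sweep, not the paper's evaluation code: the function name, the label convention (1 = generated voice, 0 = real voice), and the choice to average the two error rates at the closest crossing are assumptions.

```python
from typing import Sequence


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """EER for a detector where a higher score means 'more likely generated'."""
    pos = [s for s, y in zip(scores, labels) if y == 1]   # generated voices
    neg = [s for s, y in zip(scores, labels) if y == 0]   # real voices
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(scores)):
        far = sum(s >= t for s in neg) / len(neg)   # real voices flagged
        frr = sum(s < t for s in pos) / len(pos)    # generated voices missed
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, 0.5 * (far + frr)
    return eer
```

The EER is the operating point where the two error rates are equal, so a lower value means the detector separates generated and real voices more cleanly. Both classes must be present in `labels`.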

A Study on the Frequency Scaling Methods Using LSP Parameters Distribution Characteristics (LSP 파라미터 분포특성을 이용한 주파수대역 조절법에 관한 연구)

  • 민소연;배명진
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.304-309 / 2002
  • We propose a computation reduction method for the real root method, which is mainly used in the CELP (Code Excited Linear Prediction) vocoder. The real root method finds the real roots of the polynomial equations, when they exist, and transforms them into LSP (Line Spectrum Pairs) parameters. However, this method is slow because the root search proceeds sequentially across the frequency region. To reduce the computation time of the real root method, we compare it with two proposed methods. The first method searches the frequency region on the mel scale, which is linear below 1 kHz and logarithmic above. The second method orders the search frequency region and search interval according to each coefficient's distribution. To compare the real root method with the proposed methods, we made two measurements. First, we compared the positions of the transformed LSP parameters obtained by the proposed methods with those of the real root method. Second, we measured how much the computation time was reduced. The experimental results show that both methods reduced the search time by about 47% on average without changing the LSP parameters.
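
The mel-scale search region mentioned in the abstract can be illustrated with a small sketch that generates search frequencies spaced uniformly on the mel axis. It is an assumption-laden example, not the paper's implementation: it uses the common formula mel = 2595 log10(1 + f/700), which may differ from the one the authors used, and the function names and the 4 kHz upper limit (8 kHz sampling) are hypothetical.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert frequency in Hz to mel using a common closed-form approximation."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_search_grid(n_points: int, f_max_hz: float = 4000.0) -> List[float]:
    """Search frequencies in Hz, uniformly spaced on the mel axis up to f_max_hz."""
    mel_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(mel_max * i / (n_points - 1)) for i in range(n_points)]
```

Compared with a uniform grid of the same size, this grid places more points at low frequencies and fewer at high frequencies, so the spacing between successive search points widens as frequency increases.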

Implementation of the Timbre-based Emotion Recognition Algorithm for a Healthcare Robot Application (헬스케어 로봇으로의 응용을 위한 음색기반의 감정인식 알고리즘 구현)

  • Kong, Jung-Shik;Kwon, Oh-Sang;Lee, Eung-Hyuk
    • Journal of IKEEE / v.13 no.4 / pp.43-46 / 2009
  • This paper deals with recognizing feelings from people's voices by finding suitable feature vectors. Voice signals carry not only a speaker's own information but also the speaker's feelings and fatigue. Accordingly, much research is under way on finding feelings in people's voices. In this paper, we analyze the Selectable Mode Vocoder (SMV), one of the standard 3GPP2 codecs of ETSI. From the analysis results, we propose voice features for recognizing feelings. We then propose a feeling recognition algorithm based on the Gaussian mixture model (GMM) that uses the suggested feature vectors. We verify the performance of this algorithm by varying the number of mixture components.


The Implementation of an ISDN System-on-a-Chip and communication terminal (ISDN 멀티미디어 통신단말용 시스템-온-칩 및 소프트웨어 구현)

  • 김진태;황대환
    • Journal of the Korea Institute of Information and Communication Engineering / v.6 no.3 / pp.410-415 / 2002
  • This paper describes the implementation of an SoC (System-on-a-Chip) and of an ISDN communication terminal built on that SoC for the ISDN network. The SoC integrates a 32-bit ARM7TDMI RISC core processor, a network connection with an S/T interface, a TDM-bus interface, a voice codec, and a user interface. We also review the structure of the developed software and the ISDN service protocol procedures that run on the SoC. Finally, this paper describes the structure of ISDN terminal equipment that uses the implemented SoC and terminal software.