• Title/Summary/Keyword: Speech quality

Real-Time Implementation of the G.729.1 Using ARM926EJ-S Processor Core (ARM926EJ-S 프로세서 코어를 이용한 G.729.1의 실시간 구현)

  • So, Woon-Seob;Kim, Dae-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.8C / pp.575-582 / 2008
  • This paper describes the process and results of a real-time implementation of the G.729.1 wideband speech codec, which was standardized in SG15 of ITU-T. To run the codec in real time on the ARM926EJ-S processor core, we converted parts of the codec's C program, including basic operations and arithmetic functions, into assembly language. G.729.1 is the ITU-T standard wideband speech codec with variable bit rates of 8 to 32 kbps; its input is a PCM signal quantized to 16 bits per sample and sampled at 8 kHz or 16 kHz. The codec is interoperable with G.729 and G.729A, and it extends the existing narrowband (300 to 3,400 Hz) codec to wideband (50 to 7,000 Hz) to enhance voice quality. The implemented codec has a complexity of 31.2 MCPS for the encoder and 22.8 MCPS for the decoder, and it takes 11.5 ms in total on the target: 6.75 ms for the encoder and 4.76 ms for the decoder. The codec was tested bit-exactly against the full set of test vectors provided by ITU-T and passed all of them. It also operated well in real time on an Internet phone.
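
The timing figures in this abstract can be turned into a rough real-time budget. The sketch below is only an illustration: it assumes a 20 ms frame period for G.729.1 (the abstract does not state the frame length), and the function names are hypothetical.

```python
# Assumption: G.729.1 frame period of 20 ms; the abstract does not state it.
FRAME_MS = 20.0


def realtime_load(encoder_ms, decoder_ms, frame_ms=FRAME_MS):
    """Fraction of each frame period spent encoding plus decoding."""
    return (encoder_ms + decoder_ms) / frame_ms


def cycles_per_frame(mcps, frame_ms=FRAME_MS):
    """Processor cycles consumed per frame at a given MCPS figure."""
    return mcps * 1e6 * frame_ms / 1000.0


# Figures quoted in the abstract: 6.75 ms encoder, 4.76 ms decoder,
# 31.2 MCPS encoder, 22.8 MCPS decoder.
load = realtime_load(6.75, 4.76)
encoder_cycles = cycles_per_frame(31.2)
decoder_cycles = cycles_per_frame(22.8)

print(f"load per frame: {load:.1%}")
print(f"encoder cycles per frame: {encoder_cycles:.0f}")
print(f"decoder cycles per frame: {decoder_cycles:.0f}")
```

A load below 1.0 means that, under the assumed frame period, encoding and decoding together finish within one frame, which is the condition for real-time operation.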

A Study on PCFBD-MPC in 8kbps (8kbps에 있어서 PCFBD-MPC에 관한 연구)

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.18 no.5 / pp.17-22 / 2017
  • In MPC (Multi Pulse Coding) that uses separate voiced and unvoiced excitation sources, the speech waveform can be distorted. The distortion is caused by normalizing the synthesized voiced waveform when the multi-pulses of the representative section are restored. This paper presents PCFBD-MPC (Position Compensation Frequency Band Division-Multi Pulse Coding), which uses V/UV/S (Voiced/Unvoiced/Silence) switching, position compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced sound using specific frequencies, in order to reduce distortion of the synthesized waveform. I implemented the PCFBD-MPC system and evaluated its SNRseg under an 8 kbps coding condition. The SNRseg of PCFBD-MPC was 13.4 dB for female voice and 13.8 dB for male voice. In future work, I will evaluate the sound quality of an 8 kbps speech coding method that compensates the amplitude and position of the multi-pulse source simultaneously. These methods are expected to be applicable to low-bit-rate speech coding that uses excitation sources, such as in cellular phones or smartphones.
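
For readers unfamiliar with the metric, SNRseg (segmental SNR) is commonly computed as the average of per-frame SNR values in dB. The sketch below shows that common definition only; the frame length of 160 samples, the skipping of silent frames, and the function name `snr_seg` are assumptions and are not taken from the paper.

```python
import math


def snr_seg(reference, decoded, frame_len=160, eps=1e-12):
    """Mean of per-frame SNRs in dB over frames with non-zero signal energy."""
    frame_snrs = []
    n = min(len(reference), len(decoded))
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(x * x for x in ref)
        noise = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal <= eps:
            continue  # skip silent frames so they do not dominate the mean
        frame_snrs.append(10.0 * math.log10(signal / max(noise, eps)))
    if not frame_snrs:
        raise ValueError("no frames with signal energy")
    return sum(frame_snrs) / len(frame_snrs)


# Synthetic check: a 200 Hz tone at 8 kHz, "decoded" with a 10% amplitude error.
fs = 8000
ref = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
dec = [0.9 * x for x in ref]
value = snr_seg(ref, dec)
print(f"SNRseg: {value:.2f} dB")
```

With a uniform 10% amplitude error, the error energy is 1% of the signal energy in every frame, so each frame contributes 20 dB.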

A study on the Institutionalization of Speech-to-text Services for the Deaf People (난청인을 위한 문자통역서비스 제도화 연구)

  • Chun, Dong-Il;Seo, Jeong-Min
    • Journal of Digital Convergence / v.15 no.4 / pp.53-63 / 2017
  • The purpose of this study is to examine how speech-to-text (STT) services are currently used and to explore measures for institutionalizing such services to ease communication for the hearing impaired. The results show the following: 1) 17.8% of those surveyed had experience using STT services, with younger individuals showing a higher rate of use; and 2) among organizations providing STT services, social welfare organizations ranked first, followed by civic groups (18.3%) and public organizations (18.3%). The following institutional measures are needed for STT services. First, STT services should be actively promoted as one of the reasonable conveniences defined in the 'Act on the Prohibition of Discrimination Against Disabled Persons, Remedy Against Infringement of Their Rights, etc.' Second, STT services should be added as one of the clauses of the 'Act on Welfare of Persons with Disabilities'. In particular, establishing a communication system for those with hearing impairments should serve as a catalyst for integration with sign language interpretation and welfare services. If STT services for face-to-face contact can be improved or further enhanced using ICT, this will not only open the way for a new influx of disabled workers into vocational rehabilitation but also help improve quality of life for the hearing impaired.

An ACLMS-MPC Coding Method Integrated with ACFBD-MPC and LMS-MPC at 8kbps bit rate. (8kbps 비트율을 갖는 ACFBD-MPC와 LMS-MPC를 통합한 ACLMS-MPC 부호화 방식)

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.19 no.6 / pp.1-7 / 2018
  • This paper presents an 8 kbps ACLMS-MPC (Amplitude Compensation and Least Mean Square-Multi Pulse Coding) coding method that integrates ACFBD-MPC (Amplitude Compensation Frequency Band Division-Multi Pulse Coding) and LMS-MPC (Least Mean Square-Multi Pulse Coding). It uses V/UV/S (Voiced/Unvoiced/Silence) switching, compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced sound using specific frequencies, in order to reduce distortion of the synthesized waveform. When integrating several methods, it is important to adjust the bit rate of the voiced and unvoiced sound sources to 8 kbps while reducing distortion of the speech waveform. Under this bit-rate constraint, the speech waveform can be synthesized efficiently by restoring the individual pitch intervals from the multi-pulses of the representative interval. I implemented the ACLMS-MPC method and evaluated its SNR under an 8 kbps coding condition. The SNR of ACLMS-MPC was 15.0 dB for female voice and 14.3 dB for male voice. ACLMS-MPC thus improved on the existing MPC, ACFBD-MPC, and LMS-MPC by 0.3 dB to 1.8 dB for male voice and 0.3 dB to 1.6 dB for female voice. These methods are expected to be applicable to low-bit-rate speech coding that uses excitation sources, such as in cellular phones or Internet phones. In future work, I will evaluate the sound quality of a 6.9 kbps speech coding method that compensates the amplitude and position of the multi-pulse source simultaneously.

Voice therapy for pitch problems following thyroidectomy without laryngeal nerve injury (신경학적 손상이 없는 갑상선 술 후 음도문제의 음성치료)

  • Ji-sung Kim;Mi-jin Kim
    • Phonetics and Speech Sciences / v.15 no.3 / pp.53-58 / 2023
  • After thyroidectomy, some patients who show normal vocal cord movement still complain of subjective voice problems, which could lead to a decrease in quality of life related to communication. This study aims to investigate the effectiveness of a newly designed voice therapy applying neck exercise and semi-occluded vocal tract exercise (SOVTE) to improve voice problems after thyroidectomy without neurological injury. For this purpose, voice therapy was randomly assigned to 10 women who received thyroidectomy. Acoustic analysis [fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, min Voice Range Profile (VRP), max VRP, VRP] was performed before and after surgery and immediately after voice therapy to compare voice changes. The study showed a statistically significant increase in max VRP and VRP after voice therapy compared to before surgery. These results suggest that the voice therapy methods in this study effectively improve a major symptom of voice problems after thyroidectomy, specifically the reduction in the high-frequency range. However, this study was limited in the number of participants and did not control for the type of surgery. Therefore, further research utilizing larger sample sizes and controlled variables is needed to investigate the long-term effects of voice therapy.

Improved ErtPS Scheduling Algorithm for AMR Speech Codec with CNG Mode in IEEE 802.16e Systems (IEEE 802.16e 시스템에서의 CNG 모드 AMR 음성 코덱을 위한 개선된 ErtPS 스케줄링 알고리즘)

  • Woo, Hyun-Je;Kim, Joo-Young;Lee, Mee-Jeong
    • The KIPS Transactions:PartC / v.16C no.5 / pp.661-668 / 2009
  • The Extended real-time Polling Service (ErtPS) was proposed to support QoS for VoIP services with silence suppression, which generate variable-size data packets, in IEEE 802.16e systems. When silence is suppressed, VoIP should support Comfort Noise Generation (CNG), which generates comfort noise at the receiver so that the user perceives that the connection is still alive. In the silent period, CNG mode generates data at a lower bit rate and at longer packet transmission intervals than in a talk-spurt. Therefore, if ErtPS, which is designed to support service flows that generate data packets on a periodic basis, is applied to the silent period, uplink resources are used inefficiently. In this paper, we propose the Improved ErtPS algorithm for efficient resource utilization during the silent period of VoIP traffic that supports CNG. In the proposed algorithm, the user informs the base station of changes in voice status, and the base station allocates bandwidth at the appropriate interval according to that status. The Improved ErtPS uses the Channel Quality Information Channel (CQICH), an uplink subchannel that periodically delivers channel quality information to the base station in 802.16e systems. We evaluated the performance of the proposed algorithm using the OPNET simulator and confirmed that it improves uplink bandwidth utilization and packet transmission latency.
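
The inefficiency described in this abstract can be illustrated with a toy calculation. This is not the paper's algorithm or its OPNET model; it only compares a grant schedule that keeps the talk-spurt interval during silence with one that follows the reported voice status. All packet sizes, intervals, and names below are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceProfile:
    # All values are illustrative assumptions, not taken from the paper.
    talk_interval_ms: int = 20   # packet interval during a talk-spurt
    sid_bytes: int = 7           # size of one comfort-noise packet
    sid_interval_ms: int = 160   # packet interval during a silent period


def periodic_grants(silence_ms, p):
    """Bytes granted in silence when the grant interval stays at the talk-spurt period."""
    return (silence_ms // p.talk_interval_ms) * p.sid_bytes


def status_aware_grants(silence_ms, p):
    """Bytes granted in silence when the grant interval follows the reported voice status."""
    return (silence_ms // p.sid_interval_ms) * p.sid_bytes


def sent_bytes(silence_ms, p):
    """Bytes the terminal actually sends during the silent period."""
    return (silence_ms // p.sid_interval_ms) * p.sid_bytes


profile = VoiceProfile()
silence_ms = 1600  # length of one silent period, chosen for illustration
granted_periodic = periodic_grants(silence_ms, profile)
granted_aware = status_aware_grants(silence_ms, profile)
sent = sent_bytes(silence_ms, profile)

print(f"periodic grants:     {granted_periodic} B, utilization {sent / granted_periodic:.3f}")
print(f"status-aware grants: {granted_aware} B, utilization {sent / granted_aware:.3f}")
```

Under these assumed numbers, most of the periodically granted uplink bytes go unused during silence, which is the waste the abstract attributes to applying ErtPS unchanged to the silent period.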

A Study on the Improve of Speech Quality Using the Dual Tx Ant. (Dual Tx Ant.를 이용한 통화품질 개선에 관한 연구)

  • Kim, Song-Min;Kim, Ihn-Hwan
    • Journal of the Korean Institute of Telematics and Electronics T / v.35T no.2 / pp.72-80 / 1998
  • Most cell sites are currently operated as 3-sector systems. Because of weak signal propagation caused by particular topography, weak-signal or shadow regions exist. Solving this problem requires a double beam within a sector, which means designing a 4-sector cell site even though investment and maintenance costs rise. We propose using a dual TX-ANT instead of designing a 4-sector cell site, in order to reduce investment cost and improve call quality. We chose the YangPeong and Gepo cell sites in Seoul and the DuckCheon cell site in Pusan for the experiments, and we verified the practicality of the approach by comparing data from before and after the installation of the dual TX-ANT.

Decoding Brain States during Auditory Perception by Supervising Unsupervised Learning

  • Porbadnigk, Anne K.;Gornitz, Nico;Kloft, Marius;Muller, Klaus-Robert
    • Journal of Computing Science and Engineering / v.7 no.2 / pp.112-121 / 2013
  • Recent years have seen a rise of interest in using electroencephalography-based brain-computer interfacing methodology to investigate non-medical questions, beyond the purpose of communication and control. One of these novel applications is to examine how signal quality is processed neurally, which is of particular interest to industry, besides providing neuroscientific insights. As in most behavioral experiments in the neurosciences, the assessment of a given stimulus by a subject is required. Based on an EEG study on the speech quality of phonemes, we first discuss the information contained in the neural correlate of this judgement. Typically, this is done by analyzing the data along behavioral responses/labels. However, participants in such complex experiments often guess at the threshold of perception. This leads to labels that are only partly correct, and oftentimes random, which is a problematic scenario for supervised learning. Therefore, we propose a novel supervised-unsupervised learning scheme, which aims to differentiate true labels from random ones in a data-driven way. We show that this approach provides a crisper view of the brain states that experimenters are looking for, besides discovering additional brain states to which the classical analysis is blind.

Analysis and Synthesis of Audio Signals using a Sinusoidal Model with Psychoacoustic Criteria (정현파 모델을 이용한 오디오 신호의 심리음향적 분석 및 합성)

  • 남승현;강경옥;홍진우
    • The Journal of the Acoustical Society of Korea / v.18 no.2 / pp.77-82 / 1999
  • A sinusoidal model has been widely used in the analysis and synthesis of speech and audio signals and has become one of the efficient candidates for high-quality, low-bit-rate audio coders. One of the crucial steps in analysis and synthesis with a sinusoidal model is the detection of tonal components. This paper proposes an efficient method for the analysis and synthesis of audio signals using a sinusoidal model, which uses psychoacoustic criteria such as the masking effect, the masking index, and JNDf (Just Noticeable Difference in Frequency). Simulation results show that the proposed method significantly reduces the number of sinusoids without degrading the quality of the synthesized audio signals.

Impact of experience on government policy toward acceptance of Hydrogen fuel cell vehicles (정부정책에 대한 경험이 수소 연료전지 자동차의 수용에 미치는 영향)

  • Gang, Min-Jeong;Park, Hui-Jun
    • Proceedings of the Korean Society for Quality Management Conference / 2010.04a / pp.465-470 / 2010
  • The Korean government declared "low carbon, green growth" through green technologies and clean energy to be the new national vision for the next 60 years (President's Liberation Day speech on Aug. 15, 2008). The succeeding "Green New Deal" plan involves nine core projects, including energy saving, recycling, and clean energy development. Hydrogen fuel cell vehicles, which use electricity from the chemical reaction of hydrogen and oxygen, release water as a by-product of that reaction instead of emitting harmful particulates and gases such as NOX, SOX and CO2. For this reason, hydrogen fuel cell vehicles and their technology are drawing public attention as one of the sensible solutions for accomplishing the "low carbon, green growth" agenda. Nevertheless, there are few opportunities for people to gain practical experience of hydrogen fuel cell vehicles. New products made with advanced technology, including hydrogen fuel cell vehicles, sometimes cannot penetrate the market when they face public skepticism that stems from a lack of knowledge and experience. That is why not only cost-benefit analyses and scientific risk assessments but also public acceptance studies of hydrogen fuel cell vehicles have to be performed [Schulte, 2004]. This research addresses the need for a comprehensive study of the factors influencing public acceptance of hydrogen fuel cell cars, focusing specifically on how personal experience related to governmental science and technology policy affects public acceptance.
