Search | Korea Science

Wideband Speech Coding Algorithm with Application of Wavelet Transform (웨이브렛 변환을 적용한 광대역 음성부호화 알고리즘)

이승원;배건성
- The Journal of the Acoustical Society of Korea / v.21 no.5 / pp.462-470 / 2002
Wideband speech, with a bandwidth of 50 to 7000 Hz, sounds more natural and intelligible and is less tiring to listen to than narrowband speech, which has a bandwidth of 300 to 3400 Hz. Wideband speech coders, however, have not been as successful as narrowband speech coders because of their higher bit rates. In this paper, we propose a new wideband speech coder that combines the European narrowband speech coding standard, GSM-EFR, with a transform coder based on the discrete wavelet transform. The proposed coder operates as follows: the input speech is first split into two subbands of equal bandwidth, and each subband signal is encoded and decoded by its own subband coder. GSM-EFR serves as the lower subband coder, and a coder operating on wavelet-transformed speech is designed for the upper subband. The total bit rate of the proposed coder is 18.9 kbps (12.2 kbps for the lower band and 6.7 kbps for the upper band), and informal listening tests show that its speech quality is comparable to that of G.722 at 56 kbps.
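The two-band structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' filterbank: the Haar filter pair is swapped in only because it is the simplest two-band split with perfect reconstruction, and the function names `haar_split` and `haar_merge` are hypothetical. The bit rates are the ones quoted in the abstract.

```python
import math

def haar_split(x):
    """Split an even-length signal into low and high subbands, each decimated by 2.

    Assumption: the Haar pair stands in for whatever analysis filters the paper uses.
    """
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    """Invert haar_split by rebuilding each pair of even and odd samples."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for l, h in zip(low, high):
        x.append((l + h) * s)  # even-indexed sample
        x.append((l - h) * s)  # odd-indexed sample
    return x

# Bit rates quoted in the abstract (kbps).
LOWER_BAND_KBPS = 12.2  # GSM-EFR on the lower subband
UPPER_BAND_KBPS = 6.7   # wavelet-based coder on the upper subband
total_kbps = round(LOWER_BAND_KBPS + UPPER_BAND_KBPS, 1)

# Toy input: 16 samples of a sinusoid (illustrative only).
signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(16)]
low, high = haar_split(signal)
rebuilt = haar_merge(low, high)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Each subband carries half as many samples as the input, so each can be handed to its own coder at half the original sampling rate. In the sketch no coding is applied, which is why the merged signal matches the input to within floating-point error.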

Multi Mode Harmonic Transform Coding for Speech and Music

Kim, Jonghark;Shin, Jae-Hyun;Lee, Insung
- The Journal of the Acoustical Society of Korea / v.22 no.3E / pp.101-109 / 2003
A multi-mode harmonic transform coding (MMHTC) scheme for speech and music signals is proposed. It is structured as a linear prediction model driven by harmonic and transform-based excitation. The proposed coder also uses harmonic prediction and an improved quantizer for the excitation signal. To quantize the excitation of music signals efficiently, the modulated lapped transform (MLT) is introduced. In other words, the coder combines time-domain (linear prediction) and frequency-domain techniques to achieve the best perceptual quality. At a bit rate of 4 kbps, the proposed coder showed better speech quality than the 8 kbps QCELP coder.

Design of video encoder using Multi-dimensional DCT (다차원 DCT를 이용한 비디오 부호화기 설계)

Jeon, S.Y.;Choi, W.J.;Oh, S.J.;Jeong, S.Y.;Choi, J.S.;Moon, K.A.;Hong, J.W.;Ahn, C.B.
- Journal of Broadcast Engineering / v.13 no.5 / pp.732-743 / 2008
In H.264/AVC, a 4$\times$4 block transform is used for intra and inter prediction instead of an 8$\times$8 block transform. By coding with small blocks, H.264/AVC obtains high temporal prediction efficiency, but it is limited in how well it can exploit spatial redundancy. Motivated by these points, we propose a multi-dimensional transform that achieves both accurate temporal prediction and effective use of spatial redundancy. In preliminary experiments, the proposed multi-dimensional transform achieves higher energy compaction than the 2-D DCT used in H.264. We designed an integer-based transform and quantization coder for the multi-dimensional coder. Several additional methods for the multi-dimensional coder are also proposed: cube forming, scan order, mode decision, and parameter updating. The Context-based Adaptive Variable-Length Coding (CAVLC) used in H.264 was employed as the entropy coder. Simulation results show that the performance of the multi-dimensional codec is similar to that of H.264 at lower bit rates, although the rate-distortion curves of the multi-dimensional DCT measured by entropy and by the number of non-zero coefficients show remarkably higher performance than those of H.264/AVC. This implies that a more efficient entropy coder optimized for the statistics of multi-dimensional DCT coefficients, together with rate-distortion operation, is needed to take full advantage of the multi-dimensional DCT. Many issues and much future work remain in improving the coding efficiency of the multi-dimensional coder over H.264/AVC.
https://doi.org/10.5909/JBE.2008.13.5.732

Efficient Transform Coefficient Coding for the HEVC Intra Frame Coder (HEVC 화면내 부호기를 위한 효율적인 변환 계수 부호화 방법)

Choi, Jung A;Ho, Yo Sung
- Smart Media Journal / v.1 no.2 / pp.6-11 / 2012
In the HEVC standard, transform coefficient coding directly affects the output bitstream and is a core part of the encoder; it comprises coefficient scanning and entropy coding. Recently, the JCT-VC (Joint Collaborative Team on Video Coding) advanced HEVC to the Committee Draft (CD) stage. In this paper, we explain HEVC transform coefficient coding and propose an efficient transform coefficient coding method that considers the statistics of transform coefficients in the intra frame coder. The proposed method reduces BD-Rate by up to 0.74% compared to the conventional HEVC transform coefficient coding.

Design of Low Bits Rate Transform Excitation Wide Band Speech and Audio Coder of Analysis-by-Synthesis Structure (분석/합성 구조의 저 전송률 변환여기 광대역 음성/오디오 부호화기 설계)

Jang, Sunghoon;Hong, Kibong;Lee, Insung
- The Journal of the Acoustical Society of Korea / v.31 no.7 / pp.472-479 / 2012
This paper aims to design a 9.2 kbps low-bit-rate transform excitation coder targeting voice and audio signals. To achieve the low bit rate, we use band selection in the frequency domain, gain-shape quantization, and an analysis-by-synthesis (AbS) structure. To reduce the heavy computation of the AbS structure, we apply the IDFT and synthesis to each band separately. We also improve performance in the non-transmitted bands by inserting comfort noise. The proposed coder has a low bit rate and performance similar to that of the original 10.4 kbps AMR-WB+ TCX mode.
https://doi.org/10.7776/ASK.2012.31.7.472

Design of a Variable Bit Rate Speech Coder Based on One-dimensional SPIHT (1차원 SPIHT를 이용한 가변 비트율 음성 부호기의 설계)

Na, Hoon;Jeong, Dae-Gwon
- The Journal of the Acoustical Society of Korea / v.22 no.6 / pp.443-451 / 2003
Because a codebook-based CELP coder models its excitation signal according to one of several bit rates pre-assigned to its codebooks and synthesizes the speech signal from those codebooks, a single encoder cannot encode speech at an arbitrary bit rate. The proposed variable bit rate speech coder encodes the excitation signal according to the bit rate assigned to the current speech frame, using one-dimensional SPIHT and the wavelet transform. It also does not need to model the excitation signal (or codebook) as one of a few types, as a CELP coder does, and it can encode the excitation signal at various bit rates according to user requirements without exact pitch information. As a result, since the coder has no codebook structure, it has relatively low complexity and provides speech quality equal to or better than that of the G.729 and G.723.1 coders.

SPIHT Video Coder Using Perceptual Weight in Wavelet transform (웨이브릿 변환에서 인지적 가중치를 이용한 SPIHT 비디오 부호기)

정용재;강경원;문광석
- Journal of the Institute of Convergence Signal Processing / v.3 no.1 / pp.15-20 / 2002
Intra-frame coding in a video coder has a large influence on overall picture quality. Standardized video coders use the DCT, which can produce low image quality at low bit rates because of blocking effects. This paper proposes a video coding method that improves image quality as perceived by the human visual system. In the proposed method, the perceptual weight is applied to the frame, coding is performed with SPIHT and VLC, and visual noise is eliminated.

Design of Wideband Speech Coder Compatible with CS-ACELP (CS-ACELP와 호환성을 갖는 광대역 음성 부호화기 설계)

김동주;이인성
- The Journal of the Acoustical Society of Korea / v.19 no.4 / pp.52-57 / 2000
In this paper, we design a 16 kbps speech coder that is compatible with the CS-ACELP algorithm (G.729). The speech signal is sampled at 16 kHz, divided into two narrowband signals by a QMF filterbank, and decimated to 8 kHz. The lower-band signal is encoded by CS-ACELP and the upper-band signal by the Adaptive Transform Coding (ATC) algorithm. At the receiver, the two band signals are decoded by the CS-ACELP and ATC decoders, respectively. The reconstructed output is obtained by passing them through the QMF synthesis bank. The proposed wideband coder is compared with the ITU-T G.722 coder in a Mean Opinion Score (MOS) test.

The wavelet image coder based on the embedded microprocessor (임베디드 마이크로 프로세서 기반의 웨이블릿 영상 부호화기)

Park, Sung-Wook;Kim, Young-Bong;Park, Jong-Wook
- The Transactions of the Korean Institute of Electrical Engineers P / v.51 no.4 / pp.198-205 / 2002
In this paper, we propose a wavelet image coder for a portable embedded microprocessor. The proposed coder stores the bit-level information of each wavelet coefficient in a 2D significance array. Using this information, the coder performs the significance check for coefficients and the bit-level scan in the same pass. The advantage of the proposed method is that it reduces both the number of scan iterations and the memory used by the coding process. Experimental results show that the proposed method outperforms popular image coders such as JPEG, EZW, and SPIHT in a portable embedded system environment.
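As a rough illustration of the idea of a 2D array holding each coefficient's bit level, the following sketch precomputes the position of the most significant bit of every integer coefficient. The abstract does not specify the array layout or the scan, so the names `bit_level`, `significance_array`, and `is_significant` and the sample coefficients are all hypothetical.

```python
def bit_level(c):
    """Number of bits needed for |c| (0 for a zero coefficient)."""
    return abs(c).bit_length()

def significance_array(coeffs):
    """Build a 2D array holding the bit level of every coefficient."""
    return [[bit_level(c) for c in row] for row in coeffs]

def is_significant(levels, r, c, n):
    """True when |coefficient| >= 2**n, decided from the stored bit level alone."""
    return levels[r][c] > n

# Toy 2x2 block of quantized wavelet coefficients (illustrative values only).
coeffs = [[34, -5], [0, 9]]
levels = significance_array(coeffs)
```

With the bit levels stored once, a significance test against the threshold 2**n becomes a single integer comparison, so the coefficient magnitudes do not have to be examined again on every bit-plane pass.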

An Efficient Pitch Estimation for IMBE (Improved Multi-band Excitation) Speech Coder (개량형 다중대역 여기 (IMBE: Improved Multi-band Excitation) 음성 부호기의 피치 예측 개선)

Na, Hoon;Jeong, Dae-Gwon
- The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.34-41 / 2001
In an IMBE (Improved Multi-band Excitation) speech coder, initial pitch estimation takes up most of the coder's total computing time because of its complex cost function and exhaustive search over candidate pitches. Its use of future frames also causes unavoidable delay, which makes a real-time coder difficult to implement. Furthermore, unvoiced frames undergo the same pitch estimation as voiced frames even though it is unnecessary. In this paper, each frame is classified as voiced or unvoiced by the Dyadic Wavelet Transform (DyWT), and initial pitch estimation is then performed only for voiced frames. Different pitch estimation algorithms are thus used for voiced and unvoiced frames, reducing the delay at the transmitter and receiver. Simulation results show that the relative complexity of initial pitch estimation is reduced by 23%, and the processing time falls to 1/10 to 1/11 of that of the IMBE coder, while speech quality is largely maintained.
