• Title/Summary/Keyword: Linear Predictive Coding (선형 예측 부호화)

Speaker Recognition using LPC cepstrum Coefficients and Neural Network (LPC 켑스트럼 계수와 신경회로망을 사용한 화자인식)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.12 / pp.2521-2526 / 2011
  • This paper proposes a speaker recognition algorithm using a perceptron neural network and LPC (Linear Predictive Coding) cepstrum coefficients. The proposed algorithm first detects the voiced sections in each frame. LPC cepstrum coefficients, which carry speaker characteristics, are then obtained by linear predictive analysis of the detected voiced sections. A neural network is trained on these coefficients to classify them. In the experiments, the performance of the proposed algorithm was evaluated using the speech recognition rates based on the LPC cepstrum coefficients and the neural network.
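
The feature extraction step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion and the standard LPC-to-cepstrum recursion, and it leaves out voiced-section detection and the neural network. The function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err) where a[1..order] are predictor coefficients for
    s[n] ~ sum_k a[k] * s[n-k] (a[0] is unused) and err is the final
    prediction error energy. Assumes r[0] > 0 (a non-silent frame).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1 / (1 - sum_k a[k] z^-k).

    The gain term c[0] is omitted.
    """
    order = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n] if n <= order else 0.0
        for k in range(max(1, n - order), n):
            acc += (k / n) * c[k] * a[n - k]
        c[n] = acc
    return c[1:]
```

For a voiced frame, a feature vector would be obtained by chaining the three functions: `lpc_to_cepstrum(levinson_durbin(autocorrelation(frame, p), p)[0], n_ceps)`, with the order `p` and the number of cepstral coefficients `n_ceps` chosen by the user.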

A Study of BWE-Prediction-Based Split-Band Coding Scheme (BWE 예측기반 대역분할 부호화기에 대한 연구)

  • Song, Geun-Bae;Kim, Austin
    • The Journal of the Acoustical Society of Korea / v.27 no.6 / pp.309-318 / 2008
  • In this paper, we discuss a method for efficiently coding the high-band signal in the split-band coding approach, where an input signal is divided into two bands and each band may be encoded separately. It is well known, especially from research on artificial bandwidth extension (BWE), that the two bands are correlated to some degree, so some coding gain can be achieved by exploiting this correlation. In the BWE-prediction-based coding approach, a simple linear BWE function may not yield optimal results because the correlation is non-linear. We investigate this coding scheme in more detail. A few representative BWE functions, both linear and non-linear, are examined and compared to find one suitable for coding. We also discuss whether additional gain can be obtained by combining the BWE coder with a predictive vector quantizer that exploits temporal correlation.
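
The idea of BWE-prediction-based coding can be illustrated with the simplest case the abstract mentions, a linear BWE function. The sketch below is an assumption-laden toy: it uses one scalar feature per frame for each band (for example a band energy), a least-squares linear predictor, and a uniform scalar quantizer for the residual. None of these choices are taken from the paper, and the function names are hypothetical.

```python
def fit_linear_bwe(low, high):
    """Least-squares fit of high ~ w * low + b on training pairs."""
    n = len(low)
    mean_l = sum(low) / n
    mean_h = sum(high) / n
    var = sum((x - mean_l) ** 2 for x in low)
    cov = sum((x - mean_l) * (y - mean_h) for x, y in zip(low, high))
    w = cov / var
    return w, mean_h - w * mean_l


def encode_high_band(low, high, w, b, step):
    """Quantize only the part of the high band the low band cannot predict."""
    return [round((h - (w * l + b)) / step) for l, h in zip(low, high)]


def decode_high_band(low, indices, w, b, step):
    """Rebuild the high band from the decoded low band plus the residual."""
    return [(w * l + b) + q * step for l, q in zip(low, indices)]
```

The stronger the correlation between the bands, the smaller the residual indices and the fewer bits they need. A non-linear BWE function would replace `w * l + b` in both the encoder and the decoder.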

A study on motion prediction and subband coding of moving pictures using GRNN (GRNN을 이용한 동영상 움직임 예측 및 대역분할 부호화에 관한 연구)

  • Han, Young-Oh
    • The Journal of the Korea institute of electronic communication sciences / v.5 no.3 / pp.256-261 / 2010
  • In this paper, a new nonlinear predictor using a general regression neural network (GRNN) is proposed for the subband coding of moving pictures. The performance of the proposed nonlinear predictor is compared with BMA (Block Match Algorithm), the most conventional motion estimation technique. The nonlinear predictor using GRNN predicts 2-3 dB better than BMA. In particular, because it includes a clustering process and smooths noise signals, the predictor preserves edges well in frames after predicting the subband signal. This result is important with respect to the human visual system and indicates excellent performance for the subband coding of moving pictures.
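
For readers unfamiliar with GRNN, its core is a Gaussian-kernel weighted average of stored training targets. The sketch below shows only that core, with a hypothetical function name and a single smoothing parameter `sigma`. The clustering step mentioned in the abstract and the way the paper builds its input vectors from subband samples are not shown.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """GRNN output: kernel-weighted mean of the training targets.

    x        -- query vector (sequence of floats)
    train_x  -- list of stored pattern vectors
    train_y  -- list of scalar targets, one per pattern
    sigma    -- smoothing parameter (kernel width), chosen by the user
    """
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(weights)  # underflows to 0 if x is far from every pattern
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

A small `sigma` makes the predictor follow the nearest stored pattern, while a large one averages over many patterns and smooths noise.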

Design of a Lossless Audio Coding Using Cholesky Decomposition and Golomb-Rice Coding (콜레스키 분해와 골롬-라이스 부호화를 이용한 무손실 오디오 부호화기 설계)

  • Cheong, Cheon-Dae;Shin, Jae-Ho
    • Journal of Korea Multimedia Society / v.11 no.11 / pp.1480-1490 / 2008
  • Designing a linear predictor and matching it with an entropy coder is the art of lossless audio coding. In this paper, we use the covariance method and the Cholesky decomposition to calculate the linear prediction coefficients, instead of the autocorrelation method and the Levinson-Durbin recursion. The results are compared with those of a polynomial predictor, and of the two, the predictor with the smaller prediction error is selected. For entropy coding, we use a Golomb-Rice coder with the block-based parameter estimation method and with the sequential adaptation methods of LOCO-I and RLGR. The proposed predictor with block-based parameter estimation improves the compression ratio by $2.2879\%{\sim}0.3413\%$ compared with the FLAC lossless audio coder, which uses the autocorrelation method and the Levinson-Durbin recursion. The proposed predictor with the LOCO-I adaptation method improves it by $2.2879\%{\sim}0.3413\%$. The proposed predictor with the RLGR adaptation method, however, gave better results only for specific signals.
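
The entropy coding stage can be sketched as a plain Golomb-Rice coder for signed prediction residuals. This is a generic illustration, not the coder from the paper: the zigzag mapping of signed values, the bit-string output, and the mean-based parameter rule in `estimate_rice_parameter` are common conventions assumed here, and the LOCO-I and RLGR adaptation rules are not shown.

```python
import math


def zigzag(v):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2,... -> 0,1,2,3,..."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Encode residuals as a '0'/'1' string: unary quotient, '0', k-bit remainder."""
    out = []
    for v in values:
        u = zigzag(v)
        q, r = u >> k, u & ((1 << k) - 1)
        out.append("1" * q + "0" + (format(r, "0{}b".format(k)) if k else ""))
    return "".join(out)


def rice_decode(bits, k, count):
    """Decode `count` residuals produced by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating '0'
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((q << k) | r))
    return values


def estimate_rice_parameter(block):
    """Heuristic block-based estimate: k ~ floor(log2(mean mapped value))."""
    mean = sum(zigzag(v) for v in block) / len(block)
    return int(math.log2(mean)) if mean >= 1 else 0
```

In a block-based scheme of this kind, the encoder would compute `k` once per block of residuals, transmit it as side information, and code the block with that `k`.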

Adaptive Multi-view Video Interpolation Method Based on Inter-view Nonlinear Moving Blocks Estimation (시점 간 비선형 움직임 블록 예측에 기초한 적응적 다시점 비디오 보상 보간 기법)

  • Kim, Jin-Soo
    • The Journal of the Korea Contents Association / v.14 no.4 / pp.9-18 / 2014
  • Recently, much research has focused on multi-view video applications and services such as wireless video surveillance networks, wireless video sensor networks, and wireless mobile video. In multi-view video signal processing, exploiting the strong correlation between images acquired by different cameras plays a major role in developing core techniques for multi-view video coding. This paper proposes an adaptive multi-view video interpolation technique that is applicable to multi-view distributed video coding and requires no cooperation among the cameras. The proposed algorithm estimates the non-linear moving blocks, employs disparity-compensated view prediction, and then fills in the unreliable blocks. Computer simulations show that the proposed method outperforms conventional methods.

Asymmetric Motion Vector-Based Side Information Generation for Efficient Distributed Video Coding (효과적인 분산 비디오 부호화를 위한 비대칭성 움직임 벡터 기반 보조정보 생성 방법)

  • Na, Taeyoung;Kim, Munchurl
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.11a / pp.129-131 / 2010
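  • Distributed video coding, a representative application of distributed source coding, shifts coding complexity from the encoder to the decoder and is therefore well suited to low-power encoding environments. This paper proposes an effective method for generating side information, the most important factor in improving distributed video coding performance. First, to address the mispredictions caused by the linear-motion assumption that existing methods generally make when estimating block motion between key frames, block motion is estimated using two or more key frames, and the block motion trajectory through the side information is then estimated by linear regression. In doing so, the key frame index used for motion estimation is incremented while frames are stored in and removed from a first-in-first-out (FIFO) buffer, so that several motion vector fields corresponding to the same side information are generated together with the conventional motion vector field that assumes linear motion. Next, for each block in the side information frame to be interpolated, the motion vector field that passes closest to the block is chosen, and its vector becomes the final motion vector of that block. Experimental results confirm that, compared with the conventional method, the proposed side information generation method increases the average PSNR by 0.216 dB through the use of asymmetric motion vectors alone.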
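
The trajectory estimation step can be pictured with a small sketch. It assumes that a block has been tracked to one (x, y) position in each of several key frames, fits a separate least-squares line to each coordinate over time, and evaluates the lines at the time of the side information frame. The data layout and function names are hypothetical. The FIFO buffering and the selection among motion vector fields are not shown.

```python
def fit_line(ts, vs):
    """Least-squares line v ~ slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    denom = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / denom
    return slope, mean_v - slope * mean_t


def predict_block_position(key_times, positions, t_side):
    """Estimate where a tracked block lies at time t_side.

    key_times -- frame times of the key frames (at least two distinct values)
    positions -- (x, y) of the tracked block in each key frame
    t_side    -- time of the side information frame to interpolate
    """
    sx, bx = fit_line(key_times, [p[0] for p in positions])
    sy, by = fit_line(key_times, [p[1] for p in positions])
    return sx * t_side + bx, sy * t_side + by
```

With exactly two key frames this reduces to the usual linear interpolation. With more key frames the fitted line averages out the measurement noise of individual motion estimates.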

An Efficient Predictive-SBR Implementation (효율적인 예측 SBR 구현)

  • Heo, So-Young;Kim, Rin-Chul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2008.02a / pp.109-112 / 2008
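  • This paper proposes Predictive-SBR to improve the efficiency of the SBR encoder in MPEG-4 HE-AAC. Combined with the core encoder, the SBR encoder makes it possible to reconstruct high-frequency components with a small number of bits. We aim to raise efficiency by improving the coding of the envelope information, which accounts for about 70% of the SBR data. Conventional SBR transmits the envelope information as follows. First, the energy of the high-frequency band is computed per scale-factor band. Then, to reduce the amount of transmitted information, the energy information is encoded with delta coding. To reduce the SBR envelope information effectively, this paper instead predicts the high-frequency band energy. Assuming that the input data of the SBR encoder is identical to that of the SBR decoder, the high-frequency band energy is estimated by linear regression. The error between the estimated energy and the original high-frequency band energy is then encoded with delta coding. For transmission, whichever of the two requires fewer bits is selected: the delta code of the high-frequency band energy or the delta code of the error computed by Predictive-SBR. As a result, the amount of information is reduced by about 10%.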
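
The mode decision described at the end of this abstract can be sketched as follows. The sketch assumes integer (already quantized) band energies, takes the predicted energies as an input rather than computing them by linear regression, and uses a hypothetical bit-cost model with the code lengths of a signed Exp-Golomb code. The real SBR bitstream uses Huffman tables, which are not reproduced here.

```python
def delta_code(values):
    """First value as-is, then differences between neighbouring bands."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def bit_cost(symbols):
    """Hypothetical cost model (signed Exp-Golomb lengths), not SBR Huffman tables."""
    return sum(1 + 2 * abs(s).bit_length() for s in symbols)


def choose_envelope_coding(energies, predicted):
    """Pick the cheaper of direct delta coding and delta coding of the prediction error.

    energies  -- quantized high-band energy per scale-factor band
    predicted -- energies estimated from data the decoder also has
    Returns (mode, symbols); the mode flag must be sent to the decoder.
    """
    direct = delta_code(energies)
    residual = delta_code([e - p for e, p in zip(energies, predicted)])
    if bit_cost(residual) < bit_cost(direct):
        return "predictive", residual
    return "direct", direct
```

Because the decision falls back to the conventional path whenever prediction does not help, the predictive mode can only cost the extra mode flag in the worst case.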

Medical Image Compression in the Wavelet Transform Domain (Wavelet 변환 영역에서 의료영상압축)

  • 이상복;신승수
    • The Journal of the Korea Contents Association / v.2 no.4 / pp.23-29 / 2002
  • This paper proposes an image compression method needed for PACS processing in medical information systems. The method uses a linear predictor and a Lloyd-Max quantizer in the wavelet transform domain. The wavelet transform provides multi-resolution processing by dividing the image into 10 sub-bands over 3 levels. The low-frequency sub-band, to which the human visual system is sensitive, is encoded by DPCM, a lossless encoding method, and a Lloyd-Max quantizer, chosen as the optimal quantizer for reducing ringing and aliasing between sub-bands, is used for the remaining high-frequency sub-bands. Experiments on 512$\times$152 abdominal CT and chest input images show that the decompressed images are of good quality, with a PSNR of 28.53 dB.
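
The Lloyd-Max quantizer applied to the high-frequency sub-bands can be designed with the classic two-step iteration: place decision boundaries midway between reconstruction levels, then move each level to the mean of the samples in its cell. The sketch below is a generic one-dimensional version trained on sample data, with hypothetical function names and a fixed iteration count. It is not the paper's implementation.

```python
def lloyd_max(samples, levels, iterations=50):
    """Design a scalar quantizer; returns the sorted reconstruction levels."""
    lo, hi = min(samples), max(samples)
    # Start from uniformly spaced levels over the sample range.
    codebook = [lo + (i + 0.5) * (hi - lo) / levels for i in range(levels)]
    for _ in range(iterations):
        # Nearest-neighbour condition: boundaries are midpoints between levels.
        bounds = [(a + b) / 2 for a, b in zip(codebook, codebook[1:])]
        cells = [[] for _ in range(levels)]
        for s in samples:
            cells[sum(s > b for b in bounds)].append(s)
        # Centroid condition: each level becomes the mean of its cell.
        codebook = [sum(c) / len(c) if c else codebook[i]
                    for i, c in enumerate(cells)]
    return codebook


def quantize(value, codebook):
    """Index of the reconstruction level nearest to value."""
    return min(range(len(codebook)), key=lambda i: abs(value - codebook[i]))
```

The encoder would transmit the index returned by `quantize` for each coefficient, and the decoder would look the index up in the same codebook.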

Three-dimensional Texture Coordinate Coding Using Texture Image Rearrangement (텍스처 영상 재배열을 이용한 삼차원 텍스처 좌표 부호화)

  • Kim, Sung-Yeol;Ho, Yo-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.43 no.6 s.312 / pp.36-45 / 2006
  • Three-dimensional (3-D) texture coordinates are the position information of the texture segments that are mapped onto polygons in a 3-D mesh model. To compress texture coordinates, previous works reused the same linear predictor that had already been employed to code geometry data. However, these approaches could not carry out linear prediction efficiently because texture coordinates are discontinuous along the coding order. The discontinuities become even more serious in 3-D mesh models that include a non-atlas texture. In this paper, we propose a new scheme to code 3-D texture coordinates using texture image rearrangement. The proposed coding scheme first extracts texture segments from a texture. We then rearrange the texture segments consecutively along the coding order and apply linear prediction to compress the texture coordinates. Since the proposed scheme minimizes discontinuities in the texture coordinates, it improves their coding efficiency. Experimental results show that the proposed scheme outperforms the MPEG-4 3DMC standard in terms of coding efficiency.

Efficient Fast Multiple Reference Frame Selection Technique for H.264/AVC (H.264/AVC에서의 효율적인 고속 다중 참조 프레임 선택 기법)

  • Lee, Hyun-Woo;Ryu, Jong-Min;Jeong, Je-Chang
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.10C / pp.820-828 / 2008
  • To achieve high coding efficiency, the H.264/AVC video coding standard adopts techniques such as variable block size coding, motion estimation with quarter-pel precision, multiple reference frames, and rate-distortion optimization. However, these coding methods greatly increase the complexity of motion estimation. In particular, with multiple reference frame motion estimation, the computational burden increases in proportion to the number of reference frames searched. We therefore propose a method that reduces the complexity by controlling the number of reference frames searched in motion estimation. The proposed algorithm uses the optimal reference frame information from both the $P16{\times}16$ mode and the adjacent blocks, and thus omits unnecessary searches in the remaining inter modes. Experimental results show that the proposed method saves an average of 57.31% of the coding time with negligible differences in quality and bit rate. The method can also be combined with any existing motion estimation algorithm, so additional performance improvement can be obtained.