• Title/Abstract/Keywords: feature coding

Search Result 203, Processing Time 0.029 seconds

Classification of Transient Signals in Ocean Background Noise Using Bayesian Classifier (베이즈 분류기를 이용한 수중 배경소음하의 과도신호 분류)

  • Kim, Ju-Ho;Bok, Tae-Hoon;Paeng, Dong-Guk;Bae, Jin-Ho;Lee, Chong-Hyun;Kim, Seong-Il
    • Journal of Ocean Engineering and Technology
    • /
    • v.26 no.4
    • /
    • pp.57-63
    • /
    • 2012
  • In this paper, a Bayesian classifier based on PCA (principal component analysis) is proposed to classify underwater transient signals using $16^{th}$-order LPC (linear predictive coding) coefficients as the feature vector. The proposed classifier consists of two steps: mechanical signals are separated from biological signals in the first step, and each type of mechanical signal is recognized in the second step. Three biological transient signals and two mechanical signals were used in the experiments. The classification ratios for the feature vectors of biological and mechanical signals were 94.75% and 97.23%, respectively, when all 16 LPC coefficients were used. To determine the effect of underwater noise on classification performance, underwater ambient noise was added to the test signals, and the classification ratio as a function of SNR (signal-to-noise ratio) was compared while changing the dimension of the feature vector using PCA. The classification ratios of the biological and mechanical signals under ocean ambient noise at 10 dB SNR were 0.51% and 100%, respectively. However, the ratios changed to 53.07% and 83.14% when the feature vector was reduced to three dimensions by applying PCA. For correct classification under ocean ambient noise, an SNR over 10 dB is required for a three-dimensional feature vector and over 30 dB for a seven-dimensional feature vector.

ON IMPROVING THE PERFORMANCE OF CODED SPECTRAL PARAMETERS FOR SPEECH RECOGNITION

  • Choi, Seung-Ho;Kim, Hong-Kook;Lee, Hwang-Soo
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.250-253
    • /
    • 1998
  • In digital communication networks, speech recognition systems conventionally reconstruct speech and then extract feature parameters. In this paper, we consider an alternative approach that incorporates speech coding parameters directly into the speech recognizer. Most speech coders employed in these networks use line spectral pairs (LSPs) as spectral parameters. To improve the recognition performance of the LSP-based speech recognizer, we introduce two methods: one devises weighted distance measures for LSPs, and the other transforms LSPs into a new feature set, named a pseudo-cepstrum (PCEP). Experiments on speaker-independent connected-digit recognition showed that the weighted distance measures significantly improved recognition accuracy over the unweighted LSP distance, and that PCEP improved performance further. Compared with conventional methods employing mel-frequency cepstral coefficients, the proposed methods achieved higher recognition accuracy.

Content-based Image Indexing Using PCA

  • Yu, Young-Dal;Jun, Min-Gun;Kim, Daijij;Kang, Dae-Seong
    • Proceedings of the IEEK Conference
    • /
    • 2000.07b
    • /
    • pp.827-830
    • /
    • 2000
  • In this paper, we propose a method that uses the PCA (principal component analysis) algorithm for multimedia information indexing. After extracting the DC coefficients of the DCT from an MPEG video stream, an international standard for moving picture compression coding, we apply PCA to the image formed from the DC coefficients and extract the features of each DC image. Using the extracted features, we generate a codebook and perform multimedia information indexing. The proposed algorithm is very fast at indexing and can generate an optimized codebook because it uses the statistical features of the data.

Classification of Fingerprint Ridge Lines Using Runlength Codes (런길이 부호화를 이용한 지문융선 분류)

  • 이정환;노석호;김윤호
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2004.05b
    • /
    • pp.468-471
    • /
    • 2004
  • In this paper, a method for classifying fingerprint ridge lines using run-length codes is proposed. Classification of fingerprint ridge lines is an essential process for detecting feature points (minutiae) in an automatic fingerprint identification system (AFIS). The fingerprint ridge lines are classified by run-length coding, and the ending and bifurcation regions of the ridge lines are separated. To evaluate the performance of the proposed method, the detected feature regions, including minutiae points, and the classified fingerprint ridge lines are shown.
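
As a rough illustration of the run-length coding step mentioned in this abstract, the sketch below encodes one row of a binarized image as runs. The function name `run_lengths` and the `(value, start, length)` tuple layout are assumptions made for this example; the abstract does not describe the paper's actual code format or how runs are grouped into ridge lines.

```python
def run_lengths(row):
    """Encode one binary image row as (value, start, length) runs.

    Hypothetical helper for illustration; not the paper's data structure.
    """
    runs = []
    start = 0
    for i in range(1, len(row) + 1):
        # Close the current run at the end of the row or when the pixel value changes.
        if i == len(row) or row[i] != row[start]:
            runs.append((row[start], start, i - start))
            start = i
    return runs


if __name__ == "__main__":
    print(run_lengths([0, 1, 1, 1, 0, 0, 1]))
```

Runs of ridge pixels on adjacent rows that overlap horizontally could then be linked to trace a ridge, which is one plausible way such codes support ridge classification.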

Enhancement of MSFC-Based Multi-Scale Features Compression Network with Bottom-UP MSFF in VCM (VCM 의 바텀-업 MSFF 를 이용한 MSFC 기반 멀티-스케일 특징 압축 네트워크 개선)

  • Dong-Ha Kim;Gyu-Woong Han;Jun-Seok Cha;Jae-Gon Kim
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.11a
    • /
    • pp.116-118
    • /
    • 2022
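  • MPEG-VCM (Video Coding for Machines) standardization is proceeding in two tracks: Track 1, which compresses features of the input image/video, and Track 2, which compresses the input image/video directly. This paper presents an improvement to the MSFC-based compression model that efficiently compresses the multi-scale features extracted from the FPN (Feature Pyramid Network) of Detectron2, which is used as the vision task network in Track 1. Whereas the existing compression model reduces resolution to compress a single-scale feature map, the proposed model has a bottom-up MSFF that builds the single-scale feature map by fusing low-resolution feature maps into high-resolution feature maps in a bottom-up structure. The proposed method improves BD-rate by 1 to 2.7% in BPP-mAP performance over the existing model and shows a BD-rate improvement of up to -85.94% relative to the VCM image anchor.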

Neural Feature Compression with Block-based Feature Resizing (블록 기반 특징맵 크기 조정을 이용한 DNN 특징맵 압축)

  • Yoon, Curie;Jeong, Hye Won;Kim, Yeongwoong;Kim, Younhee;Jeong, Se-Yoon;Kim, Hui Yong
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.1203-1206
    • /
    • 2022
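  • With the emergence of technologies such as autonomous driving and IoT, which must process large amounts of video information in real time, and of software that performs machine learning computations on devices such as mobile devices, a need has arisen for video coding technology specialized for machine vision task performance rather than for producing video for human viewing. In this study, feature maps extracted from video are compressed with a neural-network-based video coding model, jointly optimizing the compression rate and machine vision task performance. In addition, we propose hardware-friendly block-based processing and an adaptive resizing scheme to minimize the resulting performance degradation.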

SAD-Based Reordering of Feature Map Sequence for VCM (VCM 을 위한 SAD 기반 특징맵 시퀀스 재배열)

  • Kim, Dong-Ha;Yoon, Yong-Uk;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • fall
    • /
    • pp.30-32
    • /
    • 2021
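  • As the amount of video consumed by machines for machine vision tasks has recently increased, MPEG is standardizing VCM (Video Coding for Machines) as a video coding standard for machines. VCM encodes and decodes the video or features input to a machine analysis network and evaluates task accuracy against compression. This paper proposes a method that reorders the feature maps of each channel based on SAD (Sum of Absolute Differences) when feature data extracted from a machine analysis network are encoded and decoded with a conventional video codec. The proposed method does not reach the VCM anchor performance, but it performs better than encoding the features with a video codec without channel reordering.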
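
One way to picture SAD-based channel reordering is the sketch below, which chains channels greedily so that each feature map is followed by the remaining map with the lowest SAD. The greedy rule, the starting channel, and the names `sad` and `greedy_sad_order` are assumptions for illustration only; the abstract does not state the paper's actual ordering rule.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D maps."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def greedy_sad_order(channels):
    """Return channel indices so each map follows its lowest-SAD predecessor.

    Assumption: a greedy chain starting from channel 0; the paper may differ.
    """
    if not channels:
        return []
    order = [0]
    remaining = set(range(1, len(channels)))
    while remaining:
        last = channels[order[-1]]
        # Ties are broken by the lower channel index to keep the result deterministic.
        nxt = min(remaining, key=lambda i: (sad(last, channels[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    maps = [[[0, 0], [0, 0]], [[9, 9], [9, 9]], [[1, 1], [1, 1]], [[8, 8], [8, 8]]]
    print(greedy_sad_order(maps))
```

Placing similar maps next to each other is meant to help the inter prediction of a conventional video codec when the channels are coded as a frame sequence. A decoder would presumably need the permutation as side information to restore the original channel order.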

Analysis of compression and machine task performance according to feature map resizing and interpolation (피처 맵 리사이징과 보간법에 따른 압축 및 머신태스크 성능 분석)

  • Rhee, Seong-bae;Lee, Min-Seok;Kim, Kyu-Heon
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.832-835
    • /
    • 2022
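  • Interest has recently been growing in Collaborative Intelligence (CI), which performs machine tasks using the feature maps of deep learning networks. By transmitting feature maps, the CI architecture enables deep-learning-based machine tasks on low-spec devices and is expected to be used in a variety of industries. However, because the feature maps transmitted in a CI architecture are very large, efficient feature map compression is needed for transmission. This paper therefore examines the Feature Coding technique proposed in MPEG-VCM, which compresses feature maps using resizing and interpolation, and aims to identify experimentally the combination of resizing and interpolation methods that gives the best machine task performance relative to compression performance.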

On a Reduction of Computation Time of FFT Cepstrum (FFT 켑스트럼의 처리시간 단축에 관한 연구)

  • Jo, Wang-Rae;Kim, Jong-Kuk;Bae, Myung-Jin
    • Speech Sciences
    • /
    • v.10 no.2
    • /
    • pp.57-64
    • /
    • 2003
  • Cepstrum coefficients are the most popular features for speech recognition and speaker recognition. They are also used for speech synthesis and speech coding, but have the major drawback of a long processing time. In this paper, we propose a new method that reduces the processing time of FFT cepstrum analysis. We use normally ordered inputs for the FFT function and bit-reversed inputs for the IFFT function. We can therefore omit the bit-reversing process and reduce the processing time of FFT cepstrum analysis.
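
The idea of skipping bit reversal can be sketched as follows, assuming a radix-2 decimation-in-frequency FFT (natural-order input, bit-reversed output) paired with a decimation-in-time inverse FFT (bit-reversed input, natural-order output). The log magnitude is taken element by element, so it can be computed directly on the bit-reversed spectrum. The function names are illustrative, and the paper's actual implementation may differ.

```python
import cmath
import math


def fft_dif(x):
    """In-place decimation-in-frequency FFT: natural-order in, bit-reversed out."""
    n = len(x)
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-1j * math.pi * k / span)
                a, b = x[start + k], x[start + k + span]
                x[start + k] = a + b
                x[start + k + span] = (a - b) * w
        span //= 2
    return x


def ifft_dit(x):
    """Decimation-in-time inverse FFT: bit-reversed in, natural-order out."""
    n = len(x)
    span = 1
    while span < n:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(1j * math.pi * k / span)
                a, b = x[start + k], x[start + k + span] * w
                x[start + k] = a + b
                x[start + k + span] = a - b
        span *= 2
    return [v / n for v in x]


def real_cepstrum(samples):
    """Real cepstrum IFFT(log|FFT(x)|) with no bit-reversal pass in between."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    spectrum = fft_dif([complex(s) for s in samples])
    # Element-wise log magnitude keeps the bit-reversed ordering intact.
    # Assumes no spectral bin is exactly zero (log would fail otherwise).
    log_mag = [complex(math.log(abs(v))) for v in spectrum]
    return [v.real for v in ifft_dit(log_mag)]
```

Because neither transform reorders its data, the two permutation passes that a conventional FFT/IFFT pair would perform are both avoided.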

A Study on Korean Speech Analysis using Walsh Transform (Walsh변환을 이용한 한국어 숫자음 음성분석에 관한 연구)

  • 김계현;김준현
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.37 no.4
    • /
    • pp.251-256
    • /
    • 1988
  • This work describes a speech analysis of Korean numbers ('1'-'10') spoken by several speakers, using the Fast Walsh Transform (FWHT) method. FWHT involves only addition and subtraction operations, so it is faster and needs less memory than the FFT (Fast Fourier Transform) or LPC (Linear Predictive Coding) analysis methods. We examined the ability of the FWHT method to find speaker-independent features (features that give the same cue for a word regardless of the speaker). In this experiment, 70% of the utterances of the same word (Korean number '2') spoken by several speakers showed similar patterns.
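
The claim that the transform needs only additions and subtractions can be seen in a minimal sketch of the standard in-place fast Walsh-Hadamard butterfly. The function name `fwht` is illustrative, and the output here is unnormalized and in Hadamard (natural) order; the coefficient ordering and scaling used in the paper are not given in the abstract.

```python
def fwht(samples):
    """Unnormalized fast Walsh-Hadamard transform in Hadamard order."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(samples)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                # Each butterfly uses one addition and one subtraction, no multiplies.
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


if __name__ == "__main__":
    print(fwht([1, 0, 1, 0, 0, 1, 1, 0]))  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying `fwht` twice returns the input scaled by its length, so the same routine serves as the inverse transform after dividing by `n`.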
