• Title/Summary/Keyword: feature parameter extraction (특징 파라미터 추출)

Search Result 225, Processing Time 0.027 seconds

Gaussian Mixture Model using Minimum Classification Error for Environmental Sounds Recognition Performance Improvement (Minimum Classification Error 방법 도입을 통한 Gaussian Mixture Model 환경음 인식성능 향상)

  • Han, Da-Jeong;Park, Aa-Ron;Park, Jun-Qyu;Baek, Sung-June
    • The Journal of the Korea Contents Association / v.11 no.12 / pp.497-503 / 2011
  • In this paper, we propose minimum classification error (MCE) training as a GMM training method to improve the performance of environmental sound recognition. We model the environmental sound data with a newly defined misclassification function that uses the log likelihood of the corresponding class and the log likelihoods of the remaining classes for discriminative training. The model parameters are estimated from the loss function using GPD (generalized probabilistic descent). For recognition performance comparison, we extracted 12th-order features from 9 kinds of environmental sounds using preprocessing and MFCC (mel-frequency cepstral coefficients), and carried out GMM classification experiments. According to the experimental results, the MCE training method showed the best performance, an average of 87.06% with 19 mixtures. This result confirms that MCE training can be used effectively as a GMM training method in environmental sound recognition.
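
The abstract above does not give its misclassification function, so the following is only a minimal sketch of the commonly used MCE form: the true-class log likelihood is compared against a soft maximum over the rival classes, and the result is passed through a sigmoid loss. The names `misclassification` and `mce_loss` and the constants `eta` and `gamma` are assumptions for illustration, not the authors' definitions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def misclassification(log_likes, true_idx, eta=1.0):
    """Misclassification measure d: negative when the true class wins.

    log_likes: per-class log likelihoods, e.g. from one GMM per class.
    eta: hypothetical smoothing constant; a large eta approaches the
         single best rival class.
    """
    rivals = [g for j, g in enumerate(log_likes) if j != true_idx]
    # Soft maximum over the competing classes.
    rival_score = (logsumexp([eta * g for g in rivals]) - math.log(len(rivals))) / eta
    return -log_likes[true_idx] + rival_score


def mce_loss(d, gamma=1.0):
    """Sigmoid loss: a smooth, differentiable stand-in for the 0/1 error."""
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

In a GPD-style update, the GMM parameters would be adjusted by gradient steps that reduce this loss over the training samples; that step is omitted here.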

BSR (Buzz, Squeak, Rattle) noise classification based on convolutional neural network with short-time Fourier transform noise-map (Short-time Fourier transform 소음맵을 이용한 컨볼루션 기반 BSR (Buzz, Squeak, Rattle) 소음 분류)

  • Bu, Seok-Jun;Moon, Se-Min;Cho, Sung-Bae
    • The Journal of the Acoustical Society of Korea / v.37 no.4 / pp.256-261 / 2018
  • There are three types of noise generated inside a vehicle: BSR (Buzz, Squeak, Rattle). In this paper, we propose a classifier that automatically classifies automotive BSR noise using features extracted by deep convolutional neural networks. In preprocessing, the features of the three noise types are represented as a noise map using the STFT (Short-time Fourier Transform) algorithm. Because the position of the actual noise within the generated noise map is unknown, the noise map is divided using the sliding window method. The internal parameters of the deep convolutional neural network are visualized using the t-SNE (t-Stochastic Neighbor Embedding) algorithm, and the misclassified data are analyzed qualitatively. To analyze the classified data, the similarity between noise types was quantified by the SSIM (Structural Similarity Index) value, and the retractor tremble sound was found to be most similar to the normal travel sound. Compared with other machine learning classifiers, the proposed classifier recorded the highest classification accuracy (99.15%).
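
As a rough illustration of the preprocessing described above, the sketch below builds an STFT magnitude map and cuts it into overlapping patches along the time axis. It uses plain Python so that it is self-contained; the frame length, hop, window width, and stride are hypothetical values, and the Hann window is an assumption, since the abstract does not state them.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return a noise map indexed [freq_bin][frame] of STFT magnitudes."""
    # Hann window (assumed; the abstract does not name a window).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = [signal[s:s + frame_len] for s in starts]
    n_bins = frame_len // 2 + 1
    noise_map = [[0.0] * len(frames) for _ in range(n_bins)]
    for t, frame in enumerate(frames):
        for k in range(n_bins):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(w * x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, (w, x) in enumerate(zip(window, frame)))
            noise_map[k][t] = abs(acc)
    return noise_map


def sliding_windows(noise_map, width, stride):
    """Cut the map into overlapping patches of `width` frames along time."""
    n_frames = len(noise_map[0])
    return [[row[start:start + width] for row in noise_map]
            for start in range(0, n_frames - width + 1, stride)]
```

Each patch can then be classified separately, so the classifier does not need to know where in the map the noise event occurs.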

Apply Locally Weight Parameter Elimination for CNN Model Compression (지역적 가중치 파라미터 제거를 적용한 CNN 모델 압축)

  • Lim, Su-chang;Kim, Do-yeon
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.9 / pp.1165-1171 / 2018
  • A CNN requires a large amount of computation and memory to extract the features of an object. In addition, it is trained on the network that the user has configured, and because the network structure is fixed, it cannot be modified during training and is difficult to use on a mobile device with low computing power. To solve these problems, we apply a pruning method to the pre-trained weight file to reduce computation and memory requirements. The method consists of three steps. First, all the weights of the pre-trained network file are retrieved for each layer. Second, the absolute values of the weights in each layer are taken and averaged; this average is set as the threshold, and weights below it are removed. Finally, the pruned network file is retrained. In experiments with LeNet-5 and AlexNet, we achieved 31x compression on LeNet-5 and 12x on AlexNet.
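
The per-layer thresholding step described above can be sketched as follows. This is a minimal illustration on flat Python lists, assuming that "removing" a weight means setting it to zero; the function names are hypothetical, and the retraining step is not shown.

```python
def prune_layer(weights):
    """Zero out weights whose magnitude is below the layer's mean magnitude.

    Returns the pruned weights and the threshold that was used.
    """
    threshold = sum(abs(w) for w in weights) / len(weights)
    pruned = [w if abs(w) >= threshold else 0.0 for w in weights]
    return pruned, threshold


def prune_network(layers):
    """Apply per-layer pruning to a dict of layer name -> flat weight list."""
    return {name: prune_layer(weights)[0] for name, weights in layers.items()}


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)
```

Because the threshold is computed per layer, layers with small weights overall are not wiped out by a single global cutoff.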

Development of a low-cost fruite classification system based on Digital images (디지털영상 기반 저비용 선과시스템 개발)

  • Koo, Min-Jeong;Hwang, Dong-Kuk;Lee, Woo-Ram;Kim, Jae-Hong;Seo, Jeong-Man
    • Journal of the Korea Society of Computer and Information / v.13 no.6 / pp.155-162 / 2008
  • Fruit quality is measured by many parameters. Conventional graders measure fruit size with a rotating drum, so the fruit can be damaged during size classification. Optical graders used to estimate sugar content are also costly. The proposed system uses three cameras to capture the characteristics of the fruit. Because information such as volume and degree of maturity is used to classify the fruit, the sugar content itself cannot be estimated, but information such as color and damage can be. Because high-resolution digital images are not needed, the grading system can be developed at low cost. To evaluate the performance of the proposed system, we compared it with visual inspection and then classified the samples. The results show an accuracy of 96.7%.

Geometric Region Reconstruction of Steel-tube Computed Radiography Using Nonlinear Structural Analysis (비선형 구도해석에 의한 강관 CR영상의 기하학적 영역복원)

  • Hwang, Jae-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.6 / pp.146-152 / 2009
  • The steel tube is exposed to radiation from an X-ray source. The transmitted radiation is detected by a detector, usually film or, more recently, an imaging plate (IP) for computed radiography (CR). The detected radiation overlaps the regions of both sides of the object. The radiographic images reflect the projections of the rays, which pass twice through both the external and internal tube material. Nonlinear distortion due to the radiation transmission and the geometric arrangement also appears in the images. In this paper, an analytical approach is presented for image reconstruction from steel-tube CR images. Parameters related to the radiation and the measuring structure, such as intensities, absorption in the material, and geometric specifications linked with the collimating components, are calculated and identified in order to construct the reconstructed images for the twofold regions of circular steel tubes. A correction procedure is designed to recover the region that is most similar to the true tube. The application of this approach to CR images is shown, and the reconstructed results are discussed.

Speaker Recognition Performance Improvement by Voiced/Unvoiced Classification and Heterogeneous Feature Combination (유/무성음 구분 및 이종적 특징 파라미터 결합을 이용한 화자인식 성능 개선)

  • Kang, Jihoon;Jeong, Sangbae
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.6 / pp.1294-1301 / 2014
  • In this paper, separate probabilistic distribution models for voiced and unvoiced speech are estimated and used to improve speaker recognition performance. In addition to the conventional mel-frequency cepstral coefficients, skewness, kurtosis, and harmonic-to-noise ratio are extracted and used for voiced speech intervals. The two kinds of scores, for voiced and unvoiced speech, are linearly fused with the optimal weight found by exhaustive search. The performance of the proposed speaker recognizer is compared with that of a conventional recognizer that uses mel-frequency cepstral coefficients and a unified probabilistic distribution function based on the Gaussian mixture model. Experimental results show that the fewer the Gaussian mixture components, the greater the performance improvement from the proposed algorithm.
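
The linear score fusion with an exhaustively searched weight can be sketched as below. This is an illustration under stated assumptions: scores are per-speaker log likelihoods, the weight is searched on a uniform grid over [0, 1], and identification accuracy on held-out trials is the selection criterion. The abstract specifies none of these details, and the function names are hypothetical.

```python
def fuse(voiced, unvoiced, w):
    """Linear fusion of a voiced and an unvoiced score with weight w."""
    return w * voiced + (1.0 - w) * unvoiced


def accuracy(trials, w):
    """Identification accuracy for weight w.

    trials: list of (voiced_scores, unvoiced_scores, true_speaker), where
    each score list holds one score per enrolled speaker.
    """
    correct = 0
    for voiced, unvoiced, truth in trials:
        fused = [fuse(v, u, w) for v, u in zip(voiced, unvoiced)]
        predicted = max(range(len(fused)), key=fused.__getitem__)
        correct += predicted == truth
    return correct / len(trials)


def best_weight(trials, steps=100):
    """Exhaustive grid search for the fusion weight with the highest accuracy."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: accuracy(trials, w))
```

With ties, `max` returns the first grid point that reaches the best accuracy, so the result depends on the grid resolution.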

Real-time Simulation of Seas and Swells for Ship Maneuvering Simulators (선박운항 시뮬레이터를 위한 풍파와 너울의 실시간 시뮬레이션)

  • Park, Sekil;Oh, Jaeyong;Park, Jinah
    • Journal of KIISE / v.42 no.7 / pp.846-851 / 2015
  • Seas and swells are basic wave types in ocean surface simulation and are very important elements in simulating the ocean background. In this paper, we propose a real-time simulation method for reproducing realistic seas and swells in real-time simulators such as ship maneuvering simulators. Seas and swells have different visual properties. Swells have relatively longer wavelengths and rounder crests than seas, so they are visualized globally with large meshes and procedural methods. The parameters that describe swells are extracted from ocean wave spectra. Conversely, seas have shorter wavelengths, and their characteristics are clearly apparent only near the observation point. Here, we present a visualization of seas based on a statistical wave model using ocean wave spectra, which provides realistic results in a relatively small area.

Resource-Efficient Object Detector for Low-Power Devices (저전력 장치를 위한 자원 효율적 객체 검출기)

  • Akshay Kumar Sharma;Kyung Ki Kim
    • Transactions on Semiconductor Engineering / v.2 no.1 / pp.17-20 / 2024
  • This paper presents a novel lightweight object detection model tailored for low-powered edge devices, addressing the limitations of traditional resource-intensive computer vision models. Our proposed detector, inspired by the Single Shot Detector (SSD), employs a compact yet robust network design. Crucially, it integrates an 'enhancer block' that significantly boosts its efficiency in detecting smaller objects. The model comprises two primary components: the Light_Block for efficient feature extraction using Depth-wise and Pointwise Convolution layers, and the Enhancer_Block for enhanced detection of tiny objects. Trained from scratch on the Udacity Annotated Dataset with image dimensions of 300x480, our model eschews the need for pre-trained classification weights. Weighing only 5.5MB with approximately 0.43M parameters, our detector achieved a mean average precision (mAP) of 27.7% and processed at 140 FPS, outperforming conventional models in both precision and efficiency. This research underscores the potential of lightweight designs in advancing object detection for edge devices without compromising accuracy.
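
To show why depth-wise and pointwise convolution layers keep the parameter count small, the sketch below compares the weight count of a standard convolution with that of a depthwise separable one. The channel sizes in the usage note are hypothetical and are not taken from the paper; biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For example, with 64 input channels, 128 output channels, and a 3x3 kernel, the standard layer has 73,728 weights and the separable pair has 8,768, roughly 8.4 times fewer.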

Lip and Voice Synchronization Using Visual Attention (시각적 어텐션을 활용한 입술과 목소리의 동기화 연구)

  • Dongryun Yoon;Hyeonjoong Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.166-173 / 2024
  • This study explores lip-sync detection, focusing on the synchronization between lip movements and voices in videos. Typically, lip-sync detection techniques involve cropping the facial area of a given video, utilizing the lower half of the cropped box as input for the visual encoder to extract visual features. To enhance the emphasis on the articulatory region of lips for more accurate lip-sync detection, we propose utilizing a pre-trained visual attention-based encoder. The Visual Transformer Pooling (VTP) module is employed as the visual encoder, originally designed for the lip-reading task, predicting the script based solely on visual information without audio. Our experimental results demonstrate that, despite having fewer learning parameters, our proposed method outperforms the latest model, VocaList, on the LRS2 dataset, achieving a lip-sync detection accuracy of 94.5% based on five context frames. Moreover, our approach exhibits an approximately 8% superiority over VocaList in lip-sync detection accuracy, even on an untrained dataset, Acappella.

A Study on the Pattern Recognition based Distance Protective Relaying Scheme in Power System (전력계통의 패턴인식형 거리계전기법에 관한 연구)

  • 이복구;윤석무;박철원;신명철
    • Journal of the Korean Institute of Intelligent Systems / v.8 no.2 / pp.9-20 / 1998
  • In this paper, a new distance relaying scheme is proposed. Artificial neural networks are applied to a pattern-recognition-based distance relaying system. The proposed scheme has two pattern recognition blocks, which estimate the fundamental frequency and classify the fault types. In the first block, a neural-network filtering method called the neural networks mapping filter (NMF) is presented to extract features efficiently. In the second block, an estimator called the neural networks fault pattern estimator (NFPE) is presented to classify the fault types from the effective features extracted by the NMF. Each block is a multilayer perceptron trained with the back-propagation algorithm, and the multilayer neural networks provide fast and accurate pattern recognition. Tests on fault transient wave signals from EMTP (electromagnetic transients program) under various power system fault conditions showed good performance.
