• Title/Abstract/Keyword: Feature Model

Search results: 3,339 items (processing time 0.029 s)

목소리 특성과 음성 특징 파라미터의 상관관계와 SVM을 이용한 특성 분류 모델링 (Correlation analysis of voice characteristics and speech feature parameters, and classification modeling using SVM algorithm)

  • 박태성;권철홍
    • 말소리와 음성과학 / Vol. 9, No. 4 / pp.91-97 / 2017
  • This study categorizes several voice characteristics by subjective listening assessment and investigates the correlation between voice characteristics and speech feature parameters. A model was developed to classify voice characteristics into the defined categories using an SVM algorithm. To this end, we extracted various speech feature parameters from a speech database of men in their 20s and, through ANOVA, derived the parameters that are statistically significantly correlated with the voice characteristics. These parameters were then applied to the proposed SVM model. The experimental results show that several speech feature parameters are significantly correlated with the voice characteristics and that the proposed model achieves an average classification accuracy of 88.5%.
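The pipeline described in this abstract (ANOVA-based selection of speech feature parameters followed by SVM classification) can be sketched roughly as follows. This is a minimal illustration using scikit-learn; the feature matrix, category labels, and parameter settings are placeholders, not the data or configuration used in the paper.

```python
# Minimal sketch: ANOVA-filtered feature selection followed by SVM classification.
# The data below is synthetic; in the paper, rows would be utterances from the
# speech database and columns would be extracted speech feature parameters.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 20))            # 120 utterances x 20 feature parameters (placeholder)
y = rng.integers(0, 4, size=120)          # 4 hypothetical voice-characteristic categories

clf = Pipeline([
    ("scale", StandardScaler()),
    ("anova", SelectKBest(f_classif, k=8)),   # keep the k most significant parameters
    ("svm", SVC(kernel="rbf", C=1.0)),
])

scores = cross_val_score(clf, X, y, cv=5)
print("mean CV accuracy:", scores.mean())
```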

컨볼루션 신경망의 특징맵을 사용한 객체 추적 (Object Tracking using Feature Map from Convolutional Neural Network)

  • 임수창;김도연
    • 한국멀티미디어학회논문지 / Vol. 20, No. 2 / pp.126-133 / 2017
  • The conventional hand-crafted features used to track objects have limitations in object representation. Convolutional neural networks (CNNs), which perform well in various areas of computer vision, are emerging as a way to overcome the limitations of hand-crafted feature extraction. A CNN extracts image features through multiple layers and learns the kernels used for feature extraction by itself. In this paper, we use the feature maps extracted from the convolution layers of a CNN to create an outline model of the object and use it for tracking. We also propose a method to adaptively update the outline model to cope with the various environmental changes that affect tracking performance. The proposed algorithm was evaluated on the 11 environmental-change attributes of the CVPR2013 tracking benchmark and showed excellent results on six of them.
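As a rough sketch of the idea of tracking with convolutional feature maps, the snippet below extracts a feature map from a single (untrained) convolution layer and localizes a target by cross-correlating the template feature with the search-region feature. It uses PyTorch with random weights purely for illustration; the paper's trained network, outline model, and adaptive update scheme are not reproduced here.

```python
# Sketch: localize a target by correlating CNN feature maps (illustrative only).
import torch
import torch.nn.functional as F

conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)   # stand-in for a pretrained conv layer

search = torch.randn(1, 3, 64, 64)        # search region (placeholder image)
template = search[:, :, 24:40, 24:40]     # target patch cut from the search region

with torch.no_grad():
    f_search = conv(search)               # feature map of the search region: (1, 16, 64, 64)
    f_templ = conv(template)              # feature map of the template: (1, 16, 16, 16)

# Cross-correlate the template feature over the search feature map.
score = F.conv2d(f_search, f_templ)       # (1, 1, 49, 49) response map
_, _, h, w = score.shape
idx = torch.argmax(score.reshape(-1))
print("peak response at", (int(idx // w), int(idx % w)))
```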

피처지향 분석모델을 적용한 VOD 서비스 개발을 위한 기반연구 (An Underlying Research for Developing VOD Service using Feature-Oriented Analysis Model)

  • 고광일
    • 한국산학기술학회논문지 / Vol. 18, No. 7 / pp.26-32 / 2017
  • The VOD service, together with the electronic program guide, is cited as one of the most successful examples of data broadcasting services. In particular, because the VOD service provides revenue in addition to broadcasters' existing revenue models (subscriber-based fees and advertising), each broadcaster develops its own VOD service and frequently improves it to increase sales. Since each improvement leads to the development of a new VOD service, development companies are looking for ways to respond effectively to these frequent development demands. Against this background, this study carried out underlying research for applying a feature-oriented analysis model, whose effectiveness has been demonstrated through numerous case studies, to VOD service development. The feature-oriented analysis model used in this study is FODA (Feature-Oriented Domain Analysis), developed by the SEI at Carnegie Mellon University; FODA builds a feature model of the software in a particular domain and provides tools for deciding the software configuration together with the customer on the basis of that feature model. This study extends the scope of FODA by developing a feature model of the VOD service and by developing VOD service functions and test cases aligned with that feature model. It also proposes a VOD development process that can be used with the feature model, functional specification, and test cases produced by the feature-oriented analysis model.
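To make the feature-model idea concrete, here is a small, hypothetical sketch of a feature model for a VOD service and a check that a chosen configuration respects mandatory, optional, and alternative (exactly-one) features. The feature names and rules are invented for illustration and are not taken from the paper's actual FODA model.

```python
# Hypothetical feature model for a VOD service and a simple configuration check.
# Groups: "mandatory" features must be selected, "optional" may be selected,
# and each "alternative" group requires exactly one of its members.
MODEL = {
    "mandatory": {"catalog", "playback", "billing"},
    "optional": {"preview", "series_binge", "parental_lock"},
    "alternative": [{"pay_per_view", "subscription"}],   # choose exactly one
}

def valid_configuration(selected: set[str]) -> bool:
    known = MODEL["mandatory"] | MODEL["optional"] | set().union(*MODEL["alternative"])
    if not selected <= known:
        return False                                  # unknown feature selected
    if not MODEL["mandatory"] <= selected:
        return False                                  # a mandatory feature is missing
    return all(len(selected & group) == 1 for group in MODEL["alternative"])

print(valid_configuration({"catalog", "playback", "billing", "subscription"}))        # True
print(valid_configuration({"catalog", "playback", "subscription", "pay_per_view"}))   # False
```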

Emotion recognition from speech using Gammatone auditory filterbank

  • 레바부이;이영구;이승룡
    • 한국정보과학회:학술대회논문집 / 한국정보과학회 2011년도 한국컴퓨터종합학술대회논문집 Vol.38 No.1(A) / pp.255-258 / 2011
  • An application of a Gammatone auditory filterbank to emotion recognition from speech is described in this paper. The Gammatone filterbank is a bank of Gammatone filters used as a preprocessing stage before feature extraction, in order to obtain the most relevant features for emotion recognition from speech. In the feature extraction step, the energy of each filter's output signal is computed and combined with those of all the other filters to produce a feature vector for the learning step. A feature vector is estimated over a short time period of the input speech signal to take advantage of time-domain dependence. Finally, in the learning step, a Hidden Markov Model (HMM) is used to create a model for each emotion class and to recognize the emotion of a particular input speech sample. In the experiments, feature extraction based on the Gammatone filterbank (GTF) shows better results than features based on Mel-Frequency Cepstral Coefficients (MFCC), a well-known feature extraction method for speech recognition as well as for emotion recognition from speech.
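A minimal sketch of the feature-extraction stage described above: build a bank of Gammatone filters at a set of center frequencies, filter a frame of speech, and take per-band log energies as the feature vector. The filter order, bandwidth constant, band spacing, and frame length are common textbook choices, not necessarily those used by the authors, and the HMM training stage is omitted.

```python
# Sketch: Gammatone filterbank band energies for one speech frame.
import numpy as np

def gammatone_ir(fc, fs, n=4, duration=0.025):
    """Impulse response of an n-th order Gammatone filter centered at fc (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)          # ERB bandwidth at fc
    b = 1.019 * erb
    return t ** (n - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

fs = 16000
frame = np.random.randn(400)                          # 25 ms frame (placeholder for real speech)
centers = np.geomspace(100, 6000, num=24)             # 24 bands, roughly log-spaced

features = []
for fc in centers:
    y = np.convolve(frame, gammatone_ir(fc, fs), mode="same")
    features.append(np.log(np.sum(y ** 2) + 1e-10))   # log band energy

print(np.array(features).shape)                       # (24,) feature vector for this frame
```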

Gabor 웨이브렛과 FCM 군집화 알고리즘에 기반한 동적 연결모형에 의한 얼굴표정에서 특징점 추출 (Feature-Point Extraction by Dynamic Linking Model based on Gabor Wavelets and Fuzzy C-Means Clustering Algorithm)

  • 신영숙
    • 인지과학 / Vol. 14, No. 1 / pp.11-16 / 2003
  • This paper uses the Gabor wavelet transform to extract the boundaries of the main facial components from expression images, including the neutral expression, and then applies the FCM clustering algorithm to extract low-dimensional, representative feature points from the neutral-expression image. The feature points of the neutral-expression image are used as a template for extracting the feature points of expression images; feature-point extraction from an expression image is carried out in two stages, coarse matching and fine matching, using the feature points of the neutral-expression image and a dynamic linking model. This paper shows that feature points can be extracted automatically from expression images with a dynamic linking model based on Gabor wavelets and the FCM clustering algorithm. The results of this study were applied to the automatic extraction of facial-expression feature points in dimension-model-based facial expression recognition using automatic feature extraction [1].
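The sketch below illustrates the two building blocks named in the abstract: a Gabor kernel applied to an image to emphasize component boundaries, and a small fuzzy C-means (FCM) routine that clusters the strongest responses into representative points. The kernel parameters and the synthetic image are placeholders; the dynamic linking and coarse-to-fine matching stages are not shown.

```python
# Sketch: Gabor filtering followed by fuzzy C-means clustering of edge points.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(sigma=3.0, theta=0.0, lam=8.0, size=15):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def fcm(points, c=4, m=2.0, iters=50):
    """Minimal fuzzy C-means; returns cluster centers."""
    rng = np.random.default_rng(0)
    u = rng.random((len(points), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        u = 1.0 / (d ** (2 / (m - 1)))
        u /= u.sum(axis=1, keepdims=True)
    return centers

img = np.random.rand(64, 64)                        # placeholder for a face image
response = np.abs(convolve2d(img, gabor_kernel(), mode="same"))
ys, xs = np.unravel_index(np.argsort(response, axis=None)[-200:], response.shape)
pts = np.stack([xs, ys], axis=1).astype(float)      # strongest 200 responses
print(fcm(pts, c=4))                                # 4 representative feature points
```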


Rank-weighted reconstruction feature for a robust deep neural network-based acoustic model

  • Chung, Hoon;Park, Jeon Gue;Jung, Ho-Young
    • ETRI Journal / Vol. 41, No. 2 / pp.235-241 / 2019
  • In this paper, we propose a rank-weighted reconstruction feature to improve the robustness of a feed-forward deep neural network (FFDNN)-based acoustic model. In an FFDNN-based acoustic model, an input feature is constructed by vectorizing a submatrix created by slicing the feature vectors of the frames within a context window. In this type of feature construction, the context window size is important because it determines the amount of trivial or discriminative information, such as redundancy or temporal context, in the input features. However, it is questionable whether a single parameter can sufficiently control this quantity of information. Therefore, we investigated input feature construction from the perspectives of rank and nullity, and we propose a rank-weighted reconstruction feature that retains speech information components while reducing trivial components. The proposed method was evaluated on the TIMIT phone recognition and Wall Street Journal (WSJ) tasks. It reduced the phone error rate on TIMIT from 18.4% to 18.0% and the word error rate on WSJ from 4.70% to 4.43%.
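A hedged sketch of the core operation implied by this abstract: take the context-window submatrix of frame feature vectors, decompose it with an SVD, weight the singular components so that dominant components are kept and trivial components are attenuated, and reconstruct the input. The weighting scheme below (normalized singular values) is an assumption for illustration, not necessarily the weighting derived in the paper.

```python
# Sketch: reconstruct a context-window feature submatrix with rank-dependent weights.
import numpy as np

rng = np.random.default_rng(0)
window = rng.normal(size=(11, 40))       # 11 context frames x 40-dim frame features (placeholder)

U, s, Vt = np.linalg.svd(window, full_matrices=False)
weights = s / s.max()                    # assumed weighting: emphasize dominant components
recon = (U * (weights * s)) @ Vt         # rank-weighted reconstruction

input_feature = recon.reshape(-1)        # vectorized input for the FFDNN acoustic model
print(input_feature.shape)               # (440,)
```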

3D 모델 해싱의 미분 엔트로피 기반 보안성 분석 (Security Analysis based on Differential Entropy in 3D Model Hashing)

  • 이석환;권기룡
    • 한국통신학회논문지 / Vol. 35, No. 12C / pp.995-1003 / 2010
  • A content-based hash function for the authentication and copy protection of images, video, and 3D models must satisfy both robustness and security. A differential-entropy method has been proposed for analyzing hash security, but it has been applied only to image hash extraction. This paper therefore proposes a model for analyzing, on the basis of differential entropy, the security of feature extraction in 3D model hashing. In the proposed security analysis model, the two most common types of feature extraction used in 3D model hashing are presented, and their security is analyzed in terms of differential entropy. From the results, the security of each hash extraction method is analyzed, and the relationship between security and robustness is discussed.
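As a simple illustration of the analysis tool mentioned above, the snippet estimates the differential entropy of an extracted hash feature from a histogram of its values; in this framework, higher entropy of the feature distribution corresponds to a harder-to-guess hash. The histogram estimator and the Gaussian reference are generic textbook choices, not the exact derivation in the paper.

```python
# Sketch: histogram-based differential entropy of a hash feature variable.
import numpy as np

def differential_entropy(samples, bins=64):
    """Estimate h(X) = -∫ p(x) log p(x) dx from samples via a histogram."""
    p, edges = np.histogram(samples, bins=bins, density=True)
    widths = np.diff(edges)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]) * widths[nz])   # in nats

rng = np.random.default_rng(0)
feature = rng.normal(0.0, 1.0, size=50_000)               # placeholder hash feature values

est = differential_entropy(feature)
print("estimated:", est)
print("Gaussian closed form:", 0.5 * np.log(2 * np.pi * np.e))   # ≈ 1.4189 nats
```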

역공학에서 Z-map을 이용한 특징형상 탐색 및 영역화 (Feature Recognition and Segmentation via Z-map in Reverse Engineering)

  • 김재현;신양호;박정환;고태조;유우식
    • 한국정밀공학회지 / Vol. 20, No. 2 / pp.176-183 / 2003
  • The paper presents a feature recognition and segmentation method for surface approximation in reverse engineering. Efficient digitizing plays an important role in constructing a computational surface model from a physical part surface when no CAD model is available. Depending on its measuring source (e.g., touch probe or structured light), each digitizing method has its own strengths and weaknesses in terms of speed and accuracy. The final goal of this research is an integration of two different digitizing methods: measuring with structured light and measuring with a touch probe. Gathering a bulk of digitized points (i.e., a cloud of points) with a laser scanning system, we construct a coarse surface model directly from the cloud of points, followed by a segmentation process in which z-map filleting and differencing are used to trace feature boundary curves. The feature boundary curves and the approximate surface model can then serve as inputs to further digitizing by a scanning touch probe. Finally, more accurate measuring points within the boundary curves can be obtained to construct a finer surface model.
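The snippet below sketches the z-map idea from the abstract: scattered digitized points are dropped into a regular xy grid keeping the maximum z per cell, and large differences between neighboring cells are flagged as candidate feature boundaries. The grid resolution, synthetic part surface, and threshold are placeholders; the actual filleting and surface-fitting steps are not reproduced.

```python
# Sketch: build a z-map from a cloud of points and flag steep cells as feature boundaries.
import numpy as np

rng = np.random.default_rng(0)
pts = rng.uniform(0, 10, size=(20_000, 2))                 # (x, y) of digitized points
z = np.where(pts[:, 0] > 5.0, 2.0, 0.0) + rng.normal(0, 0.02, 20_000)   # a step "feature"

res = 0.1                                                  # grid cell size
nx = ny = int(10 / res)
zmap = np.full((nx, ny), -np.inf)
ix = np.clip((pts[:, 0] / res).astype(int), 0, nx - 1)
iy = np.clip((pts[:, 1] / res).astype(int), 0, ny - 1)
np.maximum.at(zmap, (ix, iy), z)                           # keep the highest z per cell
zmap[~np.isfinite(zmap)] = zmap[np.isfinite(zmap)].min()   # fill cells with no points

# Differencing: mark cells where the z jump to a neighbor exceeds a threshold.
dz = np.maximum(np.abs(np.diff(zmap, axis=0, prepend=zmap[:1])),
                np.abs(np.diff(zmap, axis=1, prepend=zmap[:, :1])))
boundary = dz > 0.5
print("boundary cells:", int(boundary.sum()))
```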

Speech emotion recognition based on genetic algorithm-decision tree fusion of deep and acoustic features

  • Sun, Linhui;Li, Qiu;Fu, Sheng;Li, Pingan
    • ETRI Journal / Vol. 44, No. 3 / pp.462-475 / 2022
  • Although researchers have proposed numerous techniques for speech emotion recognition, its performance remains unsatisfactory in many application scenarios. In this study, we propose a speech emotion recognition model based on a genetic algorithm (GA)-decision tree (DT) fusion of deep and acoustic features. To express speech emotional information more comprehensively, frame-level deep and acoustic features are first extracted from the speech signal. Next, five kinds of statistical variables of these features are calculated to obtain utterance-level features. The Fisher feature selection criterion is employed to select high-performance features and remove redundant information. In the feature fusion stage, the GA is used to adaptively search for the best feature fusion weight. Finally, using the fused features, the proposed speech emotion recognition model based on a DT-support vector machine (SVM) is realized. Experimental results on the Berlin speech emotion database and the Chinese emotion speech database indicate that the proposed model outperforms an average-weight fusion method.
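A toy sketch of the fusion-weight search stage: a tiny genetic algorithm evolves a scalar weight that mixes two feature views, and fitness is the cross-validated accuracy of an SVM on the fused features. The population size, mutation scale, data, and use of a single weight are simplifying assumptions; the paper's DT structure and Fisher selection are not shown.

```python
# Toy sketch: GA search over a single fusion weight, scored by SVM cross-validation.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
deep = rng.normal(size=(n, 10)) + y[:, None] * 0.8       # placeholder "deep" features
acoustic = rng.normal(size=(n, 10)) + y[:, None] * 0.3   # placeholder acoustic features

def fitness(w):
    fused = np.hstack([w * deep, (1.0 - w) * acoustic])
    return cross_val_score(SVC(), fused, y, cv=3).mean()

pop = rng.random(8)                                       # initial population of weights in [0, 1]
for _ in range(10):                                       # a few GA generations
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-4:]]                # select the best half
    idx = rng.integers(0, 4, size=(4, 2))
    children = parents[idx].mean(axis=1)                  # crossover: average two parents
    children = np.clip(children + rng.normal(0, 0.1, 4), 0, 1)   # mutation
    pop = np.concatenate([parents, children])

best = pop[np.argmax([fitness(w) for w in pop])]
print("best fusion weight:", round(float(best), 3))
```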

MSFM: Multi-view Semantic Feature Fusion Model for Chinese Named Entity Recognition

  • Liu, Jingxin;Cheng, Jieren;Peng, Xin;Zhao, Zeli;Tang, Xiangyan;Sheng, Victor S.
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 6 / pp.1833-1848 / 2022
  • Named entity recognition (NER) is an important basic task in the field of Natural Language Processing (NLP). Recently, deep learning approaches that extract word-segmentation or character features have been shown to be effective for Chinese Named Entity Recognition (CNER). However, because these methods focus on extracting only some of the features, they lack textual information mining from multiple perspectives and dimensions, so the models cannot fully capture semantic features. To tackle this problem, we propose a novel Multi-view Semantic Feature Fusion Model (MSFM). The proposed model mainly consists of two core components: a Multi-view Semantic Feature Fusion Embedding Module (MFEM) and a Multi-head Self-Attention Mechanism Module (MSAM). Specifically, the MFEM extracts character features, word-boundary features, radical features, and pinyin features of Chinese characters. The acquired character-shape, character-sound, and character-meaning features are fused to enhance the semantic information of Chinese characters at different granularities. Moreover, the MSAM captures the dependencies between characters in a multi-dimensional subspace to better understand the semantic features of the context. Extensive experimental results on four benchmark datasets show that our method improves the overall performance of the CNER model.
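To make the attention component concrete, the sketch below fuses several per-character embedding views by concatenation and runs multi-head self-attention over the character sequence with PyTorch's built-in module. The embedding sizes, vocabulary sizes, and number of heads are invented for illustration and do not correspond to the paper's configuration.

```python
# Sketch: concatenate multi-view character embeddings, then multi-head self-attention.
import torch
import torch.nn as nn

batch, seq_len = 2, 16
char_ids    = torch.randint(0, 5000, (batch, seq_len))
radical_ids = torch.randint(0, 300, (batch, seq_len))
pinyin_ids  = torch.randint(0, 400, (batch, seq_len))

char_emb    = nn.Embedding(5000, 64)
radical_emb = nn.Embedding(300, 32)
pinyin_emb  = nn.Embedding(400, 32)

# Fuse the views into one 128-dim representation per character.
fused = torch.cat([char_emb(char_ids), radical_emb(radical_ids), pinyin_emb(pinyin_ids)], dim=-1)

attn = nn.MultiheadAttention(embed_dim=128, num_heads=4, batch_first=True)
out, weights = attn(fused, fused, fused)     # self-attention over the character sequence
print(out.shape)                             # torch.Size([2, 16, 128])
```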