• Title/Summary/Keyword: cepstral metric

Locating the damaged storey of a building using distance measures of low-order AR models

  • Xing, Zhenhua;Mita, Akira
    • Smart Structures and Systems / v.6 no.9 / pp.991-1005 / 2010
  • The key to detecting damage to civil engineering structures is to find an effective damage indicator. The damage indicator should promptly reveal the location of the damage and accurately identify the state of the structure. We propose to use the distance measures of low-order AR models as a novel damage indicator. The AR model has been applied to parameterize dynamic responses, typically the acceleration response. The premise of this approach is that the distance between the models fitted to the dynamic responses of the damaged and undamaged structures may correlate with information about the damage, including its location and severity. Distance measures have been widely used in speech recognition, but they have rarely been applied to civil engineering structures. This research attempts to improve on the distance measures that have been studied so far. The effects of varying the data length, the number of parameters, and other factors were carefully studied.
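
As a rough illustration of the idea in this abstract, the sketch below fits low-order AR models to two response signals and compares them with a cepstral distance. The abstract does not say which distance measures the authors use, so the unweighted Euclidean distance between AR cepstra is an assumption here. The function names (`fit_ar`, `ar_to_cepstrum`, `cepstral_distance`, `simulate_ar`), the model order, and the synthetic signals are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_ar(x, order):
    """Least-squares AR fit: x[t] ~ sum_k a[k-1] * x[t-k]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Column k-1 holds x[t-k] for every target sample x[t], t >= order.
    X = np.column_stack([x[order - k:len(x) - k] for k in range(1, order + 1)])
    y = x[order:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def ar_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of 1/A(z), A(z) = 1 - sum_k a_k z^-k."""
    p = len(a)
    c = np.zeros(n_ceps + 1)  # c[0] is left unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def cepstral_distance(a1, a2, n_ceps=20):
    """Euclidean distance between truncated cepstra of two AR models (assumed measure)."""
    diff = ar_to_cepstrum(a1, n_ceps) - ar_to_cepstrum(a2, n_ceps)
    return float(np.sqrt(np.sum(diff ** 2)))

def simulate_ar(a, n, rng):
    """Generate n samples of an AR process driven by white Gaussian noise."""
    p = len(a)
    x = np.zeros(n + p)
    e = rng.standard_normal(n + p)
    for t in range(p, n + p):
        x[t] = np.dot(a, x[t - p:t][::-1]) + e[t]
    return x[p:]

# Synthetic stand-ins for acceleration responses (not real structural data).
rng = np.random.default_rng(0)
baseline = simulate_ar(np.array([1.2, -0.5]), 5000, rng)
same_state = simulate_ar(np.array([1.2, -0.5]), 5000, rng)
changed = simulate_ar(np.array([1.0, -0.5]), 5000, rng)

a_ref = fit_ar(baseline, 2)
d_same = cepstral_distance(a_ref, fit_ar(same_state, 2))
d_changed = cepstral_distance(a_ref, fit_ar(changed, 2))
print(f"distance, unchanged dynamics: {d_same:.4f}")
print(f"distance, changed dynamics:   {d_changed:.4f}")
```

In this toy setting the distance to the signal with altered AR coefficients should come out larger than the distance to a second realization of the same process. How such a distance relates to the location and severity of damage in a real structure is the subject of the paper and is not shown here.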

RoutingConvNet: A Light-weight Speech Emotion Recognition Model Based on Bidirectional MFCC

  • Hyun Taek Lim;Soo Hyung Kim;Guee Sang Lee;Hyung Jeong Yang
    • Smart Media Journal / v.12 no.5 / pp.28-35 / 2023
  • In this study, we propose RoutingConvNet, a new light-weight model with fewer parameters, to improve the applicability and practicality of speech emotion recognition. To reduce the number of learnable parameters, the proposed model connects bidirectional MFCCs on a channel-by-channel basis to learn long-term emotion dependence and extract contextual features. A light-weight deep CNN is constructed for low-level feature extraction, and self-attention is used to obtain channel and spatial information from the speech signal. In addition, we apply dynamic routing to improve accuracy and to construct a model that is robust to feature variations. In experiments on the speech emotion datasets EMO-DB, RAVDESS, and IEMOCAP, the proposed model reduces the parameter count while improving accuracy, achieving 87.86%, 83.44%, and 66.06% accuracy respectively with about 156,000 parameters. We also propose a metric that quantifies the trade-off between the number of parameters and accuracy for evaluating light-weight models.
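
One possible reading of "connecting bidirectional MFCCs channel by channel" is sketched below: an MFCC matrix and its time-reversed copy are stacked as two input channels for a CNN. The abstract does not define the construction, so this interpretation, the function name `bidirectional_channels`, and the array shapes are assumptions. A random array stands in for real MFCC features.

```python
import numpy as np

def bidirectional_channels(mfcc):
    """Stack an MFCC matrix and its time-reversed copy as two channels.

    mfcc: array of shape (n_mfcc, n_frames).
    Returns an array of shape (2, n_mfcc, n_frames).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    backward = mfcc[:, ::-1]  # reverse the frame (time) axis
    return np.stack([mfcc, backward], axis=0)

# Placeholder features: 13 coefficients over 100 frames (not real speech data).
rng = np.random.default_rng(0)
features = rng.standard_normal((13, 100))
stacked = bidirectional_channels(features)
print(stacked.shape)
```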