• Title/Summary/Keyword: tonal features

On the Importance of Tonal Features for Speech Emotion Recognition (음성 감정인식에서의 톤 정보의 중요성 연구)

  • Lee, Jung-In;Kang, Hong-Goo
    • Journal of Broadcast Engineering / v.18 no.5 / pp.713-721 / 2013
  • This paper describes the effectiveness of chroma-based tonal features for speech emotion recognition. Just as the tonality of major or minor keys affects the perception of musical mood, speech tonality affects the perception of the emotional state of spoken utterances. To support this claim, subjective listening tests are carried out using signals synthesized from chroma features; the results show that tonality contributes especially to the perception of negative emotions such as anger and sadness. In automatic emotion recognition tests, the modified chroma-based tonal features yield a noticeable improvement in accuracy when they supplement conventional log-frequency power coefficient (LFPC)-based spectral features.
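
As a rough illustration of what a chroma feature is, the sketch below folds spectral power into 12 pitch-class bins. It is a generic chroma computation, not the paper's "modified" chroma-based feature, whose details the abstract does not give. The function names, the A = 440 Hz reference, and the pitch-class numbering (0 = A) are assumptions made for this example.

```python
import math

def pitch_class(freq_hz, ref_hz=440.0):
    """Return the pitch class (0 = A, 1 = A#, ..., 11 = G#) nearest to freq_hz."""
    semitones = 12.0 * math.log2(freq_hz / ref_hz)
    return round(semitones) % 12

def chroma_vector(freqs_hz, powers):
    """Fold spectral power into 12 pitch-class bins and normalize to sum to 1."""
    chroma = [0.0] * 12
    for f, p in zip(freqs_hz, powers):
        if f > 0:  # skip the DC component; log2 is undefined at 0 Hz
            chroma[pitch_class(f)] += p
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma

# Hypothetical spectrum: two octaves of A and one C; octaves collapse to one bin.
example = chroma_vector([220.0, 440.0, 261.63], [1.0, 1.0, 2.0])
```

Because octaves are merged, the vector describes which pitch classes carry energy regardless of register, which is what makes it a tonality descriptor rather than a plain spectral one.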

Image-based Extraction of Histogram Index for Concrete Crack Analysis

  • Kim, Bubryur;Lee, Dong-Eun
    • International Conference on Construction Engineering and Project Management / 2022.06a / pp.912-919 / 2022
  • This study presents an image-based assessment that uses image processing techniques to determine the condition of concrete with surface cracks. Dataset preparation includes resizing and filtering the images to ensure statistical homogeneity and reduce noise. The images are then segmented, which makes them better suited to feature extraction and easier to evaluate. Each image is converted to grayscale, which removes hue and saturation but retains luminance. Edge detection is applied to extract the major edge features and produce a clean edge map. The Otsu method is used to select a threshold that minimizes the intraclass variance of the black and white pixel classes. In addition, a median filter is employed to reduce noise while preserving edges. These techniques enhance the significant features of the concrete image, especially defects. The tonal zones of the histogram and their properties are then used to analyze the condition of the concrete. The histogram shows the number of pixels at each tonal value, so the viewer can read the tonal distribution of the image from the graph. The features of the five tonal zones of the histogram, which reflect the qualities of the concrete image, may be evaluated in terms of contrast, brightness, highlights, shadow spikes, or the condition of the shadow region that corresponds to the foreground.
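
The two histogram-based steps mentioned above can be sketched on a 256-bin grayscale histogram. This is a minimal illustration, not the authors' pipeline: the zone names and the equal-width split into five zones are assumptions, since the abstract does not define the zone boundaries, and the histogram is assumed to be non-empty.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes between-class variance.

    Maximizing between-class variance is equivalent to minimizing
    within-class (intraclass) variance. Pixels <= t form one class.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    w0, sum0 = 0, 0
    best_t, best_between = 0, -1.0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty; no valid split at this level
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t

# Assumed zone names; the abstract only says there are five tonal zones.
ZONES = ("blacks", "shadows", "midtones", "highlights", "whites")

def tonal_zone_fractions(hist):
    """Return the fraction of pixels in each of five equal-width tonal zones."""
    total = sum(hist)
    fractions = [0.0] * 5
    for level, count in enumerate(hist):
        fractions[level * 5 // len(hist)] += count / total
    return dict(zip(ZONES, fractions))

# Hypothetical bimodal histogram: dark crack pixels and bright surface pixels.
hist = [0] * 256
hist[50] = 10
hist[200] = 10
threshold = otsu_threshold(hist)
zones = tonal_zone_fractions(hist)
```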

A Neural Network Based Korean Segmental Duration Modeling Using Tonal Information of Phonemes (음소별 성조 정보를 이용한 신경망 기반의 한국어 음소 지속시간 모델링)

  • 김은경;이상호;오영환
    • The Journal of the Acoustical Society of Korea / v.18 no.6 / pp.84-88 / 1999
  • Accurate estimation of segmental duration is crucial for natural-sounding text-to-speech synthesis. Conventional methods for predicting Korean segmental durations use phonemic context, part-of-speech context, and positional information within the prosodic phrase. In this paper, the tonal information of phonemes is employed for more accurate prediction. After defining two non-boundary tones and six boundary tones, we annotated each syllable of 400 sentences with a tonal label. To predict segmental duration from tonal information, we constructed neural networks with a real-valued output node that predicts phonemic duration and trained them with the backpropagation algorithm. Experimental results show that the proposed features are effective for predicting Korean segmental durations; the correlation coefficient between the observed and predicted durations was 0.863.
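
The reported evaluation figure is a correlation coefficient between observed and predicted durations. Assuming the usual Pearson correlation (the abstract does not name the variant), it could be computed as in the sketch below; the function name and the sample durations are made up for illustration.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical phoneme durations in milliseconds (not data from the paper).
observed_ms = [62.0, 85.0, 110.0, 47.0, 93.0]
predicted_ms = [58.0, 90.0, 101.0, 55.0, 88.0]
r = pearson_r(observed_ms, predicted_ms)
```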

Classification and Tracking of Unknown Multiple Underwater Moving Objects Using Neural Networks (신경망에 의한 미지의 다중 수중 이동물체의 판별 및 추적)

  • 하석운
    • Journal of the Korea Institute of Information and Communication Engineering / v.3 no.2 / pp.389-396 / 1999
  • In this paper, we propose an algorithm for classifying and tracking multiple underwater objects using narrowband tonal and frequency line features extracted from the frequency spectrum of the acoustic signal. Conventional algorithms that use wideband and narrowband energy show high tracking error when objects are close to or cross each other, whereas the proposed algorithm shows good tracking performance on simulation scenarios generated from real acoustic data.
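
To make "narrowband tonal features extracted from the frequency spectrum" concrete, the sketch below picks spectral lines that stand well above the local background. It is a generic peak-picking illustration, not the paper's feature extractor; the function names, window size, and threshold ratio are assumptions.

```python
import cmath
import math
import statistics

def power_spectrum(x):
    """One-sided power spectrum via a direct DFT (fine for short frames)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2 / n
        for k in range(n // 2 + 1)
    ]

def tonal_bins(spectrum, half_window=8, ratio=20.0):
    """Return bins that are local maxima and exceed ratio x the local median power."""
    peaks = []
    for k in range(1, len(spectrum) - 1):
        lo = max(0, k - half_window)
        hi = min(len(spectrum), k + half_window + 1)
        background = statistics.median(spectrum[lo:hi])
        is_local_max = spectrum[k] >= spectrum[k - 1] and spectrum[k] >= spectrum[k + 1]
        if is_local_max and spectrum[k] > ratio * background:
            peaks.append(k)
    return peaks
```

Tracking would then follow how these bins move from frame to frame, which is where the frequency line features mentioned in the abstract come in.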

A Study on Auditory Perception Characteristics of Directional Tonal Noise (방향성을 가진 회전체 소음의 청각계 인지 특성에 관한 연구)

  • Seo, Kang-Won;Kim, Eui-Youl;Kim, Sung-Ki
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2012.04a / pp.348-353 / 2012
  • This paper presents an HRTF-based experimental approach to explain why, in psychoacoustic terms, human auditory perception of an interior noise source that includes directional tonal components does not match well with the dominant features extracted from recorded acoustic signals. Because general objective evaluation models for tonalness are functions of the width, frequency, and excess level of the tonal components, directional tonal components cannot be evaluated properly without considering the effect of the head-related transfer function (HRTF) on binaural auditory perception. The directivity of the source is therefore also considered, to prevent erroneous conclusions being drawn from the same sound source during source identification. A signal synthesis technique is used to overcome the difficulty of measuring all of the acoustic signals needed for jury evaluation. The sound attributes of the synthesized acoustic signals are analyzed with sound quality metrics such as loudness, sharpness, roughness, fluctuation strength, and tonality to roughly predict the jury evaluation results in advance. The jury evaluation follows the guideline recommended by N. Otto et al. Each sound is rated on a scale from -2 to 2 in steps of 0.2. The jury evaluation results confirm that the type of microphone and playback system can cause serious problems when analyzing the dominant sound attributes in psychoacoustic terms.

Prosodic features and discourse functions of discourse marker 'mak'('막') ('막'의 운율적 특성과 담화적 기능)

  • Song, Inseong
    • Korean Linguistics / v.65 / pp.211-236 / 2014
  • The aim of this study is to investigate the categorical characteristics of 'mak' and its discourse functions by analyzing its prosodic features. Previous studies of 'mak' focused on grammatical or semantic characteristics, whereas this study focuses on prosodic features observed in speech data. The results show that the adverb 'mak' and the discourse marker 'mak' are distinguished by prosodic boundary, duration, pause, and the type and number of tonal patterns. The functions of the discourse marker 'mak' are as follows: maintenance of utterance, attention, delay, and expression of a negative manner. These functions have salient prosodic features. Consequently, prosodic features are important for analyzing the categorical characteristics of 'mak' and establishing its functions.

A Comparative Study on the Characteristics of the Prosodic Phrases between Autism Spectrum Disorder and Normal Children in the Reading of Korean Read Sentences (자폐 범주성 장애아동과 정상아동의 평서문 읽기에서의 운율구 특성 비교)

  • Jung, Kum-Soo;Seong, Cheol-Jae
    • MALSORI / no.65 / pp.51-65 / 2008
  • The aim of this study is to compare children with ASD (Autism Spectrum Disorder) and normal children in terms of prosodic features. The materials were collected by having the children read Korean sentences aloud: 10 declarative sentences, each consisting of 5-6 words. The subjects were 10 ASD and 10 normal male children with a receptive vocabulary age of 5;0-6;5 years. The two groups differed not only in the tonal patterns at the end of prosodic phrases, but also in the degree of the rising and falling slopes of the pitch contour. While HL% and HLH% frequently appeared in sentence-final position in the normal group, HL% and HLH% were prominent in the ASD group in the same position. LH% and LHL% IP types were observed in sentence-medial position only in the ASD group. The slope of the fundamental frequency variation at the end of the prosodic phrase was twice as steep in the ASD group as in the normal group.
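
The abstract does not say how the fundamental frequency slope was measured. One plausible reading, assumed here, is a least-squares line fitted to the F0 samples over the final stretch of the prosodic phrase; the function name and the sample values below are hypothetical.

```python
def f0_slope(times_s, f0_hz):
    """Least-squares slope of F0 over time, in Hz per second."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_f = sum(f0_hz) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, f0_hz))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den

# Hypothetical phrase-final contours: the second rises twice as steeply.
slope_a = f0_slope([0.0, 0.1, 0.2], [200.0, 210.0, 220.0])
slope_b = f0_slope([0.0, 0.1, 0.2], [200.0, 220.0, 240.0])
```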

Two Dimensional Numerical Study in Gangway of Next Generation High Speed Train For Reduction of Aero-acoustic Noise (차세대 고속전철 차량연결부의 저소음 형상설계를 위한 차량연결부의 2차원적 수치해석 연구)

  • Kang, Hyung-Min;Kim, Cheol-Wan;Cho, Tae-Hwan;Jeon, Wan-Ho;Yun, Su-Hwan;Kwon, Hyeok-Bin;Park, Chun-Su
    • Journal of the Korean Society for Railway / v.14 no.4 / pp.327-332 / 2011
  • As preliminary research for the design of the gangway in the next-generation high-speed train, the aero-acoustic noise at the gangway is calculated. For this purpose, the gangway with mud flaps is modeled as a two-dimensional cavity. Five gap sizes between the mud flaps are selected, and a parametric study is performed over these gap sizes. Aerodynamic features such as vortex shedding and pressure are computed. The aero-acoustic properties of tonal noise and overall noise are also analyzed at three microphone locations, and the relation between the mud flap gap size and the noise level is assessed. The study shows that the noise characteristics of the base and specific models are better than those of the other models.

Separation of passive sonar target signals using frequency domain independent component analysis (주파수영역 독립성분분석을 이용한 수동소나 표적신호 분리)

  • Lee, Hojae;Seo, Iksu;Bae, Keunsung
    • The Journal of the Acoustical Society of Korea / v.35 no.2 / pp.110-117 / 2016
  • Passive sonar systems detect and classify targets by analyzing the noise radiated from vessels. If multiple noise sources exist within the sonar detection range, classifying each source becomes difficult because only a mixture of the sources is observed. Beamforming is used to separate noise sources spatially and overcome this problem, but it has various limitations. In this paper, we propose a new method that uses FDICA (Frequency Domain Independent Component Analysis) to separate noise sources from the mixture. For the experiments, each noise source signal was synthesized with features such as machinery tonal components and propeller tonal components. The signals before and after separation were compared using LOFAR (Low Frequency Analysis and Recording) and DEMON (Detection of Envelope Modulation on Noise) analysis.
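
DEMON analysis looks for the modulation that a propeller imposes on broadband noise. The sketch below is a simplified version under stated assumptions: a square-law detector stands in for the envelope detector, band-pass filtering is omitted, and the synthetic signal (white noise modulated at 12 Hz, sampled at 512 Hz for one second) is invented for the example, not taken from the paper.

```python
import cmath
import math
import random

def demon_spectrum(x, max_bin):
    """DFT magnitude of the squared, mean-removed signal for bins 1..max_bin."""
    n = len(x)
    envelope = [v * v for v in x]  # square-law envelope detection
    mean = sum(envelope) / n
    envelope = [e - mean for e in envelope]  # remove DC so it cannot mask the lines
    return {
        k: abs(sum(envelope[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(1, max_bin + 1)
    }

# Synthetic propeller-like noise: with fs == n, DFT bin k corresponds to k Hz.
rng = random.Random(0)
fs = n = 512
mod_hz = 12
signal = [
    (1.0 + 0.8 * math.cos(2 * math.pi * mod_hz * t / fs)) * rng.gauss(0.0, 1.0)
    for t in range(n)
]
spectrum = demon_spectrum(signal, max_bin=50)
peak_hz = max(spectrum, key=spectrum.get)  # strongest modulation line
```

The raw spectrum of this signal is flat, because the carrier is white noise; the modulation rate only becomes visible after envelope detection, which is why DEMON complements LOFAR rather than duplicating it.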

Modality-Based Sentence-Final Intonation Prediction for Korean Conversational-Style Text-to-Speech Systems

  • Oh, Seung-Shin;Kim, Sang-Hun
    • ETRI Journal / v.28 no.6 / pp.807-810 / 2006
  • This letter presents a prediction model for sentence-final intonation in Korean conversational-style text-to-speech systems, in which we introduce the linguistic feature of 'modality' as a new parameter. Based on their function and meaning, we classify tonal forms in speech data into tone types meaningful for speech synthesis and use the result of this classification to build our prediction model with a tree-structured classification algorithm. To show that modality is more effective for the prediction model than features such as sentence type or speech act, an experiment is performed on a test set of 970 utterances with a training set of 3,883 utterances. The results show that modality contributes more to the determination of sentence-final intonation than sentence type or speech act, and that prediction accuracy improves by up to 25% when the modality feature is introduced.
