• Title/Summary/Keyword: Speech signal analysis


A Study of Emotional Variation Tendency by Movie Genre Based on Speech Signal Analysis (음성신호 분석 기반의 영화 장르별 감정변화 특성 연구)

  • Yoo, Hwang-Jun;Han, Sang-Hyo;Kim, Bong-Hyun;Ka, Min-Kyoung;Cho, Dong-Uk
    • Proceedings of the KAIS Fall Conference
    • /
    • 2011.12a
    • /
    • pp.295-298
    • /
    • 2011
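  • The most remarkable of human abilities is the capacity to acquire language and use it to communicate with one another. Every language has universal properties that it shares with other languages as well as features unique to itself. Beyond this, the human voice that carries language is an important clue to the other party's psychological state during communication. In particular, language can be used only once it has been acquired, and it is shaped by the environment in which it is acquired; a person's voice and intonation change with that environment. This paper therefore applies speech signal analysis techniques to study how the visual and auditory factors of watching films of different genres affect the voice. To this end, we conducted an experiment in which, after subjects watched films of each genre, changes in vocal fold vibration and in the magnitude of voice energy were measured to analyze emotional change.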


Sasang Constitution Classification of a Middle-Aged Man Using Speech Signal Analysis (음성 정보 분석값을 통한 장년기 남성의 사상체질 분류)

  • Kim, Bong-Hyun;Lee, Se-Hwan;Park, Sun-Ae;Ka, Min-Kyoung;Cho, Dong-Uk
    • Annual Conference of KIPS
    • /
    • 2007.11a
    • /
    • pp.117-120
    • /
    • 2007
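  • Sasang constitutional medicine, which tailors medical treatment to an individual's constitution, is recognized as a valuable traditional medicine unique to Korea. The most important task in Sasang medicine is the accurate classification of the Sasang constitution. Existing classification methods, such as assessment of facial appearance and manner of speech (yongmo-sagi), assessment of body shape and bearing (chehyeong-gisang), the QSCCII questionnaire, and constitutional acupuncture, rely on the clinician's intuition. To address this problem, this paper studies how to quantify and objectify Sasang constitution classification, and proposes a method that classifies the Sasang constitution from the output values produced by speech signal analysis. Targeting middle-aged men aged 40 and over, we formed groups of subjects who showed distinct characteristics on a Sasang constitution specialist's diagnostic chart, and classified their voice characteristics to extract phonetic features. Based on the resulting values, we then classified the differences and similarities among the constitution groups to perform Sasang constitution classification.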

Creation of a Voice Recognition-Based English Aided Learning Platform

  • Hui Xu
    • Journal of Information Processing Systems
    • /
    • v.20 no.4
    • /
    • pp.491-500
    • /
    • 2024
  • To address the poor quality of speech input in online spoken-English teaching, the study builds an English teaching assistance model based on the dynamic time warping (DTW) recognition algorithm and automated voice recognition technology. To improve the algorithm's efficiency, the study modifies the speech signal's time-domain properties during the pre-processing stage and reduces the algorithm's computational effort and storage requirements. A simulation experiment is then used to evaluate the model's efficacy. The revised DTW model achieves recognition rates above 95% for all phonetic symbols, with the highest rates for voiced ("cloudy") consonants: 98.5%, 98.8%, and 98.7% across the three tests, respectively. The enhanced DTW model is also more efficient and requires less time for training and testing. In the KS value analysis, its KS value of 0.63 is the highest among the models analyzed. Among the comparative models, it also shows the lowest curve position for both test functions. These results indicate that the upgraded DTW model has superior voice recognition capabilities, which could significantly improve online English education and lead to better teaching outcomes.
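
For readers unfamiliar with DTW, the following is a minimal sketch of the textbook algorithm, not the revised model described in the abstract. It assumes each utterance has already been reduced to a one-dimensional feature sequence; the function names `dtw_distance` and `classify` and the template dictionary are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, templates):
    """Return the label of the stored template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical templates: one feature sequence per phonetic symbol.
templates = {"a": [1.0, 2.0, 3.0], "b": [3.0, 2.0, 1.0]}
print(classify([1.0, 2.0, 2.0, 3.0], templates))
```

The table costs O(len(a) * len(b)) in both time and memory, which is the kind of computational and storage overhead the abstract says its pre-processing changes are meant to reduce.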

Pronunciation Influence Analysis of Carbonate Drink and Eucalyptus Fragrance by Applying Speech Signal Processing Techniques (음성신호 처리 기술을 적용한 탄산음료와 유칼립투스 발향이 발음에 미치는 영향 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37 no.5C
    • /
    • pp.420-428
    • /
    • 2012
  • Communication skill is one of the most important abilities in a modern smart society that emphasizes NQ. Pronunciation accuracy in particular is essential for expressing one's own ideas accurately, on the view that voice accounts for 38% of the influence in personal relations. This paper therefore analyzes how carbonated drinks and eucalyptus fragrance affect the voice. For carbonated drinks, the accumulated amount consumed is varied to analyze the effect of cumulative intake. The influence of eucalyptus fragrance on pronunciation accuracy is also reported. To this end, the jitter, shimmer, pitch, and intensity of the voice are analyzed. The result is a quantified, objective, and visualized voice analysis of the effects of carbonated drinks and eucalyptus fragrance.
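
As a rough illustration of two of the measures named above, the sketch below computes jitter and shimmer using the common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value, in percent. The abstract does not say which variant the authors used, so this definition, the function name `local_perturbation_percent`, and the sample values are all assumptions. The per-cycle periods and peak amplitudes are assumed to have been extracted already.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive values, relative to the mean (%)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(values[i + 1] - values[i]) for i in range(len(values) - 1)]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical measurements for four consecutive glottal cycles.
periods_s = [0.0100, 0.0102, 0.0099, 0.0101]   # cycle durations in seconds
peak_amplitudes = [0.80, 0.78, 0.81, 0.79]     # per-cycle peak amplitudes

jitter = local_perturbation_percent(periods_s)         # cycle-to-cycle period variation
shimmer = local_perturbation_percent(peak_amplitudes)  # cycle-to-cycle amplitude variation
print(f"jitter = {jitter:.2f}%, shimmer = {shimmer:.2f}%")
```

The same function serves both measures because they differ only in what is measured per cycle: period for jitter, amplitude for shimmer.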

A Study on the Acoustic Characteristics of the Pansori by Voice Signals Analysis (음성신호 분석에 의한 판소리의 음성학적 특징 연구)

  • Kim, HyunSook
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.14 no.7
    • /
    • pp.3218-3222
    • /
    • 2013
  • Pansori is a traditional Korean vocal art; its originality and excellence as an art that combines speech and gesture have earned it global recognition as world intangible heritage. In particular, through its satirical and humorous expression and audience participation, Pansori is regarded as having high artistic value, as an art enjoyed across all social strata, and as serving a function of social integration. This paper therefore applies speech signal analysis techniques to the five Pansori madang (works) to analyze the acoustic features of Pansori and to extract how its expression correlates with society and era. For this experiment, spectrogram, pitch, stability, and intensity analyses were performed on the five madang. The experimental results indicate that, to better convey the character of a comical story while keeping the audience focused and interested, Pansori is delivered with large variation in voice energy and in the range of vocal cord tremor, and with a stable, loud voice.

Development of medical/electrical convergence software for classification between normal and pathological voices (장애 음성 판별을 위한 의료/전자 융복합 소프트웨어 개발)

  • Moon, Ji-Hye;Lee, JiYeoun
    • Journal of Digital Convergence
    • /
    • v.13 no.12
    • /
    • pp.187-192
    • /
    • 2015
  • Software that analyzes speech disorders would be highly applicable across a range of converged fields. This paper implements a user-friendly program based on CART (classification and regression trees) analysis to distinguish between normal and pathological voices using a combination of acoustical and HOS (higher-order statistics) parameters, bringing together medical information and signal processing. The acoustical parameters are Jitter(%) and Shimmer(%). The proposed HOS parameters are the means and variances of skewness (MOS and VOS) and kurtosis (MOK and VOK). The database consists of 53 normal and 173 pathological voices distributed by Kay Elemetrics. When the acoustical and proposed parameters are used together to generate the decision tree, the average accuracy is 83.11%. Finally, we developed a program with a more user-friendly interface and framework.
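
One plausible reading of the four HOS parameters is sketched below: skewness and kurtosis are computed per frame, then their mean and variance are taken across frames. The abstract does not give the frame length, the hop size, or the kurtosis convention, so the values here, along with the non-excess kurtosis and all function names, are assumptions.

```python
from statistics import fmean, pvariance


def central_moment(x, order):
    """Population central moment of the given order."""
    mu = fmean(x)
    return fmean([(v - mu) ** order for v in x])


def skewness(x):
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def kurtosis(x):
    # Non-excess kurtosis (a Gaussian gives 3); an assumed convention.
    return central_moment(x, 4) / central_moment(x, 2) ** 2


def hos_parameters(signal, frame_size, hop):
    """Mean/variance of frame-wise skewness (MOS, VOS) and kurtosis (MOK, VOK)."""
    frames = [signal[s:s + frame_size]
              for s in range(0, len(signal) - frame_size + 1, hop)]
    # Skip constant frames, whose variance is zero and moments are undefined.
    frames = [f for f in frames if central_moment(f, 2) > 0]
    if not frames:
        raise ValueError("no usable frames")
    sk = [skewness(f) for f in frames]
    ku = [kurtosis(f) for f in frames]
    return {"MOS": fmean(sk), "VOS": pvariance(sk),
            "MOK": fmean(ku), "VOK": pvariance(ku)}


# Toy signal only; real input would be a sampled voice recording.
print(hos_parameters([0.0, 1.0, 0.0, -1.0, 0.5, 2.0, -0.5, 0.1], frame_size=4, hop=2))
```

The four resulting values, together with Jitter(%) and Shimmer(%), would form the feature vector passed to a decision-tree learner such as CART.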

A Lingual Sound Analysis based on Oriental Medicine Auscultation for Heart Diseases Diagnosis (심장(心臟) 질환(疾患) 진단(診斷)을 위한 한의학적 청진(聽診) 기반의 설음(舌音) 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk;Her, Sung-Ho
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.34 no.8B
    • /
    • pp.830-838
    • /
    • 2009
  • Compared with Western medicine, which continues to develop through a variety of diagnostic devices, Oriental medicine depends on the clinician's intuition and lacks quantitative diagnostic data that can be shown visually to patients. To make Oriental diagnostic methods more objective through visualization, this paper examines the relation between the voice signal and the heart, which is regarded as the central organ and the source of life and mind. Because the heart is held to be related to the tongue in the correspondence between the five organs and sounds, we design a method for identifying the presence of heart disease based on the observation that heart patients pronounce lingual sounds inexactly. To this end, we compared and analyzed the statistical bandwidth and a morphological model of the second formant frequency of lingual sounds in the voices of a heart-disease group and a normal group. Finally, we analyzed the interrelationships in the experimental results obtained with the designed method.

Performance Analysis of a Statistical Packet Voice/Data Multiplexer (통계적 패킷 음성 / 데이터 다중화기의 성능 해석)

  • 신병철;은종관
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.11 no.3
    • /
    • pp.179-196
    • /
    • 1986
  • In this paper, the performance of a statistical packet voice/data multiplexer is studied. We assume that the multiplexer uses two separate finite queues for voice and data traffic, and that voice traffic has priority over data. For the performance analysis we divide the output link of the multiplexer into a sequence of time slots. The voice signal is modeled as an (M+1)-state Markov process, M being the packet generation period in slots. The data traffic is modeled by a simple Poisson process. In our discrete-time analysis, the queueing behavior of voice traffic is little affected by the data traffic, since voice has priority over data. Therefore, we first analyze the queueing behavior of voice traffic and then use the result to study the queueing behavior of data traffic. For the packet voice multiplexer, the input state and the voice buffer occupancy are jointly formulated as a two-dimensional Markov chain. For the integrated voice/data multiplexer we use a three-dimensional Markov chain that represents the input voice state and the buffer occupancies of voice and data. With these models, numerical performance results have been obtained by the Gauss-Seidel iteration method. The analytical results have been verified by computer simulation. The results show that there are tradeoffs among the number of voice users, output link capacity, voice queue size, and overflow probability for the voice traffic, and also among traffic load, data queue size, and overflow probability for the data traffic. There is also a tradeoff between the performance of voice and data traffic for given input traffic and link capacity. In addition, the average queueing delay of data traffic is found to be longer than the maximum buffer size when the gain of time assignment speech interpolation (TASI) is more than two and the number of voice users is small.
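
The abstract names Gauss-Seidel iteration as the numerical method but gives no formulas. The following sketch shows how the iteration can be applied to the stationary distribution of a small discrete-time Markov chain. The three-state transition matrix is a toy example, not the multiplexer model, and the function name `stationary_gauss_seidel` is hypothetical. The chain is assumed to be irreducible with every self-transition probability below 1.

```python
def stationary_gauss_seidel(P, tol=1e-12, max_iter=10_000):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Seidel sweeps.

    P is a row-stochastic matrix (list of rows). Each sweep updates pi[j] in
    place from the balance equation pi[j] * (1 - P[j][j]) = sum_{i != j} pi[i] * P[i][j],
    so later components already see the values updated earlier in the sweep.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        old = pi[:]
        for j in range(n):
            inflow = sum(pi[i] * P[i][j] for i in range(n) if i != j)
            pi[j] = inflow / (1.0 - P[j][j])
        total = sum(pi)
        pi = [p / total for p in pi]  # renormalize to a probability vector
        if max(abs(a - b) for a, b in zip(pi, old)) < tol:
            return pi
    raise RuntimeError("Gauss-Seidel iteration did not converge")


# Toy birth-death chain, e.g. a buffer holding 0, 1, or 2 packets.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
print(stationary_gauss_seidel(P))
```

In a model like the one described, the two- or three-dimensional state would first be flattened into a single index. Quantities such as overflow probability could then be read from the stationary probabilities of the full-buffer states.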


The Analysis and Recognition of Korean Speech Signal using the Phoneme (음소에 의한 한국어 음성의 분석과 인식)

  • Kim, Yeong-Il;Lee, Geon-Gi;Lee, Mun-Su
    • The Journal of the Acoustical Society of Korea
    • /
    • v.6 no.2
    • /
    • pp.38-47
    • /
    • 1987
  • Because the Korean language can be classified phonemically according to the characteristics and structure of its pronunciation, Korean syllables can be divided into phonemes such as consonants and vowels. The divided phonemes are analyzed using the partial autocorrelation method with a partial autocorrelation coefficient order of 15. The analysis shows that the characteristics of the same consonants, vowels, and end consonants are similar across syllables. The experiment is carried out by dividing 675 syllables into consonants, vowels, and end consonants. The recognition rates of consonants, vowels, end consonants, and syllables are 85.0%, 90.7%, 85.5%, and 72.1%, respectively. In conclusion, it is shown that Korean syllables, divided into phonemes, can be analyzed and recognized with minimal data and short processing time. Furthermore, it is shown that Korean syllables, words, and sentences can be recognized in the same way.
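
Partial autocorrelation (PARCOR) coefficients are commonly obtained from a frame's autocorrelation sequence with the Levinson-Durbin recursion. The sketch below follows that standard route. Only the order of 15 comes from the abstract; the recursion, the sign convention, the toy frame, and the function names are assumptions, since the abstract does not describe the authors' procedure.

```python
import math


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(max_lag + 1)]


def parcor(r, order):
    """Levinson-Durbin recursion; returns reflection coefficients k_1..k_order.

    Convention (assumed): x[n] is predicted as sum_i a[i] * x[n - i].
    Requires r[0] > 0 and a non-degenerate frame.
    """
    a = [0.0] * (order + 1)   # predictor coefficients a[1..m]
    err = r[0]                # prediction error energy
    ks = []
    for m in range(1, order + 1):
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err
        ks.append(k)
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        err *= 1.0 - k * k
    return ks


# Toy frame (sum of two sinusoids); a real frame would be a windowed phoneme segment.
frame = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(200)]
features = parcor(autocorrelation(frame, 15), order=15)
print(len(features))
```

Each phoneme segment would then be represented by its 15 coefficients, and recognition could compare these vectors against stored references. The abstract does not specify how that comparison was made.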


Analysis of the Acoustic Performance of Classrooms in Korea (국내 학교 교실의 실내음향성능 실태조사)

  • Park, Chan-Jae;Ryu, Da-Jung;Kyoung, Ju-Young;Haan, Chan-Hoon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.33 no.5
    • /
    • pp.316-325
    • /
    • 2014
  • The basic unit of a school is the classroom, and the aural environment of classrooms is an essential factor in education. Many countries have therefore made efforts to enhance the acoustical performance of classrooms. As a result, acoustic criteria including reverberation time and background noise level have been established in the US and UK for school classrooms, depending on the usage and size of the rooms. In Korea, however, there has been little research on the room acoustics of classrooms. The present study investigates the current aural environment of 15 classrooms in Korea, including elementary, middle, and high schools. The acoustic measures include RT, $D_{50}$, STI, SNR, and background noise level. The results show that the background noise levels of schools adjacent to roads exceed the US and UK standard of 35 dB(A). Also, most schools have such a low SNR that noise may interfere with speech transmission. Some schools have a longer RT than the US standard of 0.6 s, but all have high speech intelligibility.
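
Of the measures listed above, $D_{50}$ (definition) has a compact standard form: the fraction of a room impulse response's energy that arrives within the first 50 ms. The sketch below computes it from a sampled impulse response. It assumes the response starts at the arrival of the direct sound; the function name and the toy response are hypothetical, and the abstract does not describe the authors' measurement chain.

```python
def definition_d50(impulse_response, sample_rate):
    """Ratio of energy in the first 50 ms to total energy (0..1)."""
    split = int(round(0.050 * sample_rate))   # sample index at 50 ms
    energy = [s * s for s in impulse_response]
    total = sum(energy)
    if total == 0:
        raise ValueError("impulse response has no energy")
    return sum(energy[:split]) / total


# Toy exponentially decaying response sampled at 1 kHz (hypothetical data).
sample_rate = 1000
h = [0.97 ** n for n in range(600)]
print(f"D50 = {definition_d50(h, sample_rate):.2f}")
```

A higher value means more of the sound energy arrives early, which is generally associated with clearer speech.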