• Title/Abstract/Keywords: Spectrogram

236 search results (processing time: 0.021 s)

숫자음의 스펙트럼 차이값과 상관계수를 이용한 화자인증 파라미터 연구 (A Study on Speaker Identification Parameter Using Difference and Correlation Coefficient of Digit_sound Spectrum)

  • 이후동;강선미;장문수;양병곤
    • 음성과학
    • /
    • Vol. 11, No. 3
    • /
    • pp.131-142
    • /
    • 2004
  • A speaker identification system basically works by comparing the spectral energy of an individual's production model with that of an input signal. This study aimed to develop a new speaker identification system based on two parameters derived from the spectral energy of numeric sounds: the difference sum and the correlation coefficient. A narrow-band spectrogram yielded more stable spectral energy across time than a wide-band one. We collected empirical data from four male speakers and tested the speaker identification system. The subjects produced 18 combinations of three-digit numeric sounds ten times each. Five productions of each three-digit number were statistically averaged to make a model for each speaker, and the remaining five productions were tested on the system. Results showed that when the threshold for the absolute difference sum was set to 1200, none of the speakers could pass the system, whereas all of them passed when it was set to 2800. The minimum correlation coefficient that allowed all speakers to pass was 0.82, while a coefficient of 0.95 rejected all of them. Thus, both threshold levels can be adjusted to the needs of a speaker identification system, which merits further study.

  • PDF
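
The two acceptance criteria described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy four-bin spectra, and the rule that both criteria must hold at once are assumptions. Only the threshold values 2800 and 0.82 come from the abstract, and they apply to the paper's own spectral-energy scale, not to the toy data.

```python
import math

def average_model(productions):
    """Average several equal-length spectral-energy vectors into one speaker model."""
    n = len(productions)
    return [sum(column) / n for column in zip(*productions)]

def difference_sum(model, sample):
    """Sum of absolute differences between the model and an input spectrum."""
    return sum(abs(m - s) for m, s in zip(model, sample))

def correlation(model, sample):
    """Pearson correlation coefficient between the model and an input spectrum."""
    mean_m = sum(model) / len(model)
    mean_s = sum(sample) / len(sample)
    cov = sum((m - mean_m) * (s - mean_s) for m, s in zip(model, sample))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_s = sum((s - mean_s) ** 2 for s in sample)
    return cov / math.sqrt(var_m * var_s)

def accept(model, sample, max_diff=2800.0, min_corr=0.82):
    """Accept only if both criteria hold (this combination rule is an assumption)."""
    return (difference_sum(model, sample) <= max_diff
            and correlation(model, sample) >= min_corr)

# Toy spectra for illustration only; these are not data from the paper.
training = [[100.0, 400.0, 900.0, 300.0], [120.0, 380.0, 880.0, 320.0]]
model = average_model(training)
genuine = [105.0, 395.0, 870.0, 330.0]
impostor = [900.0, 300.0, 100.0, 400.0]

print(accept(model, genuine), accept(model, impostor))
```

Tightening `max_diff` or raising `min_corr` rejects more inputs, which mirrors the trade-off the abstract reports between the 1200/2800 and 0.95/0.82 settings.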

커널 스펙트럼 모델 backfitting 기반의 로그 스펙트럼 진폭 추정을 적용한 배경음과 보컬음 분리 (Music and Voice Separation Using Log-Spectral Amplitude Estimator Based on Kernel Spectrogram Models Backfitting)

  • 이준용;김형국
    • 한국음향학회지
    • /
    • Vol. 34, No. 3
    • /
    • pp.227-233
    • /
    • 2015
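  • This paper proposes background music and vocal separation using a log-spectral amplitude estimator based on kernel spectral model backfitting. Conventional separation based on kernel spectral models separates the background music from the vocals by learning a Wiener-type minimum mean-square-error gain from a model of the object to be extracted. The proposed method replaces the Wiener-type gain in the conventional kernel-spectral-model approach with log-spectral amplitude estimation, and extracts clearer background music and vocals than the conventional approach. Experimental results show that the proposed method outperforms the conventional methods.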
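
The difference between a Wiener-type gain and a log-spectral amplitude (LSA) gain can be sketched per time-frequency bin as follows. This is an illustration under assumptions: it uses the widely cited Ephraim-Malah form of the LSA gain, and whether the paper uses exactly this form inside its backfitting loop is not stated in the abstract. The names `xi` (a priori SNR) and `gamma` (a posteriori SNR) and the series-based exponential integral are choices made for this sketch.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x, terms=60):
    """E1(x) = integral from x to infinity of exp(-t)/t dt, via its power series.

    Suitable for moderate arguments (roughly 0 < x <= 10); larger x would
    need a continued-fraction or asymptotic form instead.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k          # term is now (-x)^k / k!
        total += term / k
    return -EULER_GAMMA - math.log(x) - total

def wiener_gain(xi):
    """Wiener-type gain computed from the a priori SNR xi."""
    return xi / (1.0 + xi)

def lsa_gain(xi, gamma):
    """LSA gain: the Wiener gain scaled by exp(0.5 * E1(v)), v = xi / (1 + xi) * gamma."""
    v = xi / (1.0 + xi) * gamma
    return wiener_gain(xi) * math.exp(0.5 * exp_integral_e1(v))

# Example bin with assumed SNR values: the LSA gain is slightly larger.
print(wiener_gain(1.0), lsa_gain(1.0, 2.0))
```

Because E1 is positive, the LSA gain in this form is never smaller than the Wiener gain for the same bin, so it attenuates low-SNR bins less aggressively.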

진동센서 기반 걸음걸이 검출 및 분류 알고리즘 (Footstep Detection and Classification Algorithms based Seismic Sensor)

  • 강윤정;이재일;배진호;이종현
    • 전자공학회논문지
    • /
    • Vol. 52, No. 1
    • /
    • pp.162-172
    • /
    • 2015
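  • This paper proposes an adaptive footstep detection algorithm and an algorithm that classifies the movement of a single footstep from the detected signal. Because the proposed single-footstep algorithm does not rely on a sequence of consecutive footstep signals, as conventional classifiers do, it can detect and classify individual, irregular movements as well as overall movement. The feature vectors used for classification are obtained from the Fourier spectrum, the CWT spectrum, and the AR-model spectrum of the footstep signal, and from AR spectrogram images. When single-footstep movements were classified with an SVM, the feature vectors obtained from the AR spectrogram gave a classification performance of over 90%.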
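
An AR spectrogram of the kind mentioned in this abstract can be sketched by fitting an autoregressive model to each frame and evaluating its power spectrum. This is a minimal sketch under assumptions: the abstract does not say how the AR coefficients are estimated, so the autocorrelation method with the Levinson-Durbin recursion is used here, and the frame length, hop, model order, and number of frequency bins are placeholder parameters.

```python
import cmath
import math

def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations; returns ([1, a1, ..., ap], error power)."""
    a = [1.0] + [0.0] * order
    err = r[0]                      # a silent frame (r[0] == 0) is not handled
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err              # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def ar_spectrum(a, err, n_bins):
    """AR power spectrum at n_bins normalized frequencies in [0, 0.5)."""
    spectrum = []
    for b in range(n_bins):
        w = 2.0 * math.pi * 0.5 * b / n_bins
        denom = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        spectrum.append(err / abs(denom) ** 2)
    return spectrum

def ar_spectrogram(x, frame_len, hop, order, n_bins):
    """One AR spectrum per frame; rows are frames, columns are frequency bins."""
    rows = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        a, err = levinson_durbin(autocorrelation(frame, order), order)
        rows.append(ar_spectrum(a, err, n_bins))
    return rows

# Synthetic test signal, not seismic data.
signal = [math.sin(0.3 * i) for i in range(64)]
image = ar_spectrogram(signal, frame_len=32, hop=16, order=2, n_bins=8)
print(len(image), len(image[0]))
```

Each row of `image` could then be flattened into a feature vector for a classifier such as an SVM, which is the role the abstract assigns to the AR spectrogram.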

뇌 손상 후 실어증 환자의 언어치료 프로그램 kMIT의 개발 및 임상적 효과 (Development of Speech-Language Therapy Program kMIT for Aphasic Patients Following Brain Injury and Its Clinical Effects)

  • 김현기;김연희;고명환;박종호;김선숙
    • 음성과학
    • /
    • Vol. 9, No. 4
    • /
    • pp.237-252
    • /
    • 2002
  • MIT has been applied to nonfluent aphasic patients on the basis of the lateralization of the brain hemispheres. However, its application to different languages raises questions for aphasic patients because of prosodic and rhythmic differences. The purpose of this study is to develop a Korean Melodic Intonation Therapy (kMIT) program for the personal computer and to examine its clinical effects on nonfluent aphasic patients. The algorithm comprised analog voice signal input, PCM, AMDF, the short-time autocorrelation function, and center clipping. The main menu displays pitch, waveform, sound intensity, and speech files in windows. Aphasic patients' intonation patterns are overlaid on selected kMIT patterns. Three aphasic patients with or without kMIT training participated in this study. Four affirmative sentences and two interrogative sentences were uttered on CSL in response to stimuli from the ST. VOT, VD, Hold, and TD were measured on the spectrogram. In addition, articulation disorders and intonation patterns were evaluated objectively on the spectrogram. The results indicated that the nonfluent aphasic patients in the kMIT training group showed some clinical improvement in speech intelligibility, based on VOT and TD values, articulation evaluation, and changes in prosodic pattern.

  • PDF
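
Two of the pitch-tracking components named in this abstract, center clipping and the AMDF (average magnitude difference function), can be sketched as follows. This is a minimal illustration, not the kMIT code: the clipping ratio, the pitch search range, and the function names are assumptions, and the short-time autocorrelation stage is omitted.

```python
import math

def center_clip(frame, ratio=0.3):
    """Zero out samples below a fraction of the peak and shift the rest toward zero."""
    limit = ratio * max(abs(v) for v in frame)
    clipped = []
    for v in frame:
        if v > limit:
            clipped.append(v - limit)
        elif v < -limit:
            clipped.append(v + limit)
        else:
            clipped.append(0.0)
    return clipped

def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by lag."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def estimate_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Return the pitch in Hz at the lag where the AMDF of the clipped frame is smallest."""
    clipped = center_clip(frame)
    shortest = int(fs / f_max)
    longest = int(fs / f_min)       # the frame must be longer than this lag
    best_lag = min(range(shortest, longest + 1), key=lambda lag: amdf(clipped, lag))
    return fs / best_lag

# Synthetic 100 Hz tone sampled at 8 kHz (50 ms frame), not patient speech.
fs = 8000
tone = [math.sin(2.0 * math.pi * 100.0 * i / fs) for i in range(400)]
print(estimate_pitch(tone, fs))
```

Running the estimator frame by frame over an utterance would give a pitch contour of the kind that could be overlaid on a target intonation pattern.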

성대형태 및 음향발현에서 성악 발성 및 판소리 발성의 비교 연구 (A Comparative Study of Western Singer's Voice and a Pansori Singer's Voice Based on Glottal Image and Acoustic Characteristics)

  • 김선숙
    • 음성과학
    • /
    • Vol. 11, No. 2
    • /
    • pp.165-177
    • /
    • 2004
  • Western singers' voices have been studied in music science since the early 20th century. However, Korean traditional singers' voices have not yet been studied scientifically. This study aims to identify the physiological and acoustic characteristics of Pansori singers' voices, with Western singers included for comparison. Ten Western singers and ten Pansori singers participated in this study. The subjects spoke and sang seven simple vowels /a, e, i, o, u, c, w/. Glottal images were analyzed with Scope View, and the acoustic characteristics of the speaking and singing voice were analyzed with CSL. The results are as follows: (1) The glottal gestures of Pansori singers showed asymmetric vocal folds. (2) The singing vowel formants of Pansori singers showed breathiness on the spectrogram. (3) The music formant of Western singers appeared at around 3 kHz, whereas that of Pansori singers appeared in the low-frequency region. Vibrato modulation occurred at about 6 cycles per second in Western singers. Pansori singers showed no deep vibrato modulation on the spectrogram.

  • PDF

편도적출술로 음성변화가 올 수 있는 편도 상태에 관한 연구 (The Study of Tonsil Affected Voice Quality after Tonsillectomy)

  • 안철민;정덕희
    • 대한후두음성언어의학회지
    • /
    • Vol. 9, No. 1
    • /
    • pp.32-37
    • /
    • 1998
  • Tonsillectomy is one of the most commonly performed operations in otolaryngology. It can change the range of voice, tone, voice quality, and resonance, and some patients suffer from such voice problems after the operation. However, these problems have so far received little study. We therefore sought the anatomical findings that affect voice quality after tonsillectomy. We evaluated the voice in two groups: one group showed a normal pharyngeal space on transnasal fiberscopy, and the other showed tonsils bulging medially into the pharyngeal cavity on the same examination. The evaluation covered perceptual evaluation, nasalance score, nasality, oral formant, and nasal formant. We used a computerized speech analysis system, the nasometer, and the spectrogram in the CSL program. We found no differences in perceptual evaluation between the two groups, but the objective measures showed changes. In Group 2, the nasalance score and nasality on nasometric analysis increased significantly, and the oral formant on the spectrogram changed significantly after tonsillectomy. The authors conclude that medially bulging tonsils in the pharynx, as observed through the nasal cavity by fiberscopy, can affect voice quality after tonsillectomy, and that this evaluation would be especially important in singers.

  • PDF

한국어 방언 음성의 실험적 연구 (An Experimental Study of Korean Dialectal Speech)

  • 김현기;최영숙;김덕수
    • 음성과학
    • /
    • Vol. 13, No. 3
    • /
    • pp.49-65
    • /
    • 2006
  • Recently, several theories of digital speech signal processing have drastically expanded the boundary of communication between human beings and machines. The aim of this study is to collect dialectal speech in Korea on a large scale and to establish a digital speech database for further research on Korean dialects and for the creation of value-added networks. 528 informants across the country participated in this study. The acoustic characteristics of vowels and consonants were analyzed with the power spectrum and spectrogram of CSL. Test words were presented on picture cards and letter cards that contained each vowel and each consonant in word-initial position. Formants were plotted on a vowel chart, and the transitions of diphthongs were compared across dialects. Spectral times, VOT, VD, and TD were measured on the spectrogram for stop consonants, and fricative frequency, intensity, and lateral formants (LF1, LF2, LF3) for fricative consonants. Nasal formants (NF1, NF2, NF3) were analyzed for the different nasalities of nasal consonants. The acoustic characteristics of dialectal speech showed that younger speakers did not distinguish between close-mid /e/ and open-mid /ɛ/. The diphthongs /we/ and /wj/ were realized as simple vowels or as diphthongs depending on the dialect. The sibilant /s/ showed aspiration preceding the fricative noise. Lateral /l/ was realized as the variant /r/ in Kyungsang dialectal speech. The duration of nasal consonants in Chungchong dialectal speech was the longest among the dialects.

  • PDF

Implementation of Cough Detection System Using IoT Sensor in Respirator

  • Shin, Woochang
    • International journal of advanced smart convergence
    • /
    • Vol. 9, No. 4
    • /
    • pp.132-138
    • /
    • 2020
  • Worldwide, the number of confirmed cases of coronavirus disease 2019 (COVID-19) is rapidly increasing. Although vaccines and treatments for COVID-19 are being developed, the disease is unlikely to disappear completely. By attaching a smart sensor to the respirator worn by medical staff, Internet of Things (IoT) technology and artificial intelligence (AI) technology can be used to automatically detect the staff's infection symptoms. Medical staff who show symptoms of the disease can then be given appropriate treatment to protect them from greater risk. In this study, we design and develop a system that detects coughing, a typical symptom of respiratory infectious diseases, by applying IoT and AI technology to respiratory protection. Because the cough sound is distorted inside the respirator, it is difficult to guarantee the accuracy of an AI model trained on general cough sounds. Therefore, coughing and non-coughing sounds were recorded with a sensor attached to a respirator, and AI models were trained and evaluated on these data. A mel-spectrogram conversion was used to classify the sound data efficiently, and the developed cough recognition system achieved a sensitivity of 95.12%, a specificity of 100%, and an overall accuracy of 97.94%.
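
The three performance figures in this abstract follow from a confusion matrix, as the sketch below shows. The counts used here are hypothetical: the abstract does not report them, and they were chosen only because they happen to reproduce the reported percentages. The function name is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # coughs correctly detected
    specificity = tn / (tn + fp)            # non-coughs correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts (not from the paper): 41 cough and 56 non-cough clips.
sens, spec, acc = classification_metrics(tp=39, fn=2, tn=56, fp=0)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")   # prints "95.12% 100.00% 97.94%"
```

A specificity of 100% means no non-cough sound was flagged as a cough in the evaluation, so all of the reported errors are missed coughs.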

Proposal of a new method for learning of diesel generator sounds and detecting abnormal sounds using an unsupervised deep learning algorithm

  • Hweon-Ki Jo;Song-Hyun Kim;Chang-Lak Kim
    • Nuclear Engineering and Technology
    • /
    • Vol. 55, No. 2
    • /
    • pp.506-515
    • /
    • 2023
  • This study seeks a method for learning the engine sound of a diesel generator installed in a nuclear power plant after start-up with an unsupervised deep learning algorithm (CNN autoencoder), and a new method for predicting diesel generator failure using it. To learn the sound of a diesel generator with a deep learning algorithm, sound data recorded before and after the start-up of two diesel generators were used. The 20 min and 2 h sound recordings were cut into 7 s segments, and each segment was converted into a spectrogram image. 1200 and 7200 spectrogram images were created from the 20 min and 2 h recordings, respectively. Using two different deep learning algorithms (CNN autoencoder and binary classification), we investigated whether the diesel generator post-start sounds were learned as normal. It was possible to accurately classify the post-start sounds as normal and the pre-start sounds as abnormal. It was also confirmed that the deep learning algorithm could detect virtual abnormal sounds created by mixing unusual sounds with the post-start sounds. The unsupervised anomaly detection algorithm achieved good accuracy, about 3% higher than that of the binary classification algorithm.

수중 표적 분류를 위한 합성곱 신경망의 전처리 성능 비교 (Preprocessing performance of convolutional neural networks according to characteristic of underwater targets)

  • 박경민;김두영
    • 한국음향학회지
    • /
    • Vol. 41, No. 6
    • /
    • pp.629-636
    • /
    • 2022
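  • This paper presents an optimal preprocessing technique for improving the performance of an underwater target classifier based on a convolutional neural network. Frequency analysis of a data set of real ship underwater signals revealed a feature-representation problem caused by strong low-frequency signals. To address it, we implemented preprocessing techniques that combine various spectrogram methods with feature scaling methods. To identify the optimal preprocessing technique, we ran experiments that trained convolutional neural networks on the real data. The results confirmed that the combination of the log-mel spectrogram with standardization and robust normalization scaling gives high recognition performance and fast training.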
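
The two scaling methods named in this abstract can be sketched on a single vector of spectrogram values as follows. This is a minimal illustration under assumptions: the abstract does not say how the two scalers are applied or combined (per bin, per image, or over the whole data set), so they are shown here as independent functions, and the decibel conversion, quantile interpolation, and function names are choices made for this sketch.

```python
import math

def log_compress(power, eps=1e-10):
    """Convert power values to decibels; eps avoids log of zero."""
    return [10.0 * math.log10(p + eps) for p in power]

def standardize(values):
    """Zero-mean, unit-variance scaling."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def _quantile(sorted_values, q):
    """Quantile with linear interpolation between neighboring samples."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    ordered = sorted(values)
    median = _quantile(ordered, 0.5)
    iqr = _quantile(ordered, 0.75) - _quantile(ordered, 0.25)
    return [(v - median) / iqr for v in values]

# Toy values with one dominant entry, standing in for a strong low-frequency bin.
toy = [1.0, 2.0, 3.0, 4.0, 100.0]
print(robust_scale(toy))
print(standardize(log_compress(toy)))
```

In the toy vector, the outlier inflates the mean and standard deviation used by `standardize`, while `robust_scale` keeps the four ordinary values spread between -1 and 0.5, which illustrates why a robust scaler can help when a few strong components dominate.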