• Title/Summary/Keyword: 스펙트로그램 (spectrogram)


Coding History Detection of Speech Signal using Deep Neural Network (심층 신경망을 이용한 음성 신호의 부호화 이력 검출)

  • Cho, Hyo-Jin;Jang, Won;Shin, Seong-Hyeon;Park, Hochong
    • Journal of Broadcast Engineering
    • /
    • v.23 no.1
    • /
    • pp.86-92
    • /
    • 2018
  • In this paper, we propose a method for detecting the coding history of a digital speech signal. In digital speech communication and storage, the signal is encoded to reduce the number of bits. Therefore, when a speech waveform is given, we need to detect its coding history to determine whether the signal is an original or a coded one and, if coded, how many times it has been coded. We propose a coding history detection method for the 12.2 kbps AMR codec that distinguishes among original, single-coded, and double-coded signals. The proposed method extracts a speech-specific feature vector from the given speech and models the feature vector using a deep neural network. We confirm that the proposed feature vector provides better coding history detection performance than a feature vector computed from the general spectrogram.

Time-Synchronization Method for Dubbing Signal Using SOLA (SOLA를 이용한 더빙 신호의 시간축 동기화)

  • 이기승;지철근;차일환;윤대희
    • Journal of Broadcast Engineering
    • /
    • v.1 no.2
    • /
    • pp.85-95
    • /
    • 1996
  • The purpose of this paper is to propose a time-synchronization technique for dubbed signals based on the SOLA (Synchronized Overlap and Add) method, which has been widely used to modify the time scale of speech signals. In broadcast audio recording environments, a high level of background noise makes dubbing necessary. Since the time difference between the original and the dubbed signal is about 200 milliseconds, processing is required to synchronize the dubbed signal with the corresponding image. The proposed method finds the starting point of the dubbed signal using the short-time energy of the two signals. Thereafter, LPC cepstrum analysis and DTW (Dynamic Time Warping) are applied to synchronize the phoneme positions of the two signals. After the matched points are determined by the minimum mean square error between the original and dubbed LPC cepstra, the SOLA method is applied to the dubbed signal to maintain phase consistency. The effectiveness of the proposed method is verified by comparing the waveforms and spectrograms of the original and the time-synchronized dubbed signals.
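The DTW alignment step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar features and a squared-error local cost, whereas the paper matches LPC cepstrum vectors. The function name `dtw` and the step pattern (diagonal, vertical, horizontal) are assumptions made for illustration.

```python
def dtw(a, b):
    """Align sequences a and b; return (total_cost, path).

    Assumption: scalar features with squared-error local cost.
    path is a list of (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimum accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the matched index pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


total, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(total, path)
```

In this toy example the second sequence repeats one value, so the path maps one index of the first sequence to two indices of the second. Matched pairs of this kind are what a time-scale modification step such as SOLA would then use as synchronization points.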


Data Analysis of Inertial Sensors for Train Positioning Detection System (열차위치검지 시스템을 위한 관성센서 데이터 분석 연구)

  • Kim, Seong Jin;Park, Sungsoo;Lee, Jae-Ho;Kang, Donghoon
    • Journal of the Korean Society for Nondestructive Testing
    • /
    • v.35 no.1
    • /
    • pp.18-24
    • /
    • 2015
  • Train positioning detection information is fundamental for high-speed railroad inspection, making it possible to simultaneously determine the status and evaluate the integrity of railroad equipment. This paper presents the results of measurements and an analysis of an inertial measurement unit (IMU) used as a positioning detection sensor. Acceleration and angular rate measurements from the IMU were analyzed in the amplitude and frequency domains, with a discussion of vibration and train motion. Using these results and GPS information, positioning detection of a Korean tilting train express was performed from Naju station to Illo station on the Honam line. The results of a synchronized analysis of sensor measurements and train motion can help in the design of a train location detection system and improve positioning detection performance.

Design of the Noise Suppressor Using Wavelet Transform (웨이블릿 변환을 이용한 잡음제거기 설계)

  • 원호진;김종학;이인성
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.7
    • /
    • pp.37-46
    • /
    • 2001
  • This paper proposes a new noise suppression method using wavelet transform analysis. A noise suppressor based on the wavelet transform is more effective in babble noise than one based on the short-time Fourier transform. We designed a new channel structure based on spectral subtraction of wavelet transform coefficients and used a wavelet mask pattern with higher time resolution at high frequencies. It adapted well to babble noise, which is non-stationary. To evaluate the performance of the proposed noise suppressor, informal subjective listening tests (MOS tests) were performed in background noise environments typical of mobile communication (car noise, street noise, babble noise). In these informal listening tests, the proposed noise suppression algorithm improved the MOS by about 0.2 over the suppression algorithm of EVRC. The noise reduction achieved by the proposed method is shown in spectrograms of the speech signal.


Perception and Production of English Geminate Graphemes by Korean Students (한국 학생들의 영어 겹자음 철자 인지와 발화)

  • Cho, Mi-Hui
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2009.05a
    • /
    • pp.1092-1096
    • /
    • 2009
  • While Korean allows the same consonant at the coda of one syllable and at the onset of the following syllable, English does not allow a geminate consonant in the same position. Due to this difference between Korean and English, Korean learners of English tend to incorrectly produce geminate consonants for English geminate graphemes, as in "summer". Based on this observation, a pilot study was designed to investigate how Korean learners of English perceive and produce English doubleton graphemes and singleton graphemes. Twenty Korean college students were asked to perform a forced-choice perception test as well as a production test for 36 real-word stimuli consisting of near-minimal pairs of singleton and doubleton graphemes. The results showed that the accuracy rates for words with singleton graphemes were relatively high in both perception and production (78.6% and 76.1%, respectively), while those for words with doubleton graphemes were low in both perception and production (55.3% and 61.7%, respectively). Spectrographic analyses are also provided, showing more production errors in doubleton-grapheme words than in singleton-grapheme words.


A Study on a Intelligent GIS Monitoring System using the Preventive Diagnostic Technology (예방진단기술을 이용한 지능형 GIS 감시시스템에 관한 연구)

  • Park, Kee-Young;Lee, Jong-Ha;Cho, Sook-Jin;Choi, Hyung-Ki;Jung, Eui-Bung
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.6
    • /
    • pp.244-251
    • /
    • 2014
  • In this study, we give a detailed account of the normal and abnormal states of GIS (Gas Insulated Switchgear) using preventive diagnostic technology. The account is based on the analysis and diagnosis of GIS data stored by an intelligent GIS monitoring system. The waveform of GIS sound resembles noise and is generated systematically by discharge and the accompanying corona sound. Therefore, to classify normal and abnormal GIS sound, we use the level crossing rate (LCR) and the spectrogram energy rate, which allowed us to discriminate between normal and abnormal cases.
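The level crossing rate named in this abstract can be sketched as follows. The abstract does not give the exact definition or the level used, so this sketch assumes a common one: the fraction of adjacent sample pairs whose values lie on opposite sides of a chosen level. The function name and the `level` parameter are hypothetical.

```python
def level_crossing_rate(samples, level):
    """Fraction of adjacent sample pairs that straddle `level`.

    Assumption: a crossing is counted when consecutive samples lie
    strictly on opposite sides of the level.
    """
    if len(samples) < 2:
        return 0.0
    crossings = 0
    for prev, curr in zip(samples, samples[1:]):
        if (prev - level) * (curr - level) < 0:
            crossings += 1
    return crossings / (len(samples) - 1)


# A signal that alternates around 0.5 crosses the level at every step.
print(level_crossing_rate([0, 1, 0, 1, 0], 0.5))
# A constant signal never crosses it.
print(level_crossing_rate([1, 1, 1], 0.5))
```

With `level` set to zero this reduces to the familiar zero crossing rate. A noise-like signal tends to give a higher rate than a slowly varying one, which is presumably why a feature of this kind helps separate the two classes of sound.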

A Novel Speech Enhancement Based on Speech/Noise-dominant Decision in Time-frequency Domain (시간-주파수 영역에서 음성/잡음 우세 결정에 의한 새로운 잡음처리)

  • 윤석현;유창동
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.3
    • /
    • pp.48-55
    • /
    • 2001
  • A novel method to reduce additive non-stationary noise is proposed. The method requires neither prior information about the noise nor an estimate of the noise statistics from pause regions. The enhancement is performed on a band-by-band basis for each time frame. Based on both the decision on whether a particular band in a frame is speech-dominant or noise-dominant and the masking property of the human auditory system, an appropriate amount of noise is reduced using spectral subtraction. The proposed method was tested under various noise conditions (car noise, F16 noise, white Gaussian noise, pink noise, tank noise, and babble noise). Comparison of segmental SNR against the spectral subtraction method, visual inspection of the enhanced spectrograms, and listening to the enhanced speech showed that the method effectively reduces various types of noise while minimizing distortion of the speech.
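The segmental SNR used for evaluation in this abstract can be sketched as below. This is a generic textbook form, not the authors' exact procedure: the non-overlapping frames, the skipping of silent frames, and the clamping of each frame's SNR to the range from -10 dB to 35 dB are assumptions commonly made in practice, and the function name and parameters are hypothetical.

```python
import math


def segmental_snr(clean, enhanced, frame_len, floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between a clean and an enhanced signal.

    Assumptions: non-overlapping frames, silent frames skipped, and each
    frame's SNR clamped to [floor_db, ceil_db].
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal_energy == 0.0:
            continue  # no speech energy in this frame
        if error_energy == 0.0:
            snr_db = ceil_db  # perfect reconstruction of this frame
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        values.append(min(max(snr_db, floor_db), ceil_db))
    if not values:
        raise ValueError("no frame with non-zero signal energy")
    return sum(values) / len(values)


clean = [1.0] * 8
enhanced = [0.9] * 8
print(segmental_snr(clean, enhanced, frame_len=4))
```

Averaging in the dB domain per frame weights quiet and loud frames equally, which is why segmental SNR is often preferred over a single global SNR when judging speech enhancement.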


A Study on the Automatic Detection and Extraction of Narrowband Multiple Frequency Lines (협대역 다중 주파수선의 자동 탐지 및 추출 기법 연구)

  • 이성은;황수복
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.8
    • /
    • pp.78-83
    • /
    • 2000
  • A passive sonar system is designed to classify underwater targets by analyzing and comparing various acoustic characteristics, such as signal strength, bandwidth, number of tonals, and the relationship among tonals, obtained from the extracted tonals and frequency lines. Precise detection and extraction of signal frequency lines is therefore of particular importance for enhancing the reliability of target classification. However, narrowband frequency lines, which are the lines formed in a spectrogram by a tonal of constant frequency in each frame, may be detected only weakly or discontinuously because of variations in signal strength and transmission loss in the sea. It is also very difficult to detect and extract the signal frequency lines precisely because of the complexity of impulsive ambient noise and signal components. In this paper, an automatic detection and extraction method that can precisely detect and extract the signal components of frequency lines is proposed. The proposed method can be applied under adverse conditions with weak signal strength and high ambient noise. This is confirmed by simulation using real underwater target data.


Performance comparison of wake-up-word detection on mobile devices using various convolutional neural networks (다양한 합성곱 신경망 방식을 이용한 모바일 기기를 위한 시작 단어 검출의 성능 비교)

  • Kim, Sanghong;Lee, Bowon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.5
    • /
    • pp.454-460
    • /
    • 2020
  • Artificial intelligence assistants that provide speech recognition operate through cloud-based voice recognition with high accuracy. In cloud-based speech recognition, Wake-Up-Word (WUW) detection plays an important role in activating devices on standby. In this paper, we compare the performance of Convolutional Neural Network (CNN)-based WUW detection models for mobile devices on Google's speech commands dataset, using spectrogram and mel-frequency cepstral coefficient features as inputs. The models compared in this paper are a multi-layer perceptron, a general convolutional neural network, VGG16, VGG19, ResNet50, ResNet101, ResNet152, and MobileNet. We also propose a network that reduces the model size to 1/25 while maintaining the performance of MobileNet.
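The spectrogram feature that this abstract feeds to the networks can be illustrated with a minimal sketch. It is not taken from the paper: the Hann window, the frame length, the hop size, and the naive DFT (used here only to keep the example free of external libraries) are all assumptions, and the function name is hypothetical.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len, hop):
    """Return a list of frames, each holding frame_len // 2 + 1 DFT magnitudes.

    Assumptions: periodic Hann window and a naive O(N^2) DFT per frame.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# A sinusoid with exactly 4 cycles per 32-sample frame peaks at bin 4.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
spec = magnitude_spectrogram(tone, frame_len=32, hop=16)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
print(len(spec), len(spec[0]), peak_bins)
```

In practice an FFT routine would replace the inner loop, and mel-frequency cepstral coefficients would be derived from such magnitudes by applying a mel filterbank, a logarithm, and a discrete cosine transform.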

Design of Area-efficient Feature Extractor for Security Surveillance Radar Systems (보안 감시용 레이다 시스템을 위한 면적-효율적인 특징점 추출기 설계)

  • Choi, Yeongung;Lim, Jaehyung;Kim, Geonwoo;Jung, Yunho
    • Journal of IKEEE
    • /
    • v.24 no.1
    • /
    • pp.200-207
    • /
    • 2020
  • In this paper, an area-efficient feature extractor is proposed for security surveillance radar systems, and FPGA-based implementation results are presented. To reduce the memory requirements, features extracted from the Doppler profile for each FFT window are used, while those extracted from the total spectrogram for each frame are excluded. The proposed feature extractor was designed using Verilog-HDL and implemented on a Xilinx Zynq-7000 FPGA device. Implementation results show that the proposed design reduces the logic slice and memory requirements by 58.3% and 98.3%, respectively, compared with existing research. In addition, a security surveillance radar system with the proposed feature extractor was implemented, and experiments to classify cars, bicycles, humans, and kickboards were performed. These experiments confirm a classification accuracy of 93.4%.