• Title/Summary/Keyword: Audio Discrimination

Search Results: 23

Feature Parameter Extraction and Analysis in the Wavelet Domain for Discrimination of Music and Speech (음악과 음성 판별을 위한 웨이브렛 영역에서의 특징 파라미터)

  • Kim, Jung-Min;Bae, Keun-Sung
    • MALSORI
    • /
    • no.61
    • /
    • pp.63-74
    • /
    • 2007
  • Discriminating music from speech in multimedia signals is an important task in audio coding and broadcast monitoring systems. This paper deals with feature parameter extraction for discrimination of music and speech. The wavelet transform is a multi-resolution analysis method useful for analyzing the temporal and spectral properties of non-stationary signals such as speech and audio. We propose new feature parameters, extracted from the wavelet-transformed signal, for discriminating music from speech. First, wavelet coefficients are obtained on a frame-by-frame basis, with the analysis frame size set to 20 ms. A parameter $E_{sum}$ is then defined by summing the magnitude differences between adjacent wavelet coefficients in each scale. The maximum and minimum values of $E_{sum}$ over a period of 2 seconds, which corresponds to the discrimination duration, are used as feature parameters. To evaluate the proposed parameters, discrimination accuracy was measured for various types of music and speech signals. In the experiment, each 2-second segment was classified as music or speech, and about 93% of the music and speech segments were correctly identified.

  • PDF
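The $E_{sum}$ feature above can be sketched in a few lines. The abstract does not specify the wavelet family, the number of decomposition scales, or the sampling rate, so the Haar wavelet, four scales, and 16 kHz below are assumptions:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    x = x[: len(x) // 2 * 2]
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def e_sum(frame, levels=4):
    """Sum, over scales, of magnitude differences between adjacent coefficients."""
    total, approx = 0.0, frame
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        total += np.abs(np.diff(np.abs(detail))).sum()
    return total

sr, frame_len = 16000, 320                    # 20 ms frames at an assumed 16 kHz
signal = np.random.randn(2 * sr)              # stand-in for a 2 s audio segment
frames = signal.reshape(-1, frame_len)        # 100 analysis frames
e = np.array([e_sum(f) for f in frames])
features = np.array([e.max(), e.min()])       # the (max, min) pair over 2 s
```

The (max, min) pair over the 2-second window is what the paper feeds to the discriminator; a real classifier would compare these two values against thresholds or class models.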

A Comparison of Speech/Music Discrimination Features for Audio Indexing (오디오 인덱싱을 위한 음성/음악 분류 특징 비교)

  • Lee, Kyung-Rok;Seo, Bong-Soo;Kim, Jin-Young
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.2
    • /
    • pp.10-15
    • /
    • 2001
  • In this paper, we compare combinations of features for speech/music discrimination, i.e., classifying audio signals as speech or music. Audio signals are classified into three classes (speech, music, speech and music) and two classes (speech, music). Experiments were carried out on three types of features, Mel-cepstrum, energy, and zero-crossings, to find the best feature combination for speech/music discrimination. We use a Gaussian mixture model (GMM) as the discrimination algorithm and combine the different features into a single vector before modeling the data with the GMM. For three classes, the best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%). For two classes, the best results are achieved using Mel-cepstrum with energy, and Mel-cepstrum with energy and zero-crossings, in a single feature vector (speech: 98.9%, music: 100%).

  • PDF
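The feature-concatenation step can be sketched as follows. This is not the paper's implementation: the real cepstrum below stands in for the Mel-cepstrum, a single full-covariance Gaussian per class stands in for the GMM, and the frame length, sampling rate, and synthetic training signals are all assumptions:

```python
import numpy as np

def frame_features(frame, n_ceps=8):
    """Concatenate a crude cepstrum, log energy, and zero-crossing rate."""
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) + 1e-10
    ceps = np.fft.irfft(np.log(spec))[:n_ceps]   # real cepstrum, Mel-cepstrum stand-in
    log_energy = np.log(np.sum(frame ** 2) + 1e-10)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
    return np.concatenate([ceps, [log_energy, zcr]])

class GaussianClass:
    """One full-covariance Gaussian per class: a 1-component stand-in for a GMM."""
    def fit(self, X):
        self.mu, self.cov = X.mean(axis=0), np.cov(X.T) + 1e-6 * np.eye(X.shape[1])
        return self
    def log_likelihood(self, x):
        d = x - self.mu
        return -0.5 * (d @ np.linalg.solve(self.cov, d) + np.linalg.slogdet(self.cov)[1])

rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * 440 * np.arange(320) / 16000)      # "music" stand-in
speech = np.array([frame_features(rng.standard_normal(320)) for _ in range(200)])
music = np.array([frame_features(tone + 0.1 * rng.standard_normal(320)) for _ in range(200)])
models = {"speech": GaussianClass().fit(speech), "music": GaussianClass().fit(music)}
label = max(models, key=lambda c: models[c].log_likelihood(frame_features(tone)))
```

Classification picks the class whose model gives the test frame's feature vector the highest log-likelihood, matching the single-vector-plus-GMM design the paper describes.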

The Audio Signal Classification System Using Contents Based Analysis

  • Lee, Kwang-Seok;Kim, Young-Sub;Han, Hag-Yong;Hur, Kang-In
    • Journal of information and communication convergence engineering
    • /
    • v.5 no.3
    • /
    • pp.245-248
    • /
    • 2007
  • In this paper, we study content-based analysis and classification of audio data, building a feature-parameter database to implement an audio indexing and search system. Audio data is first classified into primitive auditory types. We describe the analysis and extraction methods for the feature parameters available for audio classification. We then organize the feature-parameter database by index group, and compare and analyze the audio data against the audio categories using inclusion level and index criteria. Based on these results, we compose feature vectors for the audio data according to the classification categories and run classification simulations using a discriminant function.

A Study on Acoustic Signal Characterization for Al and Steel Machining by Audio Deep Learning (오디오 딥러닝을 활용한 Al, Steel 소재의 절삭 깊이에 따른 오디오 판별)

  • Kim, Tae-won;Lee, Young Min;Choi, Hae-Woon
    • Journal of the Korean Society of Manufacturing Process Engineers
    • /
    • v.20 no.7
    • /
    • pp.72-79
    • /
    • 2021
  • This study reports on an experiment using deep learning algorithms to identify the machining process applied to aluminum and steel. A face-milling tool was used, and the cutting speed was set between 3 and 4 mm/s. Both materials were machined to depths of 0.5 mm and 1.0 mm. To demonstrate the developed approach, simulation experiments were performed using the VGGish network in a MATLAB toolbox. Down-cutting was used when machining the aluminum and steel to obtain high-quality, consistent audio for learning. Training on the recorded audio data yielded 61%-99% accuracy across four categories: Al 0.5 mm, Al 1.0 mm, Steel 0.5 mm, and Steel 1.0 mm. Audio discrimination by deep learning produces a probabilistic result.
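VGGish-style models consume log-mel spectrogram patches rather than raw audio. As a sketch of that front end only, not the MATLAB implementation the paper uses, and with the FFT size, hop, and mel-band count chosen arbitrarily:

```python
import numpy as np

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def log_mel_spectrogram(x, sr=16000, n_fft=512, hop=256, n_mels=64):
    """Framed log-mel power patches, the input a VGGish-style CNN consumes."""
    frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
    power = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))      # triangular mel filterbank
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)
    return np.log(power @ fb.T + 1e-8)

x = np.random.randn(16000)         # 1 s stand-in for a machining recording
S = log_mel_spectrogram(x)         # one (frames x mel-bands) patch
```

Patches like `S` would then be fed to the network, which outputs per-category probabilities, matching the probabilistic result the abstract mentions.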

Development of Processing System for Audio-vision System Based on Auditory Input (청각을 이용한 시각 재현 시스템의 개발)

  • Kim, Jung-Hun;Kim, Deok-Kyu;Won, Chul-Ho;Lee, Jong-Min;Lee, Hee-Jung;Lee, Na-Hee;Yoon, Su-Young
    • Journal of Biomedical Engineering Research
    • /
    • v.33 no.1
    • /
    • pp.25-31
    • /
    • 2012
  • An audio-vision system was developed for visually impaired people, and its usability was verified. Ten normal volunteers were included in the subject group; their mean age was 28.8 years and the male-to-female ratio was 7:3. Usability was verified as follows. First, volunteers learned to judge the distance of obstacles and to discriminate up from down. After training with the audio-vision system, indoor and outdoor walking tests were performed. The tests were scored on up-down and lateral discrimination, distance recognition, and walking without collision, with each parameter scored from 1 to 5. The results were 93.5 ± SD (range, 86 to 100) out of 100. In this study, we converted visual information to auditory information with the audio-vision system and verified that it can be applied to the daily life of visually impaired people.
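The abstract does not describe the exact visual-to-auditory mapping, so the sketch below is purely illustrative: it assumes elevation maps to pitch, lateral position to stereo panning, and distance to loudness, which mirrors the up-down, lateral, and distance cues the volunteers were tested on:

```python
import numpy as np

def sonify(azimuth, elevation, distance, sr=8000, dur=0.2):
    """Map an obstacle position to a stereo cue (all mappings assumed):
    elevation -> pitch, azimuth -> panning, distance -> loudness."""
    t = np.arange(int(sr * dur)) / sr
    freq = 400 + 400 * elevation        # elevation in [0, 1] -> 400..800 Hz
    amp = 1.0 / (1.0 + distance)        # nearer obstacles sound louder
    pan = (azimuth + 1) / 2             # azimuth in [-1, 1] -> full left..full right
    tone = amp * np.sin(2 * np.pi * freq * t)
    return tone * (1 - pan), tone * pan  # (left, right) channels

# an obstacle up and to the right, one unit away
left, right = sonify(azimuth=0.5, elevation=0.8, distance=1.0)
```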

A Study of Automatic Detection of Music Signal from Broadcasting Audio Signal (방송 오디오 신호로부터 음악 신호 검출에 관한 연구)

  • Yoon, Won-Jung;Park, Kyu-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.5
    • /
    • pp.81-88
    • /
    • 2010
  • In this paper, we propose an automatic music/non-music discrimination system for broadcast audio signals, as a preliminary study toward a sound-source monitoring system for real broadcasting environments. Reflecting the characteristics of human speech articulation, we use three simple time-domain features: energy standard deviation, log-energy standard deviation, and log-energy mean. Based on experimentally determined threshold values for each feature, we developed a rule-based algorithm to classify the music portions of the input audio signal. To verify the proposed algorithm, an actual FM broadcast signal was recorded for 24 hours and used as the input audio. Experimental results show that the proposed system recognizes music sections with 96% accuracy and non-music sections with 87% accuracy, which is good enough for use as a pre-processing module in a sound-source monitoring system.
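The three time-domain features and the rule-based decision can be sketched directly; the frame length and the threshold below are assumptions, since the paper derives its thresholds experimentally:

```python
import numpy as np

def segment_features(x, frame_len=320):
    """The paper's three time-domain statistics, computed over frame energies."""
    frames = x[: len(x) // frame_len * frame_len].reshape(-1, frame_len)
    energy = np.sum(frames ** 2, axis=1)
    log_energy = np.log(energy + 1e-10)
    return energy.std(), log_energy.std(), log_energy.mean()

def is_music(x, max_log_e_std=0.5):
    """Rule-based decision: pauses in speech make log-energy variance high,
    while sustained music keeps it low. The threshold is a placeholder."""
    return segment_features(x)[1] < max_log_e_std

rng = np.random.default_rng(1)
steady = rng.standard_normal(16000)                  # sustained, music-like signal
gated = steady * (np.arange(16000) % 6400 < 3200)    # bursty, speech-like signal
```

The design choice here matches the abstract's rationale: speech alternates between articulation and silence, so its log-energy standard deviation is much larger than that of continuously sustained music.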

Comparison & Analysis of Speech/Music Discrimination Features through Experiments (실험에 의한 음성·음악 분류 특징의 비교 분석)

  • Lee, Kyung-Rok;Ryu, Shi-Woo;Gwark, Jae-Young
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2004.11a
    • /
    • pp.308-313
    • /
    • 2004
  • In this paper, we compare and analyze the speech/music discrimination performance of combinations of feature parameters. Audio signals are classified into three classes (speech, music, speech and music). Three types of features, Mel-cepstrum, energy, and zero-crossings, were used in the experiments, and the feature combinations were compared to find the best one for speech/music discrimination. The best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%).

  • PDF

Content Based Classification of Audio Signal using Discriminant Function (식별함수를 이용한 오디오신호의 내용기반 분류)

  • Kim, Young-Sub;Lee, Kwang-Seok;Koh, Si-Young;Hur, Kang-In
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2007.06a
    • /
    • pp.201-204
    • /
    • 2007
  • In this paper, we study content-based analysis and classification of auditory signals, building a pool of feature parameters to implement an auditory indexing and search system. Auditory data is first classified into primitive auditory types. We describe the analysis and extraction methods for the feature parameters available for auditory data classification. We then organize the feature-parameter pool by indexing group, and compare and analyze the auditory data against the audio categories using inclusion level and indexing criteria. Based on these results, we compose feature vectors for the audio data according to the classification categories and run classification experiments using a discriminant function.

  • PDF
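The abstract does not define its discrimination function, so the sketch below uses a generic linear discriminant, one score function per category, trained by least squares on one-hot targets, as a minimal illustration of classifying feature vectors this way:

```python
import numpy as np

class LinearDiscriminant:
    """Per-class discriminant g_c(x) = w_c . x + b_c, fitted by least squares
    on one-hot targets; the class with the largest g_c(x) wins."""
    def fit(self, X, y, n_classes):
        Xb = np.hstack([X, np.ones((len(X), 1))])    # append a bias column
        self.W = np.linalg.lstsq(Xb, np.eye(n_classes)[y], rcond=None)[0]
        return self
    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return np.argmax(Xb @ self.W, axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 4)),            # category 0 feature vectors
               rng.normal(3, 1, (50, 4))])           # category 1 feature vectors
y = np.array([0] * 50 + [1] * 50)
acc = (LinearDiscriminant().fit(X, y, 2).predict(X) == y).mean()
```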

Implementation of Music Signals Discrimination System for FM Broadcasting (FM 라디오 환경에서의 실시간 음악 판별 시스템 구현)

  • Kang, Hyun-Woo
    • The KIPS Transactions:PartB
    • /
    • v.16B no.2
    • /
    • pp.151-156
    • /
    • 2009
  • This paper proposes a Gaussian mixture model (GMM)-based music discrimination system for FM broadcasting. The objective of the system is to automatically archive music signals from broadcast audio programs, which normally mix human voices, songs, commercial music, and other sounds. To improve performance and robustness, and to cut the recording accurately at its start and end points, we also added a post-processing module. Experimental results on various FM radio input signals in a PC environment show excellent performance for the proposed system. A fixed-point simulation shows the same results with 3 MIPS of computational power.
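The post-processing module is not specified in detail; one common approach, sketched here under that assumption, is to smooth the frame-level GMM decisions by discarding short music runs and then report the start and end indices of the surviving segments, which is what allows accurate cut points for archiving:

```python
import numpy as np

def music_segments(frame_is_music, min_run=5):
    """Keep only music runs of at least min_run frames and return their
    (start, end) frame indices, end exclusive."""
    d = np.asarray(frame_is_music, dtype=int)
    edges = np.flatnonzero(np.diff(np.concatenate([[0], d, [0]])))
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])
            if e - s >= min_run]

# raw frame-level decisions with a 1-frame false alarm at index 9
raw = [0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0]
segs = music_segments(raw)   # the short run is discarded
```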

A Study on Image Retrieval Using Sound Classifier (사운드 분류기를 이용한 영상검색에 관한 연구)

  • Kim, Seung-Han;Lee, Myeong-Sun;Roh, Seung-Yong
    • Proceedings of the KIEE Conference
    • /
    • 2006.10c
    • /
    • pp.419-421
    • /
    • 2006
  • Automatic discrimination of image data has emerged as a research topic in recent years. We used a feedforward neural network as a classifier operating on sound features extracted from within the image data; our initial tests have shown encouraging results that indicate the viability of our approach.

  • PDF