Search | Korea Science

Classification of TV Program Scenes Based on Audio Information

Lee, Kang-Kyu;Yoon, Won-Jung;Park, Kyu-Sik
- The Journal of the Acoustical Society of Korea / v.23 no.3E / pp.91-97 / 2004
In this paper, we propose a system that classifies TV program scenes based on audio information. The system classifies each video scene into one of six categories: commercials, basketball games, football games, news reports, weather forecasts, and music videos. Two types of audio features are extracted from each audio frame, timbral features and coefficient-domain features, which together form a 58-dimensional feature vector. To reduce the computational complexity of the system, the 58-dimensional feature set is further reduced to 10 dimensions through the Sequential Forward Selection (SFS) method. This reduced feature set is then used to train and classify the given TV program scenes using k-NN and Gaussian pattern-matching algorithms. The classification accuracy of 91.6% reported here shows the promise of video scene classification based on audio information. Finally, the system stability problem arising from different query lengths is investigated.

A Study on Background Speaker Selection Method in Speaker Verification System (화자인증 시스템에서 선정 방법에 관한 연구)

Choi, Hong-Sub
- Speech Sciences / v.9 no.2 / pp.135-146 / 2002
Generally, a speaker verification system improves its recognition rate by normalizing the log-likelihood ratio using the model of the speaker to be verified and its background speaker model. The speaker-based cohort method is widely used for selecting the background speaker model. Recently, a Gaussian-based cohort model has been suggested as a virtually synthesized cohort model. Unlike a speaker-based model, this method chooses only the probability distributions close to the target speaker's probability distribution from among several neighboring speakers' probability distributions and synthesizes a new virtual speaker model from them. It shows better results than the existing speaker-based method. This study compared the existing speaker-based background speaker models with virtual speaker models and then constructed new virtual background speaker model groups that combine them in a certain ratio. For this purpose, we constructed a speaker verification system that uses a GMM (Gaussian Mixture Model) and found that the suggested method of selecting the virtual background speaker model improves performance.

Livestock Theft Detection System Using Skeleton Feature and Color Similarity (골격 특징 및 색상 유사도를 이용한 가축 도난 감지 시스템)

Kim, Jun Hyoung;Joo, Yung Hoon
- The Transactions of The Korean Institute of Electrical Engineers / v.67 no.4 / pp.586-594 / 2018
In this paper, we propose a livestock theft detection system based on moving object classification and tracking. First, we extract moving objects using a GMM (Gaussian Mixture Model) and RGB background modeling. Second, we apply morphological operations to remove shadows and noise, and recognize moving objects through labeling. Third, the recognized moving objects are classified as human or livestock using skeletal features and a color similarity judgment. Fourth, for the classified moving objects, CAMShift (Continuously Adaptive Mean Shift) and a Kalman filter are used to perform tracking and overlap judgment, and the risk is assessed to generate a notification. Finally, several experiments demonstrate the feasibility and applicability of the proposed method.
https://doi.org/10.5370/KIEE.2018.67.4.586

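An Interfacing Method Using EMG Signals for a Rehabilitation Assistance System for the Elderly and Disabled (근전도신호를 이용한 노약자/장애인용 재활 보조시스템의 인터페이스기법)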

장영건;신철규;이은실;권장우;홍승홍
- Proceedings of the ESK Conference / 1997.04a / pp.107-113 / 1997
In this paper, a method for interfacing with a rehabilitation assistance system using bio-signals is proposed. Control with EMG signals has certain advantages in signal collection, but has some drawbacks in the function resolution of EMG signals because the data processing is not efficient. To improve function resolution and to increase the efficiency of EMG signal interfacing with the rehabilitation assistance system, a multi-layer perceptron, which is highly effective with static signals, and a hidden Markov model, for resolving dynamic signals, are fused together. In the proposed method, the direction and average speed of the rehabilitation assistance system are controlled by trajectory control and by the moving direction estimated from the fused model. In the experiment, the proposed GMM and 2-level MLP hybrid classifier yielded an 8.6% perception-error rate, improving function resolution. A new acceleration control method constructed with 3 nested linear filters produced continuous acceleration paths without information about the destination point. Thus, the mass output caused by non-continuous acceleration-deceleration was eliminated. In the simulation, the necessary calculation, in terms of multiplications, was reduced by 11.54%.

APPLICATION OF GIANT MAGNETOSTRICTIVE MATERIAL TO DISC BRAKE ACTUATOR

OGAWA, Yutaka;MURATA, Yukio;KAWASE, Kazuo;WAKIWAKA, Hiroyuki;MIZUNO, Tsutomu;YAMADA, Hajime
- Proceedings of the KIPE Conference / 1998.10a / pp.560-563 / 1998
For the next-generation railway brake system, a disc brake that can be operated directly and electrically is strongly desired. This paper deals with a newly developed disc brake actuator using giant magnetostrictive materials (GMM) that can be integrated with the disc brake. Regarding the brake system performance, a better delay time was also attained, which will contribute to shortening the stopping distance.

Classification of Phornographic Videos Based on the Audio Information (오디오 신호에 기반한 음란 동영상 판별)

Kim, Bong-Wan;Choi, Dae-Lim;Lee, Yong-Ju
- MALSORI / no.63 / pp.139-151 / 2007
As the Internet becomes prevalent in our lives, harmful content such as pornographic videos has been increasing on the Internet, which has become a very serious problem. To address this, there are many filtering systems, based mainly on keyword- or image-based methods. The main purpose of this paper is to devise a system that classifies pornographic videos based on audio information. As the feature vector, we use the mel-frequency cepstral coefficients (MFCC) together with the mel-cepstrum modulation energy (MCME), a modulation energy calculated on the time trajectory of the MFCC. For the classifier, we use the well-known Gaussian mixture model (GMM). The experimental results showed that the proposed system correctly classified 98.3% of pornographic data and 99.8% of non-pornographic data. We expect that the proposed method can be applied to a more accurate classification system that uses both video and audio information.

Combination of Classifiers Decisions for Multilingual Speaker Identification

Nagaraja, B.G.;Jayanna, H.S.
- Journal of Information Processing Systems / v.13 no.4 / pp.928-940 / 2017
State-of-the-art speaker recognition systems may work better for the English language. However, if the same system is used to recognize speakers of other languages, it may perform poorly. In this work, the decisions of a Gaussian mixture model-universal background model (GMM-UBM) and a learning vector quantization (LVQ) classifier are combined to improve the recognition performance of a multilingual speaker identification system. The difference between these classifiers is in their modeling techniques: the former is based on a probabilistic approach and the latter on the fine-tuning of neurons. Since the approaches are different, each modeling technique identifies a different set of speakers for the same database. Therefore, the decisions of the classifiers may be combined to improve performance. In this study, multitaper mel-frequency cepstral coefficients (MFCCs) are used as the features, and monolingual and cross-lingual speaker identification studies are conducted using NIST-2003 and our own database. The experimental results show that the combined system improves the performance by nearly 10% compared with that of the individual classifiers.
https://doi.org/10.3745/JIPS.02.0025

An Acoustic Event Detection Method in Tunnels Using Non-negative Tensor Factorization and Hidden Markov Model (비음수 텐서 분해와 은닉 마코프 모델을 이용한 터널 환경에서의 음향 사고 검지 방법)

Kim, Nam Kyun;Jeon, Kwang Myung;Kim, Hong Kook
- Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.9 / pp.265-273 / 2018
In this paper, we propose an acoustic event detection method for tunnels using non-negative tensor factorization (NTF) and a hidden Markov model (HMM) applied to multi-channel audio signals. Incidents in tunnels are inherent to the system and occur unavoidably with known probability. Incidents can easily begin as minor accidents and escalate into major disasters. Most incident detection systems deploy visual incident detection (VID) systems, which often cause false alarms due to various constraints such as night obstacles and a limited viewing angle. To address this, the proposed method first tries to separate and detect every acoustic event, which is assumed to be an in-tunnel incident, from noisy acoustic signals by using an NTF technique. Then, maximum likelihood estimation using Gaussian mixture model (GMM)-HMMs is carried out to verify whether each detected event is an actual incident. Performance evaluation shows that the proposed method operates in real time and achieves high detection accuracy under simulated tunnel conditions.
https://doi.org/10.21742/AJMAHS.2018.09.66

A Neuro-Fuzzy System Modeling using Gaussian Mixture Model and Clustering Method (GMM과 클러스터링 기법에 의한 뉴로-퍼지 시스템 모델링)

Kim, Sung-Suk;Kwak, Keun-Chang;Ryu, Jeong-Woong;Chun, Myung-Geun
- Journal of the Korean Institute of Intelligent Systems / v.12 no.6 / pp.571-576 / 2002
There have been many studies on improving the performance of neuro-fuzzy systems. Studies on neuro-fuzzy modeling have largely been devoted to two approaches: the first is to improve the performance index of the system, and the other is to reduce the structure size. In spite of their satisfactory results, these approaches are difficult to extend to high-dimensional input or to a larger number of membership functions. We propose a novel neuro-fuzzy system based on an efficient clustering method for initializing the parameters of the premise part. It is a very useful method that maintains a small number of rules while improving performance. It combines several algorithms to improve performance. The Expectation-Maximization algorithm for the Gaussian mixture model is an efficient method for estimating the unknown parameters of a mixture model. The obtained parameters are used for the fuzzy clustering method. The proposed method satisfies these two requirements using the Gaussian mixture model and neuro-fuzzy modeling. Experimental results indicate that the proposed method is capable of giving reliable performance.
https://doi.org/10.5391/JKIIS.2002.12.6.571

An Implementation of Automatic Genre Classification System for Korean Traditional Music (한국 전통음악 (국악)에 대한 자동 장르 분류 시스템 구현)

Lee Kang-Kyu;Yoon Won-Jung;Park Kyu-Sik
- The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.29-37 / 2005
This paper proposes an automatic genre classification system for Korean traditional music. The proposed system accepts queried input music and classifies it, based on its music content, as one of six musical genres: Royal Shrine Music, Classical Chamber Music, Folk Song, Folk Music, Buddhist Music, and Shamanist Music. In general, content-based music genre classification consists of two stages: music feature vector extraction and pattern classification. For feature extraction, the system extracts 58-dimensional feature vectors including spectral centroid, spectral rolloff, and spectral flux based on the STFT, as well as coefficient-domain features such as LPC and MFCC, and these features are then further optimized using the SFS method. For pattern or genre classification, k-NN, Gaussian, GMM, and SVM algorithms are considered. In addition, the proposed system adopts the MFC method to resolve the uncertainty in system performance caused by different query patterns (or portions). From the experimental results, we verify successful genre classification performance of over 97% for both the k-NN and SVM classifiers; however, the SVM classifier classifies almost three times faster than the k-NN.

Search Result 193, Processing Time 0.027 seconds
