Search | Korea Science

Voice Activity Detection based on Adaptive Band-Partitioning using the Likelihood Ratio (우도비를 이용한 적응 밴드 분할 기반의 음성 검출기)

Kim, Sang-Kyun;Shim, Hyeon-Min;Lee, Sangmin
- Journal of Korea Multimedia Society / v.17 no.9 / pp.1064-1069 / 2014
In this paper, we propose a novel approach to improve the performance of a voice activity detector (VAD) based on adaptive band-partitioning with the likelihood ratio (LR). The previous adaptive band-partitioning method uses weights derived from the variance of the spectrum. In our VAD algorithm, the weights are derived from the LR and then incorporated into the entropy. The proposed algorithm detects voice activity by comparing the weighted entropy with an adaptive threshold. Experimental results show that the proposed algorithm yields better results than conventional VAD algorithms. In particular, it shows a marked improvement in non-stationary noise environments.
https://doi.org/10.9717/kmms.2014.17.9.1064

A Fast Motion Estimation Algorithm using Adaptive Search According to Importance of Search Ranges (탐색영역의 중요도에 따라 적응적인 탐색을 이용한 고속 움직임 예측 알고리즘)

Kim, Tae Hwan;Kim, Jong Nam;Jeong, Shin Il
- Journal of Korea Multimedia Society / v.18 no.4 / pp.437-442 / 2015
Voice activity detection is a very important process that separates voice activity from a noisy speech signal for speech enhancement. Over the past few years, many studies have addressed voice activity detection, but performance remains poor in low signal-to-noise ratio environments and under fluctuating noise such as car noise. In this paper, we propose a new voice activity detection algorithm using ensemble variance based on wavelet band entropy and a soft thresholding method. We evaluated the proposed algorithm under car noise at a range of signal-to-noise ratios and confirmed its performance.
https://doi.org/10.9717/kmms.2015.18.4.437

Background Gamma-Ray Control for Simultaneous Measurement of Radium and Radon in Groundwater (지하수의 라듐/라돈 동시측정을 위한 백그라운드 감마선 제어)

Lee Gil-Yong;Yun Yun-Yeol;Jo Su-Yeong;Kim Yong-Je
- Proceedings of the Korean Society of Soil and Groundwater Environment Conference / 2005.04a / pp.308-311 / 2005
$^{222}$Rn and $^{226}$Ra in groundwater were determined simultaneously using gamma spectrometry. Nitrogen flushing equipment was used to eliminate and stabilize the high and unstable background activity caused by radon and its progeny in the counting shield and room. The aim of the present work was to control the background activity for simultaneous measurement of radium ($^{226}$Ra) and radon ($^{222}$Rn) in groundwater using gamma spectrometry. Without control, the background activity was about 1.0 dps with a standard deviation of about 50%. With the nitrogen flushing equipment, the background activity could be reduced to the range of 0.1 to 0.5, and the RSD was about 5% under the experimental conditions. The detection limit of $^{222}$Rn and $^{226}$Ra in groundwater was 0.5 dps/L with the background control method. In most of the groundwater samples used in this work, radon activity exceeded the detection limit. However, radium activity in some groundwater samples was below the detection limit. If low-level radium in groundwater must be measured, a preconcentration step should be performed before measurement.

Voice Activity Detection Based on Non-negative Matrix Factorization (비음수 행렬 인수분해 기반의 음성검출 알고리즘)

Kang, Sang-Ick;Chang, Joon-Hyuk
- The Journal of Korean Institute of Communications and Information Sciences / v.35 no.8C / pp.661-666 / 2010
In this paper, we apply a likelihood ratio test (LRT) to non-negative matrix factorization (NMF) based voice activity detection (VAD) to find the optimal threshold. In our approach, the NMF-based VAD is expressed as the Euclidean distance between the noise basis vector and the input basis vector, both of which are extracted through NMF. The optimal threshold for each noise environment depends on the distribution of the NMF results in the noise region, which is estimated by a statistical model-based VAD. According to the experimental results, the proposed approach is effective for statistical model-based VAD using the LRT.
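As a rough illustration of the likelihood ratio test that statistical model-based VADs commonly use, the sketch below scores one frame under complex Gaussian models for speech and noise. It is not the NMF-based method of the paper above: the function names, the default threshold of 0.0, and the assumption that per-bin noise and speech variances are already known are all hypothetical choices made for this example (in practice those variances have to be estimated).

```python
import math

def log_likelihood_ratios(power, noise_var, speech_var):
    """Per-bin log likelihood ratio of 'speech + noise' versus 'noise only'.

    Assumes zero-mean complex Gaussian models in every frequency bin.
    """
    llrs = []
    for p, lam_n, lam_s in zip(power, noise_var, speech_var):
        xi = lam_s / lam_n      # a priori SNR (assumed known here)
        gamma = p / lam_n       # a posteriori SNR of the observed bin
        llrs.append(gamma * xi / (1.0 + xi) - math.log(1.0 + xi))
    return llrs

def lrt_decision(power, noise_var, speech_var, threshold=0.0):
    """Return (is_speech, score); score is the mean per-bin log LR."""
    llrs = log_likelihood_ratios(power, noise_var, speech_var)
    score = sum(llrs) / len(llrs)   # log of the geometric mean of the LRs
    return score > threshold, score
```

A frame is labeled speech when the averaged log likelihood ratio exceeds the threshold; choosing that threshold per noise environment is the problem the abstract addresses.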

Janus - Multi Source Event Detection and Collection System for Effective Surveillance of Criminal Activity

Shahabi, Cyrus;Kim, Seon Ho;Nocera, Luciano;Constantinou, Giorgos;Lu, Ying;Cai, Yinghao;Medioni, Gerard;Nevatia, Ramakant;Banaei-Kashani, Farnoush
- Journal of Information Processing Systems / v.10 no.1 / pp.1-22 / 2014
Recent technological advances provide the opportunity to use large amounts of multimedia data from a multitude of sensors with different modalities (e.g., video, text) for the detection and characterization of criminal activity. Their integration can compensate for sensor and modality deficiencies by using data from other available sensors and modalities. However, building such an integrated system at the scale of neighborhoods and cities is challenging due to the large amount of data to be considered and the need to ensure a short response time to potential criminal activity. In this paper, we present a system that enables multi-modal data collection at scale and automates the detection of events of interest for the surveillance and reconnaissance of criminal activity. The proposed system showcases novel analytical tools that fuse multimedia data streams to automatically detect and identify specific criminal events and activities. More specifically, the system detects and analyzes series of incidents in the spatiotemporal domain to extract events, where an incident is an occurrence or artifact relevant to a criminal activity extracted from a single media stream and an event is an actual instance of a criminal event. It cross-references multimodal media streams and incidents in time and space to provide a comprehensive view to a human operator while avoiding information overload. We present several case studies that demonstrate how the proposed system can provide law enforcement personnel with forensic and real-time tools to identify and track potential criminal activity.
https://doi.org/10.3745/JIPS.2014.10.1.001

Robust Voice Activity Detection in Noisy Environment Using Entropy and Harmonics Detection (엔트로피와 하모닉 검출을 이용한 잡음환경에 강인한 음성검출)

Choi, Gab-Keun;Kim, Soon-Hyob
- Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.1 / pp.169-174 / 2010
This paper describes an end-point detection method for improving speech recognition rates. The proposed method distinguishes speech and non-speech regions using the entropy and the harmonic detection of speech. End-point detection using the entropy of the speech spectral energy performs well in high-SNR environments (SNR 15 dB). In low-SNR environments (SNR 0 dB), however, the threshold level between speech and noise varies, so precise end-point detection is difficult. Therefore, this paper introduces an end-point detection method that uses both speech spectral entropy and harmonics. Experiments show better performance than conventional entropy methods.
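The spectral entropy feature mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the general idea (normalize a frame's power spectrum into a probability distribution and take its Shannon entropy), not the paper's implementation; the function names, the naive DFT, and the `eps` smoothing constant are assumptions made for this example.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power) + eps            # eps guards against all-zero frames
    probs = [p / total for p in power]
    return -sum(p * math.log(p + eps) for p in probs)
```

A frame whose energy is concentrated in a few bins, such as a pure tone or a strongly harmonic voiced sound, gives a low entropy, while a noise-like frame with a flat spectrum gives a high one. An entropy-based detector compares this value against a threshold, which is the step the abstract reports as unreliable at low SNR.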

Telomerase Activity in Benign and Malignant Thyroid Diseases (갑상선 결절의 Telomerase 활성도에 대한 분석)

Park Cheong-Soo;Chung Woong-Youn;Lee Mi-Kyung;Chang Hang-Suk
- Korean Journal of Head & Neck Oncology / v.14 no.2 / pp.199-205 / 1998
Objective: Telomerase, a specialized ribonucleoprotein polymerase associated with cellular immortality, is expressed by most malignant cells and is inactive in most normal somatic cells. Assays of telomerase activity in various tumors have provided both diagnostic and prognostic information. This study was carried out to determine whether telomerase activity could be useful in distinguishing benign and malignant thyroid diseases. Materials & Methods: Telomerase activity was determined using the Oncor TRAPeze™ ELISA Telomerase Detection Kit, which performs a PCR-based telomeric repeat amplification protocol (TRAP) assay followed by ELISA detection, in both normal and tumor tissues of 23 adenomatous hyperplasias, 12 follicular adenomas, 4 follicular carcinomas, 16 papillary carcinomas, 4 cases of Hashimoto's thyroiditis, and 3 malignant lymphomas. We also examined all cases microscopically to review the status of lymphoid infiltrates. Results: Of the 62 cases, 20 tumor tissues contained extensive lymphoid infiltrates (4 cases of Hashimoto's thyroiditis, 3 malignant lymphomas, 6 adenomatous hyperplasias, and 7 papillary carcinomas), and all of them showed positive telomerase activity. None of the normal tissues without lymphoid infiltrates (n=43) expressed telomerase activity. Of the 42 tumor tissues without lymphoid infiltrates, 37 (88.0%) showed positive telomerase activity: 13 of 17 adenomatous hyperplasias (76.5%), 11 of 12 follicular adenomas (91.7%), 4 of 4 follicular carcinomas (100.0%), and 9 of 9 papillary carcinomas (100.0%). Conclusions: Our methods showed high sensitivity in the detection of telomerase activity, and the exclusion of lymphoid infiltrates may be important in the telomerase assay. In our work, the measurement of telomerase activity was not useful in distinguishing benign and malignant thyroid diseases.

Applying the Bi-level HMM for Robust Voice-activity Detection

Hwang, Yongwon;Jeong, Mun-Ho;Oh, Sang-Rok;Kim, Il-Hwan
- Journal of Electrical Engineering and Technology / v.12 no.1 / pp.373-377 / 2017
This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to rely on post-processing to remove burst clippings from the VAD output decision. To tackle this problem, we build on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), and formulate a robust VAD method that requires no additional post-processing. In this method, a forward-inference-ratio test is devised to detect the speech endpoints, and Mel-frequency cepstral coefficients (MFCC) are used as the features. Our experimental results show that the proposed approach outperforms conventional methods across different SNRs.
https://doi.org/10.5370/JEET.2017.12.1.373

Robust Feature Extraction for Voice Activity Detection in Nonstationary Noisy Environments (음성구간검출을 위한 비정상성 잡음에 강인한 특징 추출)

Hong, Jungpyo;Park, Sangjun;Jeong, Sangbae;Hahn, Minsoo
- Phonetics and Speech Sciences / v.5 no.1 / pp.11-16 / 2013
This paper proposes robust feature extraction for accurate voice activity detection (VAD). VAD is one of the principal modules for speech signal processing such as speech codec, speech enhancement, and speech recognition. Noisy environments contain nonstationary noises causing the accuracy of the VAD to drastically decline because the fluctuation of features in the noise intervals results in increased false alarm rates. In this paper, in order to improve the VAD performance, harmonic-weighted energy is proposed. This feature extraction method focuses on voiced speech intervals and weighted harmonic-to-noise ratios to determine the amount of the harmonicity to frame energy. For performance evaluation, the receiver operating characteristic curves and equal error rate are measured.
https://doi.org/10.13064/KSSS.2013.5.1.011

Voice Activity Detection Based on Signal Energy and Entropy-difference in Noisy Environments (엔트로피 차와 신호의 에너지에 기반한 잡음환경에서의 음성검출)

Ha, Dong-Gyung;Cho, Seok-Je;Jin, Gang-Gyoo;Shin, Ok-Keun
- Journal of Advanced Marine Engineering and Technology / v.32 no.5 / pp.768-774 / 2008
In many areas of speech signal processing, such as automatic speech recognition and packet-based voice communication, VAD (voice activity detection) plays an important role in the performance of the overall system. In this paper, we present a new feature parameter for VAD, which is the product of the energy of the signal and the difference between two types of entropy. To this end, we first define a Mel filter-bank based entropy and calculate its difference from the conventional entropy in the frequency domain. The difference is then multiplied by the spectral energy of the signal to yield the final feature parameter, which we call PEED (product of energy and entropy difference). Through experiments, we verified that the proposed VAD parameter is more efficient than the conventional spectral entropy based parameter across various SNRs and noisy environments.
https://doi.org/10.5916/jkosme.2008.32.5.768
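The PEED parameter described in the abstract above (spectral energy multiplied by the difference between a Mel filter-bank entropy and the conventional spectral entropy) might be sketched as below. The abstract does not give the filter-bank design, the sign convention of the difference, or any normalization, so the triangular filters, the choice of 8 filters, the order of subtraction, and all function names here are assumptions for illustration only.

```python
import math

def entropy(values, eps=1e-12):
    """Shannon entropy (nats) of non-negative values normalized to sum to 1."""
    total = sum(values) + eps
    return -sum((v / total) * math.log(v / total + eps) for v in values)

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_energies(power, sample_rate, n_filters=8):
    """Energies of triangular filters spaced evenly on the Mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # Filter edges expressed in (fractional) bin indices
    edges = [mel_to_hz(top * i / (n_filters + 1)) / nyquist * (len(power) - 1)
             for i in range(n_filters + 2)]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        band = 0.0
        for k, p in enumerate(power):
            if lo < k <= mid:
                band += p * (k - lo) / (mid - lo)    # rising slope
            elif mid < k < hi:
                band += p * (hi - k) / (hi - mid)    # falling slope
        energies.append(band)
    return energies

def peed(power, sample_rate, n_filters=8):
    """Energy times (Mel filter-bank entropy minus spectral entropy)."""
    energy = sum(power)
    mel_entropy = entropy(mel_band_energies(power, sample_rate, n_filters))
    return energy * (mel_entropy - entropy(power))
```

Here `power` is the power spectrum of one frame over the non-negative frequency bins. Because both entropies are computed on normalized distributions, the feature scales linearly with frame energy, which is what lets the energy term weight the entropy difference.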

Search Result 1,134, Processing Time 0.026 seconds
