• Title/Summary/Keyword: Frame-based likelihood

A Study on the Context-dependent Speaker Recognition Adopting the Method of Weighting the Frame-based Likelihood Using SNR (SNR을 이용한 프레임별 유사도 가중방법을 적용한 문맥종속 화자인식에 관한 연구)

  • Choi, Hong-Sub
    • MALSORI / no.61 / pp.113-123 / 2007
  • Environmental mismatch between training and testing is generally considered the critical factor behind performance degradation in speaker recognition systems. In particular, speaker recognition systems try to train the speaker model on speech that is as clean as possible, but in the real testing phase the speech is corrupted by environmental and channel noise. To cope with this problem, this paper proposes a new method that weights the frame-based likelihood according to the frame SNR, making use of the strong correlation between speech SNR and speaker discrimination rate. To verify its usefulness, the proposed method is applied to a context-dependent speaker identification system. Experimental results on the cellular phone speech DB designed by ETRI for Korean speaker recognition show that the proposed method is effective and increases identification accuracy by up to 11%.

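The SNR-based weighting described in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting function, so the linear clip-and-normalize mapping, the function names, and the dB limits below are all assumptions made for illustration, not the paper's method.

```python
def snr_weights(frame_snr_db, floor_db=0.0, ceil_db=30.0):
    """Map per-frame SNR (dB) to weights that sum to 1.

    Assumption: a linear mapping clipped to [floor_db, ceil_db]; the paper's
    actual weighting function is not stated in the abstract.
    """
    clipped = [min(max(s, floor_db), ceil_db) - floor_db for s in frame_snr_db]
    total = sum(clipped)
    if total == 0.0:
        # Every frame is at or below the floor: fall back to uniform weights.
        return [1.0 / len(clipped)] * len(clipped)
    return [c / total for c in clipped]


def weighted_score(frame_loglik, weights):
    """Weighted sum of per-frame log-likelihoods for one speaker model."""
    return sum(w * ll for w, ll in zip(weights, frame_loglik))


def identify(frame_loglik_by_speaker, frame_snr_db):
    """Return the speaker whose model has the highest SNR-weighted score."""
    weights = snr_weights(frame_snr_db)
    scores = {
        spk: weighted_score(ll, weights)
        for spk, ll in frame_loglik_by_speaker.items()
    }
    return max(scores, key=scores.get)
```

With two frames at 20 dB and 0 dB, the second frame receives zero weight, so a speaker that only scores well on the noisy frame no longer wins.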

Hybrid Method using Frame Selection and Weighting Model Rank to improve Performance of Real-time Text-Independent Speaker Recognition System based on GMM (GMM 기반 실시간 문맥독립화자식별시스템의 성능향상을 위한 프레임선택 및 가중치를 이용한 Hybrid 방법)

  • 김민정;석수영;김광수;정호열;정현열
    • Journal of Korea Multimedia Society / v.5 no.5 / pp.512-522 / 2002
  • In this paper, we propose a hybrid method that combines frame selection with the weighting model rank method, based on the Gaussian mixture model (GMM), for a real-time text-independent speaker recognition system. In the system, maximum likelihood estimation is used for GMM parameter optimization, and maximum likelihood is used as the basic recognition criterion. The proposed hybrid method has two steps. First, a likelihood score is calculated at the frame level between each speaker model and the test data, and the difference between the largest and the second-largest likelihood values is computed; a frame is selected if this difference exceeds a threshold. Second, at each selected frame, a weighting value is used in place of the calculated likelihood when computing the total score. Cepstrum coefficients and regressive coefficients are used as feature parameters. The database for training and testing consists of data collected at different times, and the data for the experiments are selected randomly. In the experiments, we applied each method to the baseline system and tested it. In speaker recognition experiments, the proposed hybrid method achieves on average 4% higher recognition accuracy than the frame selection method and 1% higher than the W method, which indicates its effectiveness.

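The two steps described in this abstract (select frames by the gap between the best and second-best model, then accumulate rank-based weights instead of likelihoods) might look roughly like the sketch below. The function name, the use of log-likelihoods, and the concrete rank weights are assumptions; the abstract does not specify them.

```python
def hybrid_score(frame_loglik, threshold, rank_weights):
    """Accumulate rank-based weights over selected frames.

    frame_loglik: list of dicts, one per frame, mapping speaker -> log-likelihood.
    threshold:    minimum gap between the best and second-best score for a
                  frame to be selected (assumed to be in the log domain).
    rank_weights: weight added to the model ranked 1st, 2nd, ... in a selected
                  frame (hypothetical values chosen by the caller).
    """
    totals = {}
    for scores in frame_loglik:
        for spk in scores:
            totals.setdefault(spk, 0.0)
        ranked = sorted(scores, key=scores.get, reverse=True)
        if len(ranked) < 2:
            continue
        # Step 1: frame selection. Skip frames where the top two models are close.
        if scores[ranked[0]] - scores[ranked[1]] <= threshold:
            continue
        # Step 2: add the rank weight rather than the likelihood itself.
        for rank, spk in enumerate(ranked[:len(rank_weights)]):
            totals[spk] += rank_weights[rank]
    return totals
```

The identified speaker is then the key with the largest total, for example `max(totals, key=totals.get)`.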

Voice Activity Detection Based on Discriminative Weight Training with Feedback (궤환구조를 가지는 변별적 가중치 학습에 기반한 음성검출기)

  • Kang, Sang-Ick;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.8 / pp.443-449 / 2008
  • One of the key issues in practical speech processing is achieving voice activity detection (VAD) that is robust against background noise. Most statistical model-based approaches employ equally weighted likelihood ratios (LRs), which, however, deviate from real observations. Furthermore, voice activity in adjacent frames is strongly correlated; in other words, the current frame is highly correlated with the previous frame. In this paper, we propose an effective VAD approach based on the minimum classification error (MCE) method, which differs from previous work in that different weights are assigned to both the likelihood ratio of the current frame and the decision statistic of the previous frame.
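As a rough illustration of the feedback structure in this abstract, the sketch below combines the current frame's log-likelihood ratio with the previous frame's decision statistic. The linear combination, the function name, and the fixed weights are assumptions; in the paper the weights come from MCE training, which is not shown here.

```python
def vad_with_feedback(frame_llr, w_cur, w_prev, threshold):
    """Frame-wise speech/non-speech decisions with feedback.

    frame_llr: per-frame log-likelihood ratios (speech vs. noise).
    w_cur:     weight on the current frame's LLR (assumed given).
    w_prev:    weight on the previous frame's decision statistic (assumed given).
    Returns a list of booleans, True meaning "speech".
    """
    decisions = []
    prev_stat = 0.0  # no history before the first frame
    for llr in frame_llr:
        stat = w_cur * llr + w_prev * prev_stat
        decisions.append(stat > threshold)
        prev_stat = stat
    return decisions
```

Setting `w_prev` to zero reduces this to a plain per-frame threshold test, which makes the effect of the feedback term easy to compare.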

Performance Analysis Based On Log-Likelihood Ratio in Orthogonal Code Hopping Multiplexing Systems Using Multiple Antennas (다중 안테나를 사용한 직교 부호 도약 다중화 시스템에서 로그 우도비 기반 성능 분석)

  • Jung, Bang-Chul;Sung, Kil-Young;Shin, Won-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.12 / pp.2534-2542 / 2011
  • In this paper, we show that performance can be improved by using multiple antennas in the conventional orthogonal code hopping multiplexing (OCHM) scheme, which was proposed to accommodate, through downlink statistical multiplexing, more users with low channel activity than the number of orthogonal codewords available in code division multiple access (CDMA)-based communication systems. First, we introduce two different types of OCHM systems together with orthogonal codeword allocation strategies, and then derive mathematical expressions for the log-likelihood ratio (LLR) values under the two schemes. Next, for the case where a turbo encoder based on the LLR computation is used, we evaluate the frame error rate (FER) performance of the aforementioned OCHM systems. For comparison, we also show the performance of the existing symbol mapping method using multiple antennas, which was used in 3GPP standards. The results show that our OCHM system with multiple antennas, based on the proposed orthogonal codeword allocation strategy, achieves a performance gain over the conventional system: the energy required to satisfy a target FER is significantly reduced.

Seismic fragility analysis of wood frame building in hilly region

  • Ghosh, Swarup;Chakraborty, Subrata
    • Earthquakes and Structures / v.20 no.1 / pp.97-107 / 2021
  • A comprehensive study on the seismic performance of wood frame buildings in hilly regions is presented. Specifically, seismic fragility assessment of a typical wood frame building at various locations in the northeast region of India is demonstrated. A three-dimensional simplified model of the wood frame building is developed with due consideration of the nonlinear behaviour of shear walls under lateral loads. In doing so, a trilinear model with improved capability to capture the force-deformation behaviour of shear walls, including the strength degradation at higher deformations, is proposed. This improved capability is validated by comparison with existing experimental results. The structural demand values are obtained from nonlinear time history analysis (NLTHA) of the three-dimensional wood frame model, considering the uncertainty due to record-to-record variation of ground motions as well as structural parameters. The ground motion bins necessary for NLTHA are prepared based on the hazard level identified from probabilistic seismic hazard analysis of the considered locations. The maximum likelihood estimates of the lognormal fragility parameters are obtained from the observed failure cases, and the seismic fragilities corresponding to different locations are estimated accordingly. The results of the numerical study show that the wood frame constructions commonly found in the region are likely to suffer minor cracking or damage in the shear walls under an earthquake corresponding to the estimated seismic hazard level; however, such structures face negligible risk of complete collapse.
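The maximum likelihood step mentioned in this abstract can be sketched for a lognormal fragility curve, P(failure | IM = x) = Φ(ln(x/θ)/β), with a binomial likelihood over observed failure counts. The grid search, the function names, and the data layout are assumptions made to keep the example dependency-free; the paper's estimation procedure may differ.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fragility(im, theta, beta):
    """Lognormal fragility: probability of failure at intensity measure im."""
    return normal_cdf(math.log(im / theta) / beta)


def log_likelihood(data, theta, beta):
    """Binomial log-likelihood (constant terms dropped).

    data: list of (im_level, n_failures, n_analyses) tuples (hypothetical layout).
    """
    total = 0.0
    for im, fails, n in data:
        # Clamp to avoid log(0) at extreme probabilities.
        p = min(max(fragility(im, theta, beta), 1e-12), 1.0 - 1e-12)
        total += fails * math.log(p) + (n - fails) * math.log(1.0 - p)
    return total


def fit_fragility(data, thetas, betas):
    """Pick the (theta, beta) pair on a grid that maximizes the likelihood."""
    candidates = ((t, b) for t in thetas for b in betas)
    return max(candidates, key=lambda tb: log_likelihood(data, tb[0], tb[1]))
```

A grid search is used only for transparency; a numerical optimizer would normally replace it once the candidate ranges are known.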

Retrospective Maximum Likelihood Decision Rule for Tag Cognizance in RFID Networks (RFID 망에서 Tag 인식을 위한 회고풍의 최대 우도 결정 규칙)

  • Kim, Joon-Mo;Park, Jin-Kyung;Ha, Jun;Seo, Hee-Won;Choi, Cheon-Won
    • Journal of the Institute of Electronics Engineers of Korea TC / v.48 no.2 / pp.21-28 / 2011
  • We consider an RFID network configured as a star in which tags stationarily move into and out of the vicinity of the reader. To cognize the neighboring tags in the RFID network, we propose a scheme based on dynamic framed and slotted ALOHA, which dynamically determines the number of slots belonging to a frame. The tag cognizance scheme distinctively employs a rule for estimating the expected number of neighboring tags, referred to as the R-retrospective maximum likelihood rule, in which the observations obtained in the R previous frames are used to maximize the likelihood of the expected number of tags. Simulation results show that a slight increase in the depth of retrospect can significantly improve the cognizance performance.

Frame Reliability Weighting for Robust Speech Recognition (프레임 신뢰도 가중에 의한 강인한 음성인식)

  • 조훈영;김락용;오영환
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.323-329 / 2002
  • This paper proposes a frame reliability weighting method to compensate for time-selective noise, which occurs at random positions in a speech signal and contaminates certain parts of it. Speech frames have different degrees of reliability, and the reliability is proportional to the SNR (signal-to-noise ratio). While it is feasible to estimate the frame SNR by using noise information from non-speech intervals under stationary noise, it is difficult to obtain the noise spectrum for time-selective noise. Therefore, we used statistical models of clean speech to estimate the frame reliability. The proposed MFR (model-based frame reliability) approximates frame SNR values using filterbank energy vectors obtained by the inverse transformation of input MFCC (mel-frequency cepstral coefficient) vectors and the mean vectors of a reference model. Experiments on various burst noises revealed that the proposed method represents the frame reliability effectively. Recognition performance improved when MFR values were used as weighting factors at the likelihood calculation step.

Video-based Height Measurements of Multiple Moving Objects

  • Jiang, Mingxin;Wang, Hongyu;Qiu, Tianshuang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.9 / pp.3196-3210 / 2014
  • This paper presents a novel video metrology approach based on robust tracking. From videos acquired by an uncalibrated stationary camera, the foreground likelihood map is obtained using the Codebook background modeling algorithm, and multiple moving objects are tracked by a combined tracking algorithm. We then compute the vanishing line of the ground plane and the vertical vanishing point of the scene, and extract the head and feet feature points in each frame of the video sequence. Finally, we apply a single-view mensuration algorithm to each frame to obtain height measurements and fuse the multi-frame measurements using the RANSAC algorithm. Compared with other popular methods, the proposed algorithm does not require camera calibration and can track multiple moving objects when occlusion occurs. It therefore reduces computational complexity while improving measurement accuracy. The experimental results demonstrate that our method is effective and robust to occlusion.
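The final fusion step in this abstract, combining per-frame height measurements with RANSAC, can be sketched in one dimension. Treating the model as a single constant height, the function name, the tolerance, and the seeded random generator are all assumptions for illustration; the paper's formulation is not given in the abstract.

```python
import random


def ransac_height(measurements, tolerance, iterations=100, seed=0):
    """Fuse per-frame height measurements with a 1-D RANSAC.

    Each iteration samples one measurement as the hypothesis, collects the
    measurements within `tolerance` of it, and keeps the largest consensus set.
    The fused height is the mean of that set.
    """
    if not measurements:
        raise ValueError("measurements must not be empty")
    rng = random.Random(seed)  # seeded for reproducibility (an assumption)
    best_inliers = []
    for _ in range(iterations):
        candidate = rng.choice(measurements)
        inliers = [m for m in measurements if abs(m - candidate) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return sum(best_inliers) / len(best_inliers)
```

Measurements taken during occlusion or tracking failure tend to fall outside the consensus set, so they do not pull the fused estimate the way a plain average would.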

A Statistical Model-Based Voice Activity Detection Employing the Conditional MAP Criterion with Spectral Deviation (조건 사후 최대 확률과 음성 스펙트럼 변이 조건을 이용한 통계적 모델 기반의 음성 검출기)

  • Kim, Sang-Kyun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.30 no.6 / pp.324-329 / 2011
  • In this paper, we propose a novel approach to improve the performance of statistical model-based voice activity detection (VAD), based on the conditional maximum a posteriori (CMAP) criterion with spectral deviation. In our approach, the VAD decision rule is expressed as the geometric mean of likelihood ratios (LRs) compared against a threshold that is adapted according to the speech presence probability conditioned on both the speech activity decision and the spectral deviation in the previous frame. Experimental results show that the proposed approach yields better results than the CMAP-based VAD using the LR test.
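A simplified sketch of the decision rule in this abstract: the geometric mean of per-bin likelihood ratios is compared with a threshold that depends on the previous frame's decision. Conditioning on spectral deviation is omitted, and the function name and the two threshold parameters are assumptions rather than the paper's notation.

```python
import math


def cmap_vad(frame_lrs, eta_after_speech, eta_after_noise):
    """Frame-wise VAD using the geometric mean of likelihood ratios.

    frame_lrs:        list of frames, each a list of per-bin LRs (all > 0).
    eta_after_speech: threshold used when the previous frame was speech.
    eta_after_noise:  threshold used when the previous frame was noise.
    Returns a list of booleans, True meaning "speech".
    """
    decisions = []
    prev_speech = False  # assume the signal starts in noise
    for lrs in frame_lrs:
        # The log of the geometric mean equals the mean of the logs.
        log_gm = sum(math.log(lr) for lr in lrs) / len(lrs)
        eta = eta_after_speech if prev_speech else eta_after_noise
        prev_speech = log_gm > math.log(eta)
        decisions.append(prev_speech)
    return decisions
```

Using a lower threshold after a speech frame makes the detector less likely to drop out in the middle of an utterance, which is the intuition behind conditioning on the previous decision.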

Discriminative Training of Sequence Taggers via Local Feature Matching

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.14 no.3 / pp.209-215 / 2014
  • Sequence tagging is the task of predicting frame-wise labels for a given input sequence and has important applications in diverse domains. Conventional methods such as maximum likelihood (ML) learning match global features in the empirical and model distributions rather than local features, whose mismatch translates directly into frame-wise prediction errors. Recent probabilistic sequence models such as conditional random fields (CRFs) have achieved great success in a variety of situations. In this paper, we introduce a novel discriminative CRF learning algorithm that minimizes local feature mismatches. Unlike the overall data fitting that results from global feature matching in ML learning, our approach reduces the total error over all frames in a sequence. We also provide an efficient gradient-based learning method via gradient forward-backward recursion, which has the same computational complexity as ML learning. On several real-world sequence tagging problems, we empirically demonstrate that the proposed learning algorithm achieves significantly more accurate predictions than standard estimators.