• Title/Summary/Keyword: Frame-based likelihood

Statistical Model-Based Voice Activity Detection Based on Second-Order Conditional MAP with Soft Decision

  • Chang, Joon-Hyuk
    • ETRI Journal / v.34 no.2 / pp.184-189 / 2012
  • In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (CMAP) criterion. As a technical improvement over the first-order CMAP criterion in [1], we consider both the current observation and the voice activity decisions in the previous two frames to fully exploit the interframe correlation of voice activity. This differs from the previous approach [1] in that the second-order CMAP conditions on the voice activity decisions of the previous two frames, rather than the previous single frame, which yields four thresholds and an additional degree of freedom. A soft-decision scheme is also incorporated, resulting in time-varying thresholds for further performance improvement. Experimental results show that the proposed algorithm outperforms the conventional CMAP-based VAD technique under various experimental conditions.
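
As a rough illustration of the decision rule described in this abstract, the sketch below compares a per-frame likelihood ratio against one of four thresholds, chosen by the decisions made for the previous two frames. The function name `second_order_cmap_vad`, the threshold values, and the input likelihood ratios are all hypothetical. The paper's statistical model and its soft-decision (time-varying) thresholds are not reproduced here.

```python
def second_order_cmap_vad(likelihood_ratios, thresholds, initial=(0, 0)):
    """Hard-decision VAD whose threshold depends on the previous two decisions.

    likelihood_ratios: per-frame likelihood ratios (assumed to be precomputed).
    thresholds: dict mapping (decision[n-2], decision[n-1]) to a threshold.
    initial: assumed decisions for the two frames before the first frame.
    """
    prev2, prev1 = initial
    decisions = []
    for ratio in likelihood_ratios:
        # Select one of the four thresholds from the two most recent decisions.
        eta = thresholds[(prev2, prev1)]
        decision = 1 if ratio > eta else 0
        decisions.append(decision)
        prev2, prev1 = prev1, decision
    return decisions


# Hypothetical thresholds: lower when recent frames were judged to be speech.
THRESHOLDS = {(0, 0): 2.0, (0, 1): 1.2, (1, 0): 1.5, (1, 1): 0.8}

decisions = second_order_cmap_vad([0.5, 2.5, 1.3, 1.0, 0.7, 0.9], THRESHOLDS)
print(decisions)
```

Conditioning on two past decisions rather than one is what gives four thresholds instead of two: each combination of the two binary decisions selects its own threshold.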

Matching Pursuit Sinusoidal Modeling with Damping Factor (Damping 요소를 첨가한 매칭 퍼슈잇 정현파 모델링)

  • Jeong, Gyu-Hyeok;Kim, Jong-Hark;Lim, Joung-Woo;Joo, Gi-Ho;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.1 / pp.105-113 / 2007
  • In this paper, we propose matching pursuit with damping factors, a new sinusoidal model that improves on matching pursuit for codecs based on the sinusoidal model. The proposed model defines damping factors using the correlation of parameters between the current and adjacent frames, estimates the sinusoidal parameters of the analysis frame more accurately by applying matching pursuit with those damping factors, and then synthesizes the final signal. This makes efficient modeling possible without interpolation schemes. The proposed sinusoidal model gives better speech quality, without additional delay, than the conventional sinusoidal model with interpolation methods. We compare the performance of our model with that of matching pursuit using interpolation methods in terms of SNR (signal-to-noise ratio), MOS (Mean Opinion Score), LR (Itakura-Saito likelihood ratio), and CD (cepstral distance).

Theoretical Limits Analysis of Indoor Positioning System Using Visible Light and Image Sensor

  • Zhao, Xiang;Lin, Jiming
    • ETRI Journal / v.38 no.3 / pp.560-567 / 2016
  • To solve the problem of parameter optimization in image sensor-based visible light positioning systems, theoretical limits for both the location and the azimuth angle of the image sensor receiver (ISR) are calculated. In the case of a typical indoor scenario, maximum likelihood estimations for both the location and the azimuth angle of the ISR are first deduced. The Cramer-Rao Lower Bound (CRLB) is then derived, under the condition that the observation values of the image points are affected by white Gaussian noise. For typical parameters of LEDs and image sensors, simulation results show that accurate estimates for both the location and azimuth angle can be achieved, with positioning errors usually on the order of centimeters and azimuth angle errors being less than $1^{\circ}$. The estimation accuracy depends on the focal length of the lens and on the pixel size and frame rate of the ISR, as well as on the number of transmitters used.

Continuous Speech Recognition based on Parametric Trajectory Segmental HMM (모수적 궤적 기반의 분절 HMM을 이용한 연속 음성 인식)

  • 윤영선;오영환
    • The Journal of the Acoustical Society of Korea / v.19 no.3 / pp.35-44 / 2000
  • In this paper, we propose a new trajectory model for characterizing segmental features and their interaction, based on the general framework of hidden Markov models. Each segment, a sequence of vectors, is represented by a trajectory of observed sequences. This trajectory is obtained by applying a new design matrix that includes transitional information on contiguous frames, and is characterized as a polynomial regression function. To apply the trajectory to the segmental HMM, the frame features are replaced with the trajectory of a given segment. We also propose the likelihood of a given segment and the estimation of trajectory parameters. The observation probability of a given segment is represented as the relation between the segment likelihood and the estimation error of the trajectories. The estimation error of a trajectory is treated as the weight of the likelihood of a given segment in a state. This weight represents the probability of how well the corresponding trajectory characterizes the segment. The proposed model can be regarded as a generalization of a conventional HMM and a parametric trajectory model. Experimental results are reported on the TIMIT corpus, and performance is shown to improve significantly over that of the conventional HMM.

Robust Speech Endpoint Detection in Noisy Environments for HRI (Human-Robot Interface) (인간로봇 상호작용을 위한 잡음환경에 강인한 음성 끝점 검출 기법)

  • Park, Jin-Soo;Ko, Han-Seok
    • The Journal of the Acoustical Society of Korea / v.32 no.2 / pp.147-156 / 2013
  • In this paper, a new speech endpoint detection method for moving robot platforms in noisy environments is proposed. In the conventional method, the endpoint of speech is obtained by applying an edge detection filter that finds abrupt changes in the feature domain. However, because the frame energy feature is unstable in such noisy environments, it is difficult to find the endpoint of speech accurately. Therefore, a novel feature extraction method based on the twice-iterated fast Fourier transform (TIFFT) and statistical models of speech is proposed. The proposed feature extraction method was applied to an edge detection filter for effective detection of the endpoint of speech. Representative experiments show a substantial improvement over the conventional method.

A Speaker Pruning Method for Reducing Calculation Costs of Speaker Identification System (화자식별 시스템의 계산량 감소를 위한 화자 프루닝 방법)

  • 김민정;오세진;정호열;정현열
    • The Journal of the Acoustical Society of Korea / v.22 no.6 / pp.457-462 / 2003
  • In this paper, we propose a speaker pruning method for real-time processing and improved performance in a speaker identification system based on the GMM (Gaussian Mixture Model). In conventional speaker identification methods, such as ML (Maximum Likelihood), WMR (Weighting Model Rank), and MWMR (Modified WMR), frame likelihoods are calculated using all frames of each input speech and all of the speaker models, and the speaker with the largest accumulated likelihood is then selected. However, in these methods, calculation cost and processing time grow as the number of input frames and speakers increases. To solve this problem, the proposed method selects only the speaker models with higher likelihood using only part of the input frames, and the identified speaker is decided by evaluating the selected speaker models. In this method, fm can be applied to improve identification performance even when the number of speakers changes. In several experiments, the proposed method reduced calculation cost by 65% and increased the identification rate by 2% compared with conventional methods. These results mean that the proposed method can be applied effectively for real-time processing and for improving performance in speaker identification.
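
The two-stage idea in this abstract can be sketched as follows, assuming the per-frame log-likelihoods under each speaker model have already been computed (in the paper they would come from GMMs). The function name `identify_with_pruning`, the parameters `prune_frames` and `keep`, and the score table are hypothetical. This is a minimal illustration of pruning by partial accumulated likelihood, not the authors' exact procedure.

```python
def identify_with_pruning(frame_loglik, prune_frames, keep):
    """Pick a speaker after pruning unlikely models on an initial set of frames.

    frame_loglik[s][t]: log-likelihood of frame t under speaker model s
                        (assumed precomputed, e.g. by a GMM).
    prune_frames: number of leading frames used for the pruning stage.
    keep: number of speaker models that survive pruning.
    """
    num_frames = len(frame_loglik[0])
    prune_frames = min(prune_frames, num_frames)

    # Stage 1: score every speaker on the first `prune_frames` frames only.
    partial = {s: sum(ll[:prune_frames]) for s, ll in enumerate(frame_loglik)}
    survivors = sorted(partial, key=partial.get, reverse=True)[:keep]

    # Stage 2: accumulate the remaining frames for the survivors only.
    total = {s: partial[s] + sum(frame_loglik[s][prune_frames:])
             for s in survivors}
    best = max(total, key=total.get)
    return best, survivors


# Hypothetical log-likelihoods for 4 speaker models over 6 frames.
scores = [
    [-5.0, -5.2, -4.9, -5.1, -5.0, -5.3],
    [-3.0, -3.1, -2.9, -3.2, -3.0, -3.1],
    [-3.2, -2.8, -3.5, -3.6, -3.4, -3.5],
    [-6.0, -6.1, -5.9, -6.2, -6.0, -6.1],
]
best, survivors = identify_with_pruning(scores, prune_frames=2, keep=2)
print(best, survivors)
```

In this toy table, only two of the four models are scored on the last four frames, which is where the saving in calculation cost comes from.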

Graph-Based Framework for Global Registration (그래프에 기반한 전역적 정합 방법)

  • 김현우;홍기상
    • Proceedings of the IEEK Conference / 2000.09a / pp.671-674 / 2000
  • In this paper, we present a robust global registration algorithm for multi-frame image mosaics. When pair-wise registration recovers a projective transformation between two consecutive frames, severe mis-registration can appear among frames that are not consecutive, because concatenating those pair-wise transformations leads to global alignment errors. To overcome those mis-registrations, we propose a new algorithm that uses multiple frames to construct image mosaics. We use a graph to represent the temporal and spatial connectivity and show that global registration can be obtained by searching for an optimal path in the constructed graph. Defining an adequate objective function that characterizes the global registration allows direct manipulation of the graph. In the presence of moving objects, especially ones that are large compared with low-texture backgrounds, using the likelihood ratio as the objective function lets us deal with some of the most challenging videos, such as basketball or soccer. Moreover, the algorithm can be parallelized, so it can be implemented more efficiently. Finally, we give some experimental results from real videos.

A Scheme to Increase Throughput in Framed-ALOHA-Based RFID Systems with Capture

  • Oh, Sung-Youl;Jung, Sung-Hwan;Hong, Jung-Wan;Lie, Chang-Hoon
    • ETRI Journal / v.30 no.3 / pp.486-488 / 2008
  • In this paper, a scheme to increase the throughput of RFID systems is presented, which considers the capture effect in the context of the framed ALOHA protocol. Under a capture model in which the probability that one tag is identified successfully depends on the number of tags involved in the collision, two probabilistic methods for estimating the unknown number of tags are proposed. The first method is the maximum likelihood estimation method, and the second method is an approximate algorithm for reducing the computational time. The optimal frame size condition to maximize the system throughput by considering the capture effect is also presented.
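
A simplified sketch of maximum likelihood tag-count estimation in framed ALOHA is given below. It rests on several assumptions that are not taken from the paper: slots are treated as independent (a multinomial approximation), and the capture effect is reduced to a single constant `capture_prob`, whereas the abstract describes a capture probability that depends on the number of colliding tags. All function and parameter names are hypothetical.

```python
import math


def slot_probabilities(n, frame_size, capture_prob):
    """Per-slot probabilities of observing (empty, success, collision)."""
    p_pick = 1.0 / frame_size
    p_empty = (1.0 - p_pick) ** n
    p_single = n * p_pick * (1.0 - p_pick) ** (n - 1) if n >= 1 else 0.0
    p_multi = max(0.0, 1.0 - p_empty - p_single)
    # Assumption: a multi-tag slot is read successfully with constant probability.
    p_success = p_single + capture_prob * p_multi
    p_collision = (1.0 - capture_prob) * p_multi
    return p_empty, p_success, p_collision


def log_likelihood(n, counts, capture_prob):
    """Multinomial log-likelihood (up to a constant) of the slot counts."""
    probs = slot_probabilities(n, sum(counts), capture_prob)
    total = 0.0
    for count, p in zip(counts, probs):
        if count == 0:
            continue
        if p <= 0.0:
            return float("-inf")
        total += count * math.log(p)
    return total


def ml_tag_estimate(empty, success, collision, capture_prob=0.0, max_tags=None):
    """Return the tag count n that maximizes the approximate likelihood."""
    counts = (empty, success, collision)
    # Each success slot holds at least one tag, each collision slot at least two.
    lower = success + 2 * collision
    upper = max_tags if max_tags is not None else max(lower, 10 * sum(counts))
    return max(range(lower, upper + 1),
               key=lambda n: log_likelihood(n, counts, capture_prob))


# Hypothetical observation from a 128-slot frame.
n_hat = ml_tag_estimate(empty=58, success=46, collision=24)
print(n_hat)
```

The exhaustive search over `n` is the expensive part, which is consistent with the abstract's motivation for a second, approximate method that reduces computation time.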

Adaptive Iteration Schemes for Iterative Receivers in MIMO Systems (다중 안테나 반복 수신 시스템에서의 적응형 반복 결정 방법에 관한 연구)

  • Noh, Jeehwan;Kwon, Dongseung;Lee, Chungyong
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.5 / pp.3-8 / 2013
  • We consider adaptive iteration schemes that lower the complexity of the iterative receiver by eliminating unnecessary iterations. Whereas a conventional iterative receiver uses only a fixed number of iterations, we apply adaptive iteration schemes that take into account the quality of the received frame. Simulation results show that the proposed schemes reduce the average number of iterations while maintaining BER performance compared with the conventional scheme.

Reliability-Based Design Optimization Using Akaike Information Criterion for Discrete Information (이산정보의 아카이케 정보척도를 이용한 신뢰성 기반 최적설계)

  • Lim, Woo-Chul;Lee, Tae-Hee
    • Transactions of the Korean Society of Mechanical Engineers A / v.36 no.8 / pp.921-927 / 2012
  • Reliability-based design optimization (RBDO) can be used to determine the reliability of a system by means of probabilistic design criteria, i.e., the possibility of failure considering stochastic features of design variables and input parameters. To assure these criteria, various reliability analysis methods have been developed. Most of these methods assume that distribution functions are continuous. However, in real problems, because real data is often discrete in form, it is important to estimate the distributions for discrete information during reliability analysis. In this study, we employ the Akaike information criterion (AIC) method for reliability analysis to determine the best estimated distribution for discrete information and we suggest an RBDO method using AIC. Mathematical and engineering examples are illustrated to verify the proposed method.
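
To make the model-selection step concrete, the sketch below fits two candidate distributions to a small sample by maximum likelihood and picks the one with the lowest AIC, using the standard definition $\mathrm{AIC} = 2k - 2\ln\hat{L}$ for $k$ fitted parameters. The candidate set (normal and exponential), the sample values, and the function names are assumptions made for illustration. The abstract does not say which distributions the authors consider.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_params - 2 * log_likelihood


def normal_loglik(data):
    """Maximized log-likelihood of a normal fit (ML mean and variance)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def exponential_loglik(data):
    """Maximized log-likelihood of an exponential fit (rate = 1 / mean)."""
    n = len(data)
    rate = n / sum(data)
    return n * math.log(rate) - n


def select_distribution(data):
    """Return (best_name, scores), where scores maps candidate name to AIC."""
    scores = {
        "normal": aic(normal_loglik(data), num_params=2),
        "exponential": aic(exponential_loglik(data), num_params=1),
    }
    return min(scores, key=scores.get), scores


# Hypothetical discrete sample of a positive design variable.
sample = [9.8, 10.1, 10.4, 9.6, 10.0, 10.3, 9.9, 10.2]
best_name, aic_scores = select_distribution(sample)
print(best_name, aic_scores)
```

The penalty term `2 * num_params` is what lets candidates with different numbers of parameters be compared on the same scale.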