• Title/Summary/Keyword: markov models


Design of a MapReduce-Based Mobility Pattern Mining System for Next Place Prediction (다음 장소 예측을 위한 맵리듀스 기반의 이동 패턴 마이닝 시스템 설계)

  • Kim, Jongwhan;Lee, Seokjun;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.3 no.8 / pp.321-328 / 2014
  • In this paper, we present a MapReduce-based mobility pattern mining system that can efficiently predict the next place a mobile user will visit. It learns each user's mobility pattern model, represented as a Hidden Markov Model (HMM), from a large-scale trajectory dataset, and then predicts the next place the user will visit by applying the learned model to the current trajectory. Our system consists of two parts: the back-end part, in which the mobility pattern models are learned for individual users, and the front-end part, where the next place for a given user is predicted based on the mobility pattern models. While the back-end part comprises three distinct MapReduce modules for POI extraction, trajectory transformation, and mobility pattern model learning, the front-end part has two modules for candidate route generation and next place prediction. The map and reduce functions of each module were designed to make full use of the underlying Hadoop infrastructure and maximize parallel processing. We evaluated the proposed system on a large-scale open benchmark dataset, GeoLife, and the experimental results confirmed its high performance.
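
The prediction step described in this abstract can be illustrated with a minimal sketch. It assumes a small discrete HMM whose hidden states emit place indices. The function name `predict_next_place` and the toy parameters `pi`, `A`, and `B` are hypothetical and are not taken from the paper, and the MapReduce learning pipeline is not reproduced here.

```python
def predict_next_place(pi, A, B, observations):
    """Return P(next place | places observed so far) for a discrete HMM.

    pi: initial state probabilities, A: state transition matrix,
    B: emission matrix (rows = states, columns = place indices).
    All parameters are illustrative assumptions, not values from the paper.
    """
    n_states = len(pi)
    # Forward filtering: alpha[i] is P(state i | observations so far),
    # renormalized at every step to avoid numerical underflow.
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    total = sum(alpha)
    alpha = [a / total for a in alpha]
    for obs in observations[1:]:
        alpha = [
            B[j][obs] * sum(alpha[i] * A[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
    # Advance the state distribution one step, then marginalize over emissions.
    next_state = [
        sum(alpha[i] * A[i][j] for i in range(n_states)) for j in range(n_states)
    ]
    n_places = len(B[0])
    return [
        sum(next_state[j] * B[j][k] for j in range(n_states))
        for k in range(n_places)
    ]


# Toy model: 2 hidden states, 3 places (hypothetical numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.8, 0.1, 0.1], [0.1, 0.3, 0.6]]
next_place_dist = predict_next_place(pi, A, B, [0, 0, 2])
most_likely_place = max(range(len(next_place_dist)), key=next_place_dist.__getitem__)
```

The returned list is a probability distribution over places, so ranking candidate next places amounts to sorting it.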

Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.131-146 / 2022
  • Logit models are commonly used for predicting and classifying categorical response variables. Most Bayesian approaches to logit models are implemented with the Metropolis-Hastings algorithm. However, the algorithm suffers from slow convergence, and it is difficult to ensure an adequate proposal distribution. Therefore, we use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity. As a result, the logit model can be easily implemented by Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.

Ontology-based Automated Metadata Generation Considering Semantic Ambiguity (의미 중의성을 고려한 온톨로지 기반 메타데이타의 자동 생성)

  • Choi, Jung-Hwa;Park, Young-Tack
    • Journal of KIISE:Software and Applications / v.33 no.11 / pp.986-998 / 2006
  • There is a growing need for Semantic Web-based metadata that helps computers efficiently understand and manage the information that has multiplied with the growth of the Internet. However, semantically ambiguous information is inevitably encountered when metadata is generated, so a solution to this problem is needed. This paper proposes a new method for automated metadata generation that uses a probability model of consecutive words to identify the class concept to which an ambiguous word embedded in information such as a document is semantically most closely related. We consider ambiguities among the concepts defined in an ontology and use the Hidden Markov Model to recognize the parts of a named entity. First, we construct a Markov model for each class defined in the ontology to capture the named entities of that class. Next, we generate the appropriate context from a text to understand the meaning of a semantically ambiguous word, and resolve the ambiguity during metadata generation by searching for the Markov model that best fits the sequence of words in the context. We experimented with seven semantically ambiguous words extracted from computer science theses. The experimental results show that accuracy improved by about 18% compared with SemTag, which is known as an effective application for assigning a specific meaning to an ambiguous word based on its context.

Analysis of Hydrological Impact Using Climate Change Scenarios and the CA-Markov Technique on Soyanggang-dam Watershed (CA-Markov 기법을 이용한 기후변화에 따른 소양강댐 유역의 수문분석)

  • Lim, Hyuk-Jin;Kwon, Hyung-Joong;Bae, Deg-Hyo;Kim, Seong-Joon
    • Journal of Korea Water Resources Association / v.39 no.5 s.166 / pp.453-466 / 2006
  • The objective of this study was to analyze the changes in the hydrological environment of the Soyanggang-dam watershed due to climate change projections (for the years 2050 and 2100) simulated using CCCma CGCM2 based on SRES A2 and B2. The SRES A2 and B2 scenarios were used to estimate NDVI values for selected land uses from the NDVI-temperature relation, obtained by linear regression of observed data (1998-2002). Land use change based on SRES A2 and B2 was estimated at 5- and 10-year intervals using the CA-Markov technique, based on the 1985, 1990, 1995 and 2000 land cover maps classified from Landsat TM satellite images. As a result, the trend in land use change in each land class was reflected. When land use changes in the years 2050 and 2100 were simulated using the CA-Markov method, the forest class area declined while the urban, bare ground and grassland classes increased. When the simulation was extended further into the future, the transitions converged and no increasing trend was reflected. The impact on evapotranspiration was assessed by comparing the observed data with results computed under three hypothetical scenarios of meteorological data (temperature, global radiation and wind speed) using the FAO Penman-Monteith method. The results showed that the runoff was reduced by about 50% compared with the present hydrologic condition when each SRES scenario and period were compared. If there was no land use change, the runoff would decline further to about 3-5%.
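
The convergence noted in this abstract is what a Markov chain with a fixed transition matrix would be expected to produce. The sketch below covers only the Markov part of CA-Markov (the cellular automaton step that allocates change spatially is omitted). The function `project_land_use`, the three classes, and every number in it are hypothetical and not taken from the study.

```python
def project_land_use(areas, transition, steps):
    """Propagate land-class areas through a row-stochastic transition matrix.

    transition[i][j] is the assumed fraction of class i that becomes class j
    in one period; each row must sum to 1 so total area is conserved.
    """
    n = len(areas)
    current = list(areas)
    for _ in range(steps):
        current = [
            sum(current[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
    return current


# Hypothetical classes: forest, urban, grassland (areas in percent of watershed).
initial = [80.0, 5.0, 15.0]
transition = [
    [0.95, 0.03, 0.02],  # forest -> forest / urban / grassland
    [0.01, 0.98, 0.01],  # urban
    [0.05, 0.05, 0.90],  # grassland
]
after_one = project_land_use(initial, transition, 1)
far_future = project_land_use(initial, transition, 1000)
one_step_later = project_land_use(initial, transition, 1001)
```

With these toy numbers the forest share drops in the first period, and after many periods successive projections are practically identical, because the chain has reached its stationary distribution.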

A Study on Context-Dependent Acoustic Models to Improve the Performance of Korean Speech Recognition (한국어 음성인식 성능향상을 위한 문맥의존 음향모델에 관한 연구)

  • 황철준;오세진;김범국;정호열;정현열
    • Journal of the Institute of Convergence Signal Processing / v.2 no.4 / pp.9-15 / 2001
  • In this paper, we investigate context-dependent acoustic models to improve the performance of Korean speech recognition. The algorithms use Korean phonological rules and a decision tree. With the Successive State Splitting (SSS) algorithm, the Hidden Markov Network (HM-Net), an efficient representation of phoneme-context-dependent HMMs, can be generated automatically. SSS is a powerful technique for designing the topologies of tied-state HMMs, but it does not adequately handle contexts that are absent from the training phoneme contexts. In addition, it has some problems in the procedure of the contextual domain. In this paper, we adopt a new state-clustering algorithm for SSS, called Phonetic Decision Tree-based SSS (PDT-SSS), which includes context splits based on Korean phonological rules. This method combines the advantages of both decision tree clustering and SSS, and can generate a highly accurate HM-Net that can express any context. To verify the effectiveness of the adopted method, experiments were carried out using the KLE 452-word database and the YNU 200-sentence database. Through Korean phoneme, word and sentence recognition experiments, we showed that the new state-clustering algorithm produces better phoneme, word and continuous speech recognition accuracy than conventional HMMs.

Deterioration Prediction Model of Water Pipes Using Fuzzy Techniques (퍼지기법을 이용한 상수관로의 노후도예측 모델 연구)

  • Choi, Taeho;Choi, Min-ah;Lee, Hyundong;Koo, Jayong
    • Journal of Korean Society of Water and Wastewater / v.30 no.2 / pp.155-165 / 2016
  • Pipe Deterioration Prediction (PDP) and Pipe Failure Risk Prediction (PFRP) models were developed to predict the deterioration and failure risk of water mains using fuzzy techniques and the Markov process. These two models were used to determine repair and replacement priorities by predicting the deterioration degree, deterioration rate, failure possibility and remaining life in a study sample comprising 32 water mains. In an analysis based on conservative risk with a medium policy risk, the remaining life for 30 of the 32 water mains was less than 5 years for 2 mains (7%), 5-10 years for 8 (27%), 10-15 years for 7 (23%), 15-20 years for 5 (17%), 20-25 years for 5 (17%), and 25 years or more for 2 (7%).

Reliability Analysis Modeling of Communication Networks Considering Rerouting (재경로 설정을 고려한 통신망의 신뢰도 분석 모델링)

  • Ro, Cheul-Woo
    • The Journal of the Korea Contents Association / v.9 no.1 / pp.45-52 / 2009
  • In this paper, we develop queueing network models of communication networks together with a reliability model that considers link failures. The reliability of a communication network with a virtual connection exposed to link failures is analyzed. Stochastic Reward Nets (SRN) are an extension of stochastic Petri nets and provide compact modeling facilities for system analysis. To obtain the performance index, appropriate reward rates are assigned to the SRN. It is shown that SRN modeling is well suited to specifying, automatically generating and solving models for reliability under rerouting. Markov models using SRN are developed and solved to depict the various rerouting cases caused by link failures and to analyze reliability in communication networks.

HMM-Based Human Gait Recognition (HMM을 이용한 보행자 인식)

  • Sin Bong-Kee;Suk Heung-Il
    • Journal of KIISE:Software and Applications / v.33 no.5 / pp.499-507 / 2006
  • Recently, human gait has been considered a useful biometric for supporting high-performance human identification systems. This paper proposes a view-based pedestrian identification method using the dynamic silhouettes of a human body modeled with the Hidden Markov Model (HMM). Two types of gait models have been developed, both with an endless cycle architecture: one is a discrete HMM method using a self-organizing map-based VQ codebook, and the other is a continuous HMM method using feature vectors transformed into a PCA space. Experimental results showed a consistent performance trend over a range of model parameters and a recognition rate of up to 88.1%. Compared with other methods, the proposed models and techniques are believed to have sufficient potential for successful application to gait recognition.

A Study on Word Juncture Modeling for Continuous Speech Recognition of Korean Language (한국어 연속음성 인식을 위한 단어 결합 모델링에 관한 연구)

  • Choi, In-Jeong;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea / v.13 no.5 / pp.24-31 / 1994
  • In this paper, we study continuous speech recognition of the Korean language using acoustic models of word juncture coarticulation. To alleviate the performance degradation due to coarticulation, we use context-dependent units that model inter-word transitions in addition to intra-word transitions. In all cases, the initial phone of each word has to be specified for each possible final phone of the previous word, and similarly for the final phone of each word. To improve the robustness of the HMM parameters, the covariance matrix is smoothed. We also use position-dependent units to improve the discriminative power between units. Simulation results show that when the improved models of word juncture coarticulation are used, the recognition performance is considerably improved compared to the baseline system using only intra-word units.

Bayesian Variable Selection in the Proportional Hazard Model with Application to Microarray Data

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • Proceedings of the Korean Statistical Society Conference / 2005.05a / pp.17-23 / 2005
  • In this paper, we consider the well-known semiparametric proportional hazards models for survival analysis. These models are usually used with few covariates and many observations (subjects). However, in a typical setting of gene expression data from DNA microarrays, we need to consider the case where the number of covariates $p$ exceeds the number of samples $n$. For a given vector of response values, which are times to event (death or censored times), and $p$ gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when $n \ll p$. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov Chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data and Breast Carcinomas data.