• Title/Summary/Keyword: Gaussian mixture distribution clustering

Search results: 11

Regionalization using cluster probability model and copula based drought frequency analysis (클러스터 확률 모형에 의한 지역화와 코풀라에 의한 가뭄빈도분석)

  • Azam, Muhammad;Choi, Hyun Su;Kim, Hyeong San;Hwang, Ju Ha;Maeng, Seungjin
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.46-46 / 2017
  • The reliability of quantile estimates in regional drought frequency analysis is determined by the length of the historical record and the procedure used to delineate hydrologically homogeneous regions. Extreme droughts, however, occur very rarely, and the record length is often insufficient for reliable regional frequency analysis. In addition, Korea's complex topographic and climatic characteristics require a statistical procedure for delineating homogeneous regions. The regional frequency analysis applied in this study identifies homogeneous regions by analyzing the multivariate hydrometeorological characteristics of many sites, and delineates them by estimating quantiles for each homogeneous region from the key drought variables (duration and severity) jointly. We applied a clustering method based on the Gaussian Mixture Model (GMM) to delineate optimal homogeneous regions and validated the result with a likelihood-ratio test and other validation indices. To represent the parameters estimated by the GMM in a dimension-reduced space, the Gaussian Mixture Model Dimension Reduction (GMMDR) method was applied. These variables were then fitted and compared using various distributions and copulas for drought frequency analysis. As a result, Korea was divided into four homogeneous regions. The Pearson type III (PE3) distribution combined with the Gaussian and Frank copulas was found suitable for estimating the joint distribution of drought duration and severity in Korea.
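The GMM-based regionalization step can be sketched as follows; the two-dimensional synthetic features stand in for the paper's hydrometeorological variables, and scikit-learn's BIC-based model selection is an assumed proxy for the likelihood-ratio validation described above:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for per-site hydrometeorological features (the study uses
# real drought duration/severity statistics); four well-separated site groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [3, 3], [0, 3], [3, 0])])

# Fit GMMs with 1..6 components and keep the one minimizing BIC, a common
# model-selection criterion standing in for the paper's validation indices.
models = [GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in range(1, 7)]
best = min(models, key=lambda m: m.bic(X))
labels = best.predict(X)
```

Each resulting label set plays the role of one homogeneous region whose quantiles would then be estimated jointly.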


Construction of Onion Sentiment Dictionary using Cluster Analysis (군집분석을 이용한 양파 감성사전 구축)

  • Oh, Seungwon;Kim, Min Soo
    • Journal of the Korean Data Analysis Society / v.20 no.6 / pp.2917-2932 / 2018
  • Many studies have sought to develop production-forecasting models to resolve the supply imbalance of onions, a vegetable closely tied to Korean cuisine. Because onions can be stored, however, forecasting production alone is not enough to resolve the supply imbalance. This paper therefore builds a sentiment dictionary for predicting onion prices from internet news articles, which are easy to access in daily life and carry information about onion production and the various factors behind its price. Articles about onions from 2012 to 2016 were collected, and four TF-IDF variants were compared after classifying the documents by the wholesale price of onions. Words positive or negative for price were clustered with three partitional methods: k-means, DBSCAN (density-based spatial clustering of applications with noise), and GMM (Gaussian mixture model) clustering; GMM clustering produced three meaningful dictionaries. To assess the validity of the resulting dictionary, articles classified by price rises and drops were fed to a logistic regression, which showed 85.7% accuracy.

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.395-406 / 2017
  • The purpose of this study is to identify patterns of daily electricity demand through clustering and classification. Hourly data were collected by KPX (Korea Power Exchange) between 2008 and 2012. Because electricity demand is time series data, the time trend was removed before extracting daily demand patterns. We considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the optimal clustering method. A classification analysis was then conducted to understand the relationship with external factors: day of the week, holidays, and weather. The data were divided into training data, consisting of external factors and cluster assignments for 2008 to 2011, and test data, consisting of daily external factors for 2012. Decision trees, random forests, support vector machines, and naive Bayes were used. As a result, Gaussian model-based clustering with eight clusters, combined with a random forest, showed the best prediction performance.
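The two-stage pipeline above (model-based clustering of daily profiles, then classifying the cluster from external factors) can be sketched with synthetic data; the single day-type feature and the profile shapes below are illustrative assumptions, not the study's actual inputs:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
hours = np.arange(24)
# Two synthetic daily demand shapes: an evening-peak weekday and a flat weekend.
weekday = 1.0 + 0.8 * np.exp(-((hours - 19) ** 2) / 8.0)
weekend = np.full(24, 1.2)
profiles, day_type = [], []
for day in range(365):
    w = day % 7 >= 5
    profiles.append((weekend if w else weekday) + rng.normal(0, 0.05, 24))
    day_type.append(int(w))
X = np.array(profiles)
F = np.array(day_type).reshape(-1, 1)

# Stage 1: Gaussian model-based clustering of the daily profiles.
labels = GaussianMixture(n_components=2, covariance_type="diag",
                         random_state=0).fit_predict(X)
# Stage 2: a random forest predicting the cluster from the external factor.
clf = RandomForestClassifier(random_state=0).fit(F, labels)
acc = clf.score(F, labels)
```

With well-separated synthetic shapes the external factor alone recovers the clusters almost perfectly, which mirrors why the cluster labels are learnable from calendar and weather features.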

A Study on the Optimization of State Tying Acoustic Models using Mixture Gaussian Clustering (혼합 가우시안 군집화를 이용한 상태공유 음향모델 최적화)

  • Ann, Tae-Ock
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.167-176 / 2005
  • This paper describes how the decision-tree-based state-tying model, one of the acoustic models used for speech recognition, can be optimized by reducing the number of mixture Gaussians in the output probability distributions. State-tying modeling uses a finite set of questions that can encode phonological knowledge, together with likelihood-based decision criteria, and the recognition rate can be improved by increasing the number of mixture Gaussians in the output distributions. In this paper, we reduce the number of mixture Gaussians at the point of highest recognition rate by clustering the Gaussians. Bhattacharyya and Euclidean methods are used for the distance measure needed in clustering. A new Gaussian is then created from the mean and variance of the pair with the lowest distance, its parameters derived from the parameters of the Gaussians it is merged from. Experiments were performed on the STOCKNAME (1,680) database. The results show that the proposed method using the Bhattacharyya distance maintains a recognition rate of 97.2% while reducing the number of mixture Gaussians by a ratio of 1.0%, and the method using the Euclidean distance maintains 96.9% with the same 1.0% reduction ratio. The methods can thus optimize the state-tying model.
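The merging step can be illustrated for diagonal-covariance Gaussians. The Bhattacharyya distance formula is standard; the moment-matched merge below is one common choice and an assumption here, since the abstract only says a new Gaussian is built from the mean and variance of the closest pair:

```python
import numpy as np

def bhattacharyya(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    avg = (var1 + var2) / 2.0
    return (0.125 * np.sum((mu1 - mu2) ** 2 / avg)
            + 0.5 * np.sum(np.log(avg / np.sqrt(var1 * var2))))

def merge(mu1, var1, mu2, var2):
    """Moment-matched merge of two equally weighted Gaussians (assumed scheme)."""
    mu = (mu1 + mu2) / 2.0
    var = (var1 + var2) / 2.0 + (mu1 - mu2) ** 2 / 4.0
    return mu, var

# Three mixture components; find and merge the closest pair.
comps = [(np.array([0.0, 0.0]), np.array([1.0, 1.0])),
         (np.array([0.1, 0.0]), np.array([1.0, 1.0])),
         (np.array([5.0, 5.0]), np.array([1.0, 1.0]))]
pairs = [(i, j) for i in range(3) for j in range(i + 1, 3)]
i, j = min(pairs, key=lambda p: bhattacharyya(*comps[p[0]], *comps[p[1]]))
mu, var = merge(*comps[i], *comps[j])  # components 0 and 1 are the closest pair
```

Repeating this greedy closest-pair merge shrinks the mixture until the recognition rate would start to drop.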

Efficient Continuous Vocabulary Clustering Modeling for Tying Model Recognition Performance Improvement (공유모델 인식 성능 향상을 위한 효율적인 연속 어휘 군집화 모델링)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.177-183 / 2010
  • In a statistical continuous-vocabulary recognition system, vocabulary recognition is performed using probability distributions, and the sample probability parameters are estimated by phoneme-based clustering modeling. A low recognition rate arises during vocabulary search when the estimated probability parameters cannot express the vocabulary because of undefined or inserted phonemes, and a single-clustering Gaussian model has the drawback of insecure accuracy. To improve on this, we propose a mixed-clustering modeling that optimizes a mixed Gaussian probability distribution using both Euclidean and Bhattacharyya distance measures, and that searches the phoneme probability model within the clustered models. As a result, the system showed a vocabulary-dependent recognition rate of 98.63% and a vocabulary-independent recognition rate of 97.91%.

Nonparametric clustering of functional time series electricity consumption data (전기 사용량 시계열 함수 데이터에 대한 비모수적 군집화)

  • Kim, Jaehee
    • The Korean Journal of Applied Statistics / v.32 no.1 / pp.149-160 / 2019
  • The electricity consumption time series of 'A' University from July 2016 to June 2017 are analyzed via nonparametric functional data clustering, since the time series can be regarded as realizations of continuous functions with a dependency structure. We use the method of Bouveyron and Jacques (Advances in Data Analysis and Classification, 5, 4, 281-300, 2011), model-based functional clustering with an FEM algorithm that assumes a Gaussian distribution on the functional principal components. A cluster-wise analysis is provided with cluster mean functions, densities, and cluster profiles.
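A rough surrogate for this model-based functional clustering can be sketched as follows; ordinary PCA on the discretized curves stands in for functional PCA, and a plain Gaussian mixture on the scores stands in for the FEM algorithm's discriminative-subspace model, both simplifying assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
t = np.linspace(0, 1, 48)  # e.g. 48 half-hourly readings per day
# Two synthetic families of smooth daily consumption curves.
group_a = np.array([np.sin(2 * np.pi * t) + rng.normal(0, 0.1, t.size)
                    for _ in range(40)])
group_b = np.array([np.cos(2 * np.pi * t) + rng.normal(0, 0.1, t.size)
                    for _ in range(40)])
curves = np.vstack([group_a, group_b])

# Functional principal component scores (approximated by ordinary PCA here),
# then a Gaussian mixture on the scores performs the clustering.
scores = PCA(n_components=2).fit_transform(curves)
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(scores)
```

Cluster mean functions, as reported in the paper, would then be the average of `curves` within each label.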

Depth Map coding pre-processing using Depth-based Mixed Gaussian Histogram and Mean Shift Filter (깊이정보 기반의 혼합 가우시안 분포 히스토그램과 Mean Shift Filter를 이용한 깊이정보 맵 부호화 전처리)

  • Park, Sung-Hee;Yoo, Ji-Sang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.11a / pp.175-177 / 2010
  • In this paper, we propose a pre-processing method for efficient coding of the standard depth map in MPEG's 3D video system. Standardization of 3D video coding (3DVC) is under way, but a standard for depth map coding has not yet been finalized. In the proposed technique, the original histogram of the input depth map is first separated by an EM clustering method based on a Gaussian mixture model (GMM), and the depth map is then split into several images according to the separated histogram. Each separated image is then filtered with a mean shift filter under different conditions for background and objects. The aim is to maximize coding efficiency by averaging the pixel values within each region while preserving the region boundaries as much as possible. For the experiments, 50 frames of 1024×768 video were encoded with the H.264/AVC baseline profile. The final results show a bit-rate reduction of roughly 23% to 26% and a slight reduction in encoding time.


Decision of Gaussian Function Threshold for Image Segmentation (영상분할을 위한 혼합 가우시안 함수 임계 값 결정)

  • Jung, Yong-Gyu;Choi, Gyoo-Seok;Heo, Go-Eun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.5 / pp.163-168 / 2009
  • Most image segmentation methods represent the feature vectors observed at each pixel with an assumed probability model, which can then be used in statistical estimation or likelihood-based clustering of the feature vectors. EM algorithms, however, have problems in computing the maximum likelihood of unknown parameters from incomplete data and the maximum of the posterior distribution: performance depends on the starting positions, and the likelihood function can converge to local maxima. To solve these problems, we fit a mixture of Gaussian functions to the histogram over all intensity levels of the image, which we propose as a well-suited image segmentation method. The proposed algorithm is confirmed to classify most edges clearly and distinctly, and was implemented as an MFC program.
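The idea of thresholding where the fitted mixture components cross can be sketched on a synthetic bimodal histogram; scikit-learn's `GaussianMixture` stands in for the paper's EM fit, and the intensity values are illustrative:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Synthetic grayscale image: dark background near 60, bright object near 180.
pixels = np.concatenate([rng.normal(60, 10, 4000),
                         rng.normal(180, 12, 1000)]).reshape(-1, 1)

# Fit a two-component mixture to the intensities; the segmentation threshold is
# taken where the posterior switches from the dark to the bright component.
gmm = GaussianMixture(n_components=2, random_state=0).fit(pixels)
grid = np.arange(256, dtype=float).reshape(-1, 1)
post = gmm.predict_proba(grid)
dark = int(np.argmin(gmm.means_.ravel()))
threshold = float(grid[np.argmax(post[:, dark] < 0.5), 0])
```

Pixels below `threshold` would be labeled background and the rest object, avoiding the hand-picked cutoffs that local EM maxima make unreliable.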


Depth Map Pre-processing using Gaussian Mixture Model and Mean Shift Filter (혼합 가우시안 모델과 민쉬프트 필터를 이용한 깊이 맵 부호화 전처리 기법)

  • Park, Sung-Hee;Yoo, Ji-Sang
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.5 / pp.1155-1163 / 2011
  • In this paper, we propose a new pre-processing algorithm applied to depth maps to improve coding efficiency. The 3DV/FTV group in MPEG is currently working on the 3DVC (3D video coding) standard, but a compression method for depth map images has not yet been confirmed. In the proposed algorithm, after dividing the histogram distribution of a given depth map by an EM clustering method based on a GMM, we classify the depth map into several layered images. We then apply a different mean shift filter to each classified image according to whether it contains background or foreground. In other words, we try to maximize coding efficiency by taking an average over the inner field of each object while keeping its boundary. Experiments performed on many test images show that the proposed algorithm achieves a bit reduction of 19% to 20% and also reduces computation time.
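A simplified sketch of this pre-processing idea: an EM-fitted Gaussian mixture splits the depth histogram into layers, and each layer is then flattened. For brevity, a plain per-layer mean replaces the mean shift filter here, which is an assumption, not the authors' filter:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Synthetic depth map: a near object (depth ~200) on a far background (~50).
depth = 50.0 + rng.normal(0, 2, (64, 64))
depth[16:48, 16:48] = 200.0 + rng.normal(0, 2, (32, 32))

# Step 1: split the depth histogram with an EM-fitted Gaussian mixture.
gmm = GaussianMixture(n_components=2, random_state=0).fit(depth.reshape(-1, 1))
labels = gmm.predict(depth.reshape(-1, 1)).reshape(depth.shape)

# Step 2 (stand-in for the per-layer mean shift filter): flatten each layer to
# its mean depth, smoothing interiors while keeping the object boundary intact.
smoothed = np.zeros_like(depth)
for k in range(2):
    smoothed[labels == k] = depth[labels == k].mean()
```

Flat interiors with sharp boundaries are exactly what block-based coders like H.264/AVC compress cheaply, which is where the reported bit savings come from.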

Quantitative Assessment Technology of Small Animal Myocardial Infarction PET Image Using Gaussian Mixture Model (다중가우시안혼합모델을 이용한 소동물 심근경색 PET 영상의 정량적 평가 기술)

  • Woo, Sang-Keun;Lee, Yong-Jin;Lee, Won-Ho;Kim, Min-Hwan;Park, Ji-Ae;Kim, Jin-Su;Kim, Jong-Guk;Kang, Joo-Hyun;Ji, Young-Hoon;Choi, Chang-Woon;Lim, Sang-Moo;Kim, Kyeong-Min
    • Progress in Medical Physics / v.22 no.1 / pp.42-51 / 2011
  • Nuclear medicine images (SPECT, PET) are widely used tools for assessing myocardial viability and perfusion, but it is difficult to define the myocardial infarct region accurately. The purpose of this study was to investigate a methodological approach for automatic measurement of rat myocardial infarct size using a polar map with an adaptive threshold. A rat myocardial infarction model was induced by ligation of the left circumflex artery. PET images were obtained after intravenous injection of 37 MBq of 18F-FDG. After 60 min of uptake, each animal was scanned for 20 min with ECG gating, and the PET data were reconstructed using 2D ordered-subset expectation maximization (OSEM). QGS software (Cedars-Sinai Medical Center) was used to automatically trace the myocardial contour and generate the polar map. The reference infarct size was defined as the infarcted percentage of the total left myocardium by TTC staining. Three threshold methods were compared: a predefined threshold, Otsu's method, and a multi-Gaussian mixture model (MGMM). The predefined threshold method, commonly used in other studies, was applied from 10% to 90% in steps of 10%. The Otsu algorithm selects the threshold that maximizes the between-class variance. The MGMM method estimates the image intensity distribution with multiple Gaussian mixture models (MGMM2, ..., MGMM5) and computes an adaptive threshold. The infarct size in the polar map was calculated as the percentage of the total polar-map area falling below the threshold. The measured infarct sizes obtained with the different threshold methods were evaluated by comparison with the reference infarct size. The mean differences between the polar-map defect size at predefined thresholds of 20%, 30%, and 40% and the reference infarct size were 7.04±3.44%, 3.87±2.09%, and 2.15±2.07%, respectively; for Otsu it was 3.56±4.16%, and for the MGMM methods 2.29±1.94%. The predefined threshold (30%) showed the smallest mean difference with the reference infarct size. However, MGMM was more accurate than the predefined threshold for reference infarct sizes under 10% (MGMM: 0.006%, predefined threshold: 0.59%). In this study, we evaluated myocardial infarct size in the polar map using a multi-Gaussian mixture model. The MGMM method provides an adaptive threshold for each subject and will be useful for automatic measurement of infarct size.