• Title/Summary/Keyword: statistical clustering method

An Alternative Method for Assessing Local Spatial Association Among Inter-paired Location Events: Vector Spatial Autocorrelation in Housing Transactions (쌍대위치 이벤트들의 국지적 공간적 연관성을 평가하기 위한 방법론적 연구: 주택거래의 벡터 공간적 자기상관)

  • Lee, Gun-Hak
    • Journal of the Economic Geographical Society of Korea
    • /
    • v.11 no.4
    • /
    • pp.564-579
    • /
    • 2008
  • It is often challenging to evaluate local spatial association among one-dimensional vectors that represent paired-location events, in which two points are physically or functionally connected. This is largely because of the complexity of the underlying geographic processes and partly because of representational complexity. This paper presents an alternative way to identify spatially autocorrelated paired-location events (or vectors) at a local scale. To do so, we propose a statistical algorithm that combines univariate point pattern analysis, for evaluating local clustering of origin points, with a similarity measure for the corresponding vectors. To demonstrate practical use of the suggested method, we present an empirical application using transaction data from a local housing market, recorded from 2004 to 2006 in Franklin County, Ohio, in the United States. As a result, several locally characterized groups of similar transactions are identified among a set of vectors showing various local moves associated with the defined communities.

Performance Improvement of Human Detection in Thermal Images using Principal Component Analysis and Blob Clustering (주성분 분석과 Blob 군집화를 이용한 열화상 사람 검출 시스템의 성능 향상)

  • Jo, Ahra;Park, Jeong-Sik;Seo, Yong-Ho;Jang, Gil-Jin
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.157-163
    • /
    • 2013
  • In this paper, we propose a human detection technique using a thermal imaging camera. The proposed method is useful at night or in rainy weather, when visible-light cameras are unable to detect human activity. Based on the observation that a human is usually brighter than the background in thermal images, we estimate preliminary human regions using statistical confidence measures on the gray-level brightness histogram. Afterwards, we apply Gaussian filtering and blob labeling to remove unwanted noise, and gather the scattered regions using the pixel distributions and the centers of gravity of the blobs. In the final step, we use the aspect ratio and the area of the unified object region, as well as a number of principal components extracted from the object region images, to determine whether the detected object is a human. The experimental results show that the proposed method is effective in environments where visible-light cameras are not applicable.

A Study on the Relationship between Skill and Competition Score Factors of KLPGA Players Using Canonical Correlation Biplot and Cluster Analysis (정준상관 행렬도와 군집분석을 응용한 KLPGA 선수의 기술과 경기성적요인에 대한 연관성 분석)

  • Choi, Tae-Hoon;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.3
    • /
    • pp.429-439
    • /
    • 2008
  • The canonical correlation biplot is a two-dimensional plot for graphically investigating the relationship between two sets of variables, and the relationship between observations and variables, in canonical correlation analysis. In general, a biplot is useful for giving a graphical description of the data. However, neither the general biplot nor the canonical correlation biplot gives a concise interpretation of the relationship between variables and observations when the number of observations is large. Recently, to overcome this problem, Choi and Kim (2008) suggested a method for interpreting the biplot analysis by applying K-means clustering. In this study, we therefore apply their method to investigate the relationship between skill and competition score factors of KLPGA players using the canonical correlation biplot and cluster analysis.

Improving Clustering-Based Background Modeling Techniques Using Markov Random Fields (클러스터링과 마르코프 랜덤 필드를 이용한 배경 모델링 기법 제안)

  • Hahn, Hee-Il;Park, Soo-Bin
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.1
    • /
    • pp.157-165
    • /
    • 2011
  • It is challenging to detect foreground objects when the background includes illumination variation, shadows, or structural variation due to its motion. Pixel-based background models, including codebook-based modeling, suffer from the statistical randomness of each pixel. This paper proposes an algorithm that incorporates a Markov random field (MRF) model into pixel-based background modeling to achieve more accurate foreground detection. Under the assumption that both the distance between a pixel in the input image and the corresponding background model, and the difference between the scene estimates of spatio-temporally neighboring pixels, are exponentially distributed, a recursive approach for estimating the MRF regularizing parameters is proposed. After an initial foreground detection computed with random initial parameters, the proposed method alternates between estimating the parameters from the intermediate foreground detection and estimating the foreground detection with the estimated parameters. Extensive experiments are conducted with several videos recorded both indoors and outdoors to compare the proposed method with the standard codebook-based algorithm.

Threshold heterogeneous autoregressive modeling for realized volatility (임계 HAR 모형을 이용한 실현 변동성 분석)

  • Sein Moon;Minsu Park;Changryong Baek
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.4
    • /
    • pp.295-307
    • /
    • 2023
  • The heterogeneous autoregressive (HAR) model is a simple linear model that is commonly used to explain long memory in realized volatility. However, because realized volatility has more complicated features such as conditional heteroscedasticity, the leverage effect, and volatility clustering, the simple HAR model needs to be extended. Therefore, to better incorporate these stylized facts, we propose a threshold HAR model with GARCH errors, namely the THAR-GARCH model. The THAR-GARCH model is a nonlinear model whose coefficients vary according to a threshold value, and the conditional heteroscedasticity is explained through the GARCH errors. Model parameters are estimated using an iterative weighted least squares method. Our simulation study supports the consistency of the iterative estimation method. In addition, we show that the proposed THAR-GARCH model has better forecasting power by applying it to the realized volatility of 21 major stock indices around the world.
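
As a rough illustration of the baseline HAR model mentioned in the abstract above, the sketch below builds the daily, weekly, and monthly regressors from a realized volatility series. The window lengths of 5 and 22 trading days are the conventional HAR choices and are an assumption here, as are the function name `har_regressors` and the made-up input series; the threshold and GARCH parts of the proposed THAR-GARCH model are not shown.

```python
def har_regressors(rv, week=5, month=22):
    """Build HAR regressors from a realized volatility series.

    For each day t that has a full monthly window, return a tuple
    (t, daily, weekly, monthly), where weekly and monthly are trailing
    averages over `week` and `month` days ending at t.
    Window lengths 5 and 22 are the conventional choice (assumption).
    """
    rows = []
    for t in range(month - 1, len(rv)):
        daily = rv[t]
        weekly = sum(rv[t - week + 1 : t + 1]) / week
        monthly = sum(rv[t - month + 1 : t + 1]) / month
        rows.append((t, daily, weekly, monthly))
    return rows


# Made-up series 1.0, 2.0, ..., 30.0 purely for illustration.
rv = [float(i) for i in range(1, 31)]
rows = har_regressors(rv)
```

In a plain HAR regression, the value at day t + 1 would then be regressed on the three columns of row t, for example by ordinary least squares.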

A Classified Space VQ Design for Text-Independent Speaker Recognition (문맥 독립 화자인식을 위한 공간 분할 벡터 양자기 설계)

  • Lim, Dong-Chul;Lee, Haing-Sei
    • The KIPS Transactions:PartB
    • /
    • v.10B no.6
    • /
    • pp.673-680
    • /
    • 2003
  • In this paper, we study the enhancement of VQ (Vector Quantization) design for text-independent speaker recognition. Specifically, we present a non-iterative method for building a vector quantization codebook; because the learning is non-iterative, the computational complexity is drastically reduced. The proposed Classified Space VQ (CSVQ) design method for text-independent speaker recognition is generalized from the semi-noniterative VQ design method for text-dependent speaker recognition. CSVQ contrasts with the existing design method, which uses an iterative learning algorithm for every training speaker. The characteristics of the CSVQ design are as follows. First, the proposed method performs non-iterative learning by using a Classified Space Codebook. Second, the quantization region of each speaker is equivalent to the quantization region of the Classified Space Codebook, and the quantization point of each speaker is the optimal point for the statistical distribution of that speaker within a quantization region of the Classified Space Codebook. Third, the Classified Space Codebook (CSC) is constructed through the Sample Vector Formation Method (CSVQ1, 2) and the Hyper-Lattice Formation Method (CSVQ3). In the numerical experiment, we use 12th-order mel-cepstrum feature vectors of 10 speakers and compare the proposed method with the existing method, changing the codebook size from 16 to 128 for each Classified Space Codebook. The recognition rate of the proposed method is 100% for CSVQ1, 2, which is equal to the recognition rate of the existing method. Therefore, the proposed CSVQ design method is a new alternative that reduces computational complexity while maintaining the recognition rate, and CSVQ with CSC can be applied to general-purpose recognition.

A Semi-Noniterative VQ Design Algorithm for Text Dependent Speaker Recognition (문맥종속 화자인식을 위한 준비반복 벡터 양자기 설계 알고리즘)

  • Lim, Dong-Chul;Lee, Haing-Sei
    • The KIPS Transactions:PartB
    • /
    • v.10B no.1
    • /
    • pp.67-72
    • /
    • 2003
  • In this paper, we study the enhancement of VQ (Vector Quantization) design for text-dependent speaker recognition. Specifically, we present a non-iterative method for building a vector quantization codebook; because the learning is not iterative, the computational complexity is drastically reduced. The proposed semi-noniterative VQ design method contrasts with the existing design method, which uses an iterative learning algorithm for every training speaker. The characteristics of the semi-noniterative VQ design are as follows. First, the proposed method performs iterative learning only for the reference speaker, whereas the existing method performs iterative learning for every speaker. Second, the quantization region of a non-reference speaker is equivalent to the quantization region of the reference speaker, and the quantization point of a non-reference speaker is the optimal point for the statistical distribution of that speaker. In the numerical experiment, we use 12th-order mel-cepstrum feature vectors of 20 speakers and compare the proposed method with the existing method, changing the codebook size from 2 to 32. The recognition rate of the proposed method is 100% for a suitable codebook size and adequate training data, which is equal to the recognition rate of the existing method. Therefore, the proposed semi-noniterative VQ design method is a new alternative that reduces computational complexity while maintaining the recognition rate.
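
The two vector quantization abstracts above concern how each speaker's codebook is built, and do not spell out how a test utterance is scored against the codebooks. The sketch below shows one common decision rule, assumed here rather than taken from the papers: pick the speaker whose codebook gives the lowest average quantization distortion. All names (`avg_distortion`, `identify`, `spk_a`, `spk_b`) and the two-dimensional toy vectors are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def avg_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def identify(frames, codebooks):
    """Return the speaker whose codebook quantizes the frames best.

    Minimum average distortion is an assumed decision rule, not
    necessarily the one used in the papers listed above.
    """
    return min(codebooks, key=lambda name: avg_distortion(frames, codebooks[name]))


# Toy 2-D codebooks and test frames, made up for illustration.
codebooks = {
    "spk_a": [(0.0, 0.0), (1.0, 1.0)],
    "spk_b": [(5.0, 5.0), (6.0, 6.0)],
}
frames = [(0.1, 0.0), (0.9, 1.2)]
best = identify(frames, codebooks)
```

Real systems would use higher-dimensional features, such as the mel-cepstrum vectors mentioned in the abstracts, and much larger codebooks.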

Analysis of Area Type Classification of Seoul Using Geodemographics Methods (Geodemographics의 연구기법을 활용한 서울시 지역유형 분석 연구)

  • Woo, Hyun-Jee;Kim, Young-Hoon
    • Journal of the Korean association of regional geographers
    • /
    • v.15 no.4
    • /
    • pp.510-523
    • /
    • 2009
  • Geodemographics (GD) can be defined as an analytical approach that uses socio-economic and behavioral data about people to investigate geographical patterns. GD is based on the assumption that the demographic and behavioral characteristics of people who live in the same neighborhood are similar, so that neighborhoods can be categorized using geographical classifications. To assess the applicability of the geographical classification of GD, this paper applies the concepts of geodemographics to areas of Seoul using Korean census data sets that contain key characteristics of the demographic profiles of each area. The paper then attempts to explain each area classification profile by using clustering techniques based on Ward's and k-means statistical methods. For this purpose, this paper employs the 2005 Census dataset released by the Korea National Statistics Office, and the neighborhood unit is the Dong, the smallest administrative boundary unit in Korea. After the variables are selected and standardized, the areas are categorized by the clustering techniques into 13 distinctive cluster profiles. These cluster profiles are used to write a short description and expand on the cluster names. Finally, the results of the classification offer a reasonable basis for judging target area types, which benefits people who make spatial decisions to solve spatial problems.

A Spatial Statistical Approach to Residential Differentiation (I): Developing a Spatial Separation Measure (거주지 분화에 대한 공간통계학적 접근 (I): 공간 분리성 측도의 개발)

  • Lee, Sang-Il
    • Journal of the Korean Geographical Society
    • /
    • v.42 no.4
    • /
    • pp.616-631
    • /
    • 2007
  • Residential differentiation is an academic theme that has been given enormous attention in urban studies. This is because residential segregation can be seen as one of the best indicators of the socio-spatial dialectics occurring in urban space. Measuring how one population group is differentiated from another in terms of residential space has been a focal point of residential segregation studies. The index of dissimilarity has been the most extensively used measure. Despite its popularity, however, it has been criticized for its inability to capture the degree of spatial clustering that unevenly distributed population groups usually display. Further, the spatial indices of segregation that have been introduced to remedy the problems of the index of dissimilarity also have some drawbacks: significance testing methods have never been provided, and recent advances in spatial statistics have not been extensively exploited. Thus, the main purpose of this research is to devise a spatial separation measure that gauges not only how unevenly two population groups are distributed over urban space, but also how much the uneven distributions are spatially clustered (spatial dependence). The main results are as follows. First, a new measure is developed by integrating spatial association measures and spatial chi-square statistics. A significance testing method based on the generalized randomization test is also provided. Second, a case study of residential differentiation among groups by educational attainment in major Korean metropolitan cities clearly shows the applicability of the analytical framework presented in the paper.
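
The criticism of the index of dissimilarity in the abstract above can be made concrete with a small sketch. The standard index is half the sum, over areal units, of the absolute difference between each group's share of its own total population. The function name and the four-unit toy counts below are made up for illustration; the paper's new spatial separation measure is not reproduced here.

```python
def dissimilarity_index(group_a, group_b):
    """Index of dissimilarity D = 0.5 * sum_i |a_i / A - b_i / B|.

    group_a and group_b hold the counts of two population groups in
    the same sequence of areal units; A and B are the group totals.
    """
    if len(group_a) != len(group_b):
        raise ValueError("both groups need one count per areal unit")
    total_a = sum(group_a)
    total_b = sum(group_b)
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Four areal units along a line, with made-up counts.
# Checkerboard pattern: the groups alternate from unit to unit.
d_checkerboard = dissimilarity_index([10, 0, 10, 0], [0, 10, 0, 10])
# Clustered pattern: each group occupies one contiguous half.
d_clustered = dissimilarity_index([10, 10, 0, 0], [0, 0, 10, 10])
```

Both arrangements yield the same value of D because the index ignores where the units are located, which is the limitation that motivates a spatially explicit measure.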

Probabilistic reduced K-means cluster analysis (확률적 reduced K-means 군집분석)

  • Lee, Seunghoon;Song, Juwon
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.6
    • /
    • pp.905-922
    • /
    • 2021
  • Cluster analysis is an unsupervised learning technique used for discovering clusters when there is no prior knowledge of group membership. K-means, one of the most commonly used cluster analysis techniques, may fail when the number of variables becomes large. In such high-dimensional cases, it is common to perform tandem analysis, that is, K-means cluster analysis after reducing the number of variables using dimension reduction methods. However, there is no guarantee that the reduced dimensions reveal the cluster structure properly. Principal component analysis may mask the structure of clusters, especially when variables that are not related to the cluster structure have large variances. To overcome this, techniques that perform dimension reduction and cluster analysis simultaneously have been suggested. This study proposes probabilistic reduced K-means, which recasts reduced K-means (De Soete and Carroll, 1994) in a probabilistic framework. Simulations show that the proposed method performs better than tandem clustering or clustering without any dimension reduction. When the number of variables is larger than the number of samples in each cluster, probabilistic reduced K-means forms better clusters than non-probabilistic reduced K-means. In an application to a real data set, it revealed a similar or better cluster structure compared to other methods.
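
For reference, the plain K-means baseline that the abstract above builds on can be sketched as follows. This is ordinary Lloyd-style K-means only, not the reduced or probabilistic variants the paper proposes. Initializing the centers with the first k points is an assumption made to keep the example deterministic, and the toy points are made up.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iters=100):
    """Plain K-means: alternate assignment and center update.

    Centers start at the first k points (a simplifying assumption;
    practical implementations use random or k-means++ initialization).
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centers


# Two well-separated toy groups in two dimensions.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
```

Every variable contributes equally to the distance in the assignment step, which is why variables unrelated to the cluster structure can obscure it in high dimensions.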