• Title/Summary/Keyword: statistical clustering method

Search results: 231

Statistical Analysis about Ability to Mouse Embryonic Stem Cell Differentiation using cDNA Microarray

  • Choi, Hang-Suk;Kim, Sung-Ju;Lee, Young-Jin;Cha, Kyung-Joon;Kim, Chul-Geun
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.951-958 / 2005
  • As a foundation for applied stem cell research, it is necessary to profile large-scale gene expression with cDNA microarrays in order to understand cell function at the molecular level. In this paper, we investigated gene expression using the K-means clustering method and path analysis on genes related to pluripotency and differentiation during early-stage mouse embryonic development and embryonic stem cell differentiation. We identified several biological phenomena through this study and found that this process suggests functional relationships for unknown genes.
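The K-means step mentioned in this abstract can be illustrated with a minimal sketch. The gene names and expression values below are hypothetical and are not taken from the paper; the code is plain Lloyd's k-means, not the authors' pipeline, and it omits the path analysis entirely.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's k-means on lists of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical log-ratio expression profiles at three time points (assumed data).
profiles = {
    "gene_a": [2.0, 1.0, 0.1],  # falls as differentiation proceeds
    "gene_b": [1.8, 0.9, 0.0],
    "gene_c": [0.1, 1.1, 2.1],  # rises as differentiation proceeds
    "gene_d": [0.0, 1.0, 1.9],
    "gene_e": [2.1, 1.2, 0.2],
}
names = list(profiles)
centroids, labels = kmeans([profiles[n] for n in names], k=2)
groups = dict(zip(names, labels))
```

Genes that share a cluster label have similar temporal profiles, which is the kind of grouping that can hint at a functional relationship for an uncharacterized gene.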


Fingerprinting Bayesian Algorithm for Indoor Location Determination (실내 측위 결정을 위한 Fingerprinting Bayesian 알고리즘)

  • Lee, Jang-Jae;Kwon, Jang-Woo;Jung, Min-A;Lee, Seong-Ro
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.6B / pp.888-894 / 2010
  • For indoor positioning, wireless fingerprinting is the most favorable approach because it is the most accurate of the wireless-network-based indoor positioning techniques that require no equipment dedicated to positioning. Deploying a fingerprinting method consists of an off-line phase and an on-line phase, and more efficient and accurate methods have been studied. This paper proposes a Bayesian algorithm for wireless fingerprinting and indoor location determination that uses fuzzy clustering with Bayesian learning as its statistical learning theory.
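The off-line and on-line phases can be sketched as follows. This is a simplified Gaussian Bayes locator, swapped in for illustration; it does not include the fuzzy clustering the paper proposes. The location names, RSSI readings, and the variance floor are all assumptions.

```python
import math

def fit_radio_map(survey):
    """Off-line phase: per-location mean and variance of RSSI for each access point."""
    radio_map = {}
    for loc, samples in survey.items():
        stats = []
        for readings in zip(*samples):  # one tuple of readings per access point
            mean = sum(readings) / len(readings)
            var = sum((r - mean) ** 2 for r in readings) / len(readings)
            stats.append((mean, max(var, 1.0)))  # assumed floor avoids zero variance
        radio_map[loc] = stats
    return radio_map

def locate(radio_map, observed, prior=None):
    """On-line phase: posterior P(location | observed RSSI) via Bayes' rule."""
    log_post = {}
    for loc, stats in radio_map.items():
        lp = math.log(prior[loc]) if prior else 0.0  # uniform prior by default
        for rssi, (mean, var) in zip(observed, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (rssi - mean) ** 2 / (2 * var)
        log_post[loc] = lp
    top = max(log_post.values())  # subtract the maximum for numerical stability
    weights = {loc: math.exp(lp - top) for loc, lp in log_post.items()}
    total = sum(weights.values())
    return {loc: w / total for loc, w in weights.items()}

# Hypothetical survey: RSSI in dBm from two access points at two locations.
survey = {
    "room_a": [[-40, -70], [-42, -68], [-41, -71]],
    "room_b": [[-70, -45], [-69, -43], [-72, -44]],
}
radio_map = fit_radio_map(survey)
posterior = locate(radio_map, [-43, -69])
best = max(posterior, key=posterior.get)
```

Working in log space keeps the product of many small likelihoods from underflowing when more access points are added.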

MONITORING OF MOUNTAINOUS AREAS USING SIMULATED IMAGES TO KOMPSAT-II

  • Chang Eun-Mi;Shin Soo-Hyun
    • Proceedings of the KSRS Conference / 2005.10a / pp.653-655 / 2005
  • More than 70 percent of Korea's land area is mountainous, and degradation of these areas worsens every year because of illegal tombs, expanding golf courses, and stone mine development. We examine the potential of high-resolution imagery for monitoring these phenomena. In this project we classified tombs and identified the statistical radiometric characteristics of graves. Based on the field survey, the graves could be classified into four groups, and this grouping agreed with the results of clustering and discriminant analysis. An object-oriented classification algorithm for feature extraction was studied theoretically, and a pilot project was carried out with mixed methods: conventional unsupervised and supervised classification were combined with the newer object-oriented classification method for feature extraction. This methodology achieved about 60% classification accuracy for extracting tombs from satellite imagery. The geographical coordinates of tombs, and the graves themselves, were extracted from the satellite images. Stone mines and golf courses were extracted using NDVI and GVI, with a classification accuracy of around 89 percent. The location accuracy showed that extracting tombs from one-meter-resolution imagery is cheaper and quicker than the GPS method. Finally, we interviewed local government officers and analyzed the current state of mountainous area management and the potential use of KOMPSAT-II images. Based on the requirements analysis, we developed software that serves as a management and monitoring system for mountainous areas for local governments.
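The NDVI used here for stone mines and golf courses is the standard normalized difference of near-infrared and red reflectance. A minimal sketch follows; the reflectance values and the 0.3 threshold are assumptions for illustration, and GVI is not covered.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Mark pixels whose NDVI exceeds an assumed threshold as vegetated."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Hypothetical 2x2 reflectance grids: top row turf-like, bottom row bare rock.
nir_band = [[0.50, 0.45], [0.20, 0.22]]
red_band = [[0.08, 0.10], [0.18, 0.20]]
mask = vegetation_mask(nir_band, red_band)
```

Dense turf such as a golf course tends to give high NDVI while exposed rock gives values near zero, which is why a simple threshold can separate the two in a first pass.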


Design of Multiple Model Fuzzy Predictors using Data Preprocessing and its Application (데이터 전처리를 이용한 다중 모델 퍼지 예측기의 설계 및 응용)

  • Bang, Young-Keun;Lee, Chul-Heui
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.1 / pp.173-180 / 2009
  • It is difficult to predict non-stationary or chaotic time series, which exhibit drift and/or non-linearity as well as uncertainty. To address this, we propose an effective prediction method that combines data preprocessing, multiple-model TS fuzzy predictors, and a model selection mechanism. In the preprocessing procedure, candidates for the optimal difference interval are determined by correlation analysis, and the corresponding difference data sets are generated and used as predictor inputs in place of the original data, because differencing stabilizes the statistical characteristics of the time series and better reveals their implicit properties. TS fuzzy predictors are then constructed for a multiple-model bank, where the k-means clustering algorithm is used for the fuzzy partition of the input space and the least squares method is applied to identify the parameters of the fuzzy rules. Among the predictors in the model bank, the one that best minimizes the performance index is selected and used for prediction thereafter. Finally, an error compensation procedure based on correlation analysis is added to improve prediction accuracy. Computer simulations are performed to verify the effectiveness of the proposed method.
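The preprocessing idea can be sketched as below. The abstract does not state how correlation analysis ranks candidate intervals, so ranking lags by absolute autocorrelation is an assumption; the function names and the test series are hypothetical, and the fuzzy predictors themselves are not shown.

```python
def autocorr(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var

def difference(series, interval):
    """Difference data: x[t] - x[t - interval]."""
    return [series[t] - series[t - interval] for t in range(interval, len(series))]

def candidate_intervals(series, max_lag, top=2):
    """Assumed criterion: keep the lags with the largest absolute autocorrelation."""
    lags = range(1, max_lag + 1)
    return sorted(lags, key=lambda d: abs(autocorr(series, d)), reverse=True)[:top]

# Hypothetical series: linear drift plus a repeating pattern of period 4.
series = [0.5 * t + [0.0, 2.0, 0.0, -2.0][t % 4] for t in range(40)]
candidates = candidate_intervals(series, max_lag=8)
diff4 = difference(series, 4)  # differencing at the period removes drift and pattern
```

In this toy series, differencing at interval 4 leaves a constant, which shows how a well-chosen interval can stabilize the statistics that the predictors see.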

A Study of Library Grouping using Cluster Analysis Methods (군집분석 기법을 이용한 공공도서관 그룹화에 대한 연구)

  • Kwak, Chul Wan
    • Journal of the Korean BIBLIA Society for library and Information Science / v.31 no.3 / pp.79-99 / 2020
  • The purpose of this study is to investigate cluster analysis models for grouping public libraries and to analyze their characteristics. Statistical data on public libraries from the National Library Statistics System were used, and three cluster analysis models were applied. Cluster analysis based on the size of public libraries divided them broadly into two clusters, and the cluster sizes were heavily skewed toward one cluster. For grouping based on size, Ward's method of hierarchical cluster analysis and the k-means cluster analysis model were suitable. Three suggestions are presented as implications for grouping public libraries. First, it is necessary to collect library service-related data in addition to statistical data. Second, an analysis model suited to the data set being analyzed must be applied. Third, the possibility of using cluster analysis techniques in fields other than library grouping should be studied.
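Ward's method, named above as suitable for size-based grouping, merges at each step the pair of clusters whose union least increases the total within-cluster sum of squares. A minimal sketch follows; the library feature values are hypothetical standardized numbers, not data from the National Library Statistics System.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's merge criterion; returns index lists."""
    clusters = [([i], list(p)) for i, p in enumerate(points)]  # (members, centroid)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b are merged.
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        merged = ma + mb
        centroid = [(len(ma) * x + len(mb) * y) / len(merged) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((merged, centroid))
    return sorted(sorted(members) for members, _ in clusters)

# Hypothetical standardized size features (collection, staff) for five libraries.
libraries = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25], [0.30, 0.20], [3.00, 2.80]]
result = ward_clusters(libraries, n_clusters=2)
```

With one very large library among many small ones, the two-cluster solution is heavily unbalanced, which mirrors the skew the study reports.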

Hydrogeochemistry and Statistical Analysis for Low and Intermediate Level Radioactive Waste Disposal Site in Gyeongju (경주 중·저준위 방폐장의 수리지화학 및 통계 분석)

  • Soon-Il Ok;Sieun Kim;Seongyeon Jung;Chung-Mo Lee
    • Journal of the Korean earth science society / v.44 no.6 / pp.629-642 / 2023
  • Low and intermediate level radioactive waste is currently disposed of at the Gyeongju disposal site for permanent isolation. Since 2006, the Korea Radioactive Waste Agency has conducted site characteristics surveys, continuously verifying changes in the site based on the site monitoring and investigation plan. The hydrogeochemical environment of the disposal site is considered in the evaluation of natural barriers. However, seawater must also be considered because Gyeongju lies near the East Sea. Therefore, this study collected 30 samples from seven wells to derive groundwater quality data and compared them with two seawater samples collected from October 2017 to June 2022. Additionally, the study explores a groundwater monitoring method using statistical tools such as clustering and background concentration analysis. The groundwater samples in the study area were classified into two to four clusters depending on their chemical constituents (especially EC, HCO3, Na, and Cl) using statistical analysis, molar ratios, and K-means clustering.
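A molar ratio such as Na/Cl is computed by converting mass concentrations to molar concentrations before dividing. The sketch below uses standard molar masses; the sample concentrations are hypothetical (the seawater values are only roughly typical of open-ocean water) and do not come from the study.

```python
MOLAR_MASS = {"Na": 22.99, "Cl": 35.45}  # g/mol

def mmol_per_litre(mg_per_litre, ion):
    """Convert a mass concentration in mg/L to mmol/L."""
    return mg_per_litre / MOLAR_MASS[ion]

def na_cl_molar_ratio(na_mg_l, cl_mg_l):
    """Na/Cl molar ratio from concentrations given in mg/L."""
    return mmol_per_litre(na_mg_l, "Na") / mmol_per_litre(cl_mg_l, "Cl")

# Hypothetical samples: (Na mg/L, Cl mg/L).
samples = {
    "seawater_like": (10770.0, 19350.0),
    "fresh_groundwater_like": (25.0, 15.0),
}
ratios = {name: na_cl_molar_ratio(na, cl) for name, (na, cl) in samples.items()}
```

Comparing each well's ratio against the seawater value is one simple way to screen for seawater influence before running a clustering step on the full set of constituents.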

Feature Extraction by Line-clustering Segmentation Method (선군집분할방법에 의한 특징 추출)

  • Hwang Jae-Ho
    • The KIPS Transactions:PartB / v.13B no.4 s.107 / pp.401-408 / 2006
  • In this paper, we propose a new class of segmentation techniques for feature extraction based on statistical and regional classification along each vertical or horizontal line of digital image data. Data are processed and clustered line by line, in contrast to point or spatial processing. The techniques are designed to segment gray-scale sectional images using horizontal and vertical line processing that exploits statistical and property differences, and to extract features. They give efficient results for images whose gray levels overlap and for which no threshold image is available; such images are also difficult to segment with global or local threshold methods. The pixels on a line indicate the data that can be sectioned, and sections can be set according to cluster quality, using differences in the histogram and statistical data. The total segmentation over line clusters is obtained by adaptive extension along the horizontal axis. Each processed region has its own pixel value, which yields the extracted feature. The advantages and effectiveness of the line-cluster approach are shown theoretically and demonstrated through region-segmental processing of carotid artery medical images.
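The line-by-line idea can be illustrated with a much simpler stand-in: clustering the gray levels of each row into two groups independently, so that no single global threshold is needed. This is an assumed simplification using 1-D two-means, not the author's algorithm, and the tiny image is hypothetical.

```python
def two_means_threshold(values, iters=20):
    """1-D two-means on one line; returns the midpoint between the two cluster means."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iters):
        threshold = (lo + hi) / 2
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not high:
            break  # constant line: nothing to split
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return (lo + hi) / 2

def segment_by_rows(image):
    """Label each pixel 0 or 1 using a threshold computed from its own row only."""
    labels = []
    for row in image:
        t = two_means_threshold(row)
        labels.append([1 if v > t else 0 for v in row])
    return labels

# Hypothetical image: the second row is brighter overall, so its dark region
# overlaps in gray level with the first row's bright region.
image = [
    [10, 12, 11, 60, 62, 61],
    [70, 72, 71, 130, 128, 131],
]
segmented = segment_by_rows(image)
```

A single global threshold between 12 and 60 would label the whole second row as foreground, whereas the per-row thresholds recover the same boundary in both rows.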

Recognition of License Plates Using a Hybrid Statistical Feature Model and Neural Networks (하이브리드 통계적 특징 모델과 신경망을 이용한 자동차 번호판 인식)

  • Lew, Sheen;Jeong, Byeong-Jun;Kang, Hyun-Chul
    • Journal of KIISE:Software and Applications / v.36 no.12 / pp.1016-1023 / 2009
  • A license plate recognition system consists of image processing, in which characters and features are extracted, and pattern recognition, in which the extracted characters are classified. Feature extraction plays an important role not only in the degree of data reduction but also in recognition performance. In this paper, we therefore focus on the recognition of numeral characters, and especially on their feature extraction, which strongly affects the result of plate recognition. We propose a hybrid statistical feature model that ensures the best dispersion of the input data by reassigning the clustering property of the input data. We verify the effectiveness of the proposed model using multi-layer perceptron and learning vector quantization neural networks. The results show that the proposed feature extraction method preserves the information of a license plate well and is robust and effective even in noisy, external environments.
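Learning vector quantization, one of the two classifiers used for verification, can be sketched with the basic LVQ1 update rule. The two-dimensional feature vectors, labels, initial prototypes, and learning rate below are hypothetical; the paper's hybrid statistical features are not reproduced.

```python
def nearest(x, prototypes):
    """Index of the prototype closest to x in squared Euclidean distance."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, prototypes[i])),
    )

def lvq1_train(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """LVQ1: pull the winning prototype toward same-class samples, push it away otherwise."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            j = nearest(x, protos)
            sign = 1.0 if proto_labels[j] == y else -1.0
            protos[j] = [w + sign * lr * (a - w) for a, w in zip(x, protos[j])]
    return protos

def lvq_predict(x, protos, proto_labels):
    """Classify x with the label of its nearest prototype."""
    return proto_labels[nearest(x, protos)]

# Hypothetical 2-D features for two numeral classes.
samples = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = ["0", "0", "1", "1"]
proto_labels = ["0", "1"]
protos = lvq1_train(samples, labels, [[0.3, 0.3], [0.7, 0.7]], proto_labels)
```

Because classification depends only on distances to prototypes, features that spread the classes apart, as the abstract's dispersion criterion aims to do, make the prototypes easier to separate.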

A Two-Stage Document Page Segmentation Method using Morphological Distance Map and RBF Network (거리 사상 함수 및 RBF 네트워크의 2단계 알고리즘을 적용한 서류 레이아웃 분할 방법)

  • Shin, Hyun-Kyung
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.547-553 / 2008
  • We propose a two-stage document layout segmentation method. In the first stage, a top-down segmentation, the morphological distance map algorithm extracts a collection of rectangular regions from a given input image. This preliminary result is used as the input parameters for the next stage. In the second stage, a machine-learning algorithm is adopted: the RBF network, a neural network based on a statistical model. To construct the hidden layer of the RBF network, a data clustering technique based on the self-organizing property of the Kohonen network is used. We present results showing that the supervised neural network, trained on 300 samples, improves the preliminary results of the first stage.
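The second stage can be sketched as an RBF network whose hidden-unit centers come from a clustering step. In the sketch below the centers are simply given as if a clustering step had produced them (the Kohonen network is not implemented), the output layer is trained with a plain least-mean-squares rule, and all data, the width, and the learning rate are assumptions.

```python
import math

def rbf_features(x, centers, width):
    """Gaussian hidden-layer activations for input x."""
    return [
        math.exp(-sum((a - c) ** 2 for a, c in zip(x, ctr)) / (2 * width ** 2))
        for ctr in centers
    ]

def train_output_layer(samples, targets, centers, width, lr=0.2, epochs=500):
    """Fit linear output weights (last entry is the bias) with the LMS rule."""
    weights = [0.0] * (len(centers) + 1)
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            phi = rbf_features(x, centers, width) + [1.0]
            err = t - sum(w * p for w, p in zip(weights, phi))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
    return weights

def rbf_predict(x, centers, width, weights):
    """Network output for x; values above 0.5 are read as class 1."""
    phi = rbf_features(x, centers, width) + [1.0]
    return sum(w * p for w, p in zip(weights, phi))

# Hypothetical 2-D region features with binary targets.
samples = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.0]]
targets = [0.0, 0.0, 1.0, 1.0]
centers = [[0.05, 0.05], [0.95, 1.0]]  # assumed output of a clustering step
weights = train_output_layer(samples, targets, centers, width=0.3)
```

Fixing the centers first reduces training to a linear problem in the output weights, which is one reason clustering is a common way to build the hidden layer.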

Tree Based Cluster Analysis Using Reference Data (배경자료를 이용한 나무구조의 군집분석)

  • 최대우;구자용;최용석
    • The Korean Journal of Applied Statistics / v.17 no.3 / pp.535-545 / 2004
  • The clustering method proposed in this paper produces clusters based on 'rules of variables': the 'training data' are merged with identically structured reference data, and the merged data are then filtered through a 'tree classification model' to obtain the clusters of the 'training data'. The reference dataset is generated by the 'reverse arcing' algorithm so that it contrasts spatially with the 'training data', which makes the clusters easier to identify. A strength of this method is that it can be applied even to 'training data' that mix continuous and discrete variables. The performance of the algorithm is illustrated on simulated data as well as actual data.