• Title/Abstract/Keywords: maximum cluster size

38 results found (processing time: 0.031 s)

Optimizing the maximum reported cluster size for normal-based spatial scan statistics

  • Yoo, Haerin;Jung, Inkyung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 25, No. 4
    • /
    • pp.373-383
    • /
    • 2018
  • The spatial scan statistic is a widely used method to detect spatial clusters. The method imposes a large number of scanning windows with pre-defined shapes and varying sizes on the entire study region. The likelihood ratio test statistic comparing inside versus outside each window is then calculated and the window with the maximum value of test statistic becomes the most likely cluster. The results of cluster detection respond sensitively to the shape and the maximum size of scanning windows. The shape of scanning window has been extensively studied; however, there has been relatively little attention on the maximum scanning window size (MSWS) or maximum reported cluster size (MRCS). The Gini coefficient has recently been proposed by Han et al. (International Journal of Health Geographics, 15, 27, 2016) as a powerful tool to determine the optimal value of MRCS for the Poisson-based spatial scan statistic. In this paper, we apply the Gini coefficient to normal-based spatial scan statistics. Through a simulation study, we evaluate the performance of the proposed method. We illustrate the method using a real data example of female colorectal cancer incidence rates in South Korea for the year 2009.

Molecular Dynamics Study of Aluminum Growth Using Aluminum Cluster Deposition

  • J.W. Kang;K.R. Byun;W.H. Mun;E.S. Kang;H.J. Hwang
    • The Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings
    • /
    • Proceedings of the IEEK 2000 Summer Conference (2)
    • /
    • pp.306-309
    • /
    • 2000
  • In this work, we investigated Al cluster deposition on the Al (100) surface using molecular dynamics simulation. The simulations showed that large clusters with low energy are suitable for growing smooth films without craters at low temperatures. We investigated the maximum substrate temperature and the time taken for the substrate temperature to reach its maximum as a function of cluster size, both for the same total energy and for the same energy per atom. Correlated collisions play an important role in the interaction between an energetic cluster and the surface, and as cluster size and cluster energy increase, the correlated-collision effect influences this interaction.

CONDENSATION IN DENSITY DEPENDENT ZERO RANGE PROCESSES

  • Jeon, Intae
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • Vol. 17, No. 4
    • /
    • pp.267-278
    • /
    • 2013
  • We consider zero range processes with density dependent jump rates $g$ given by $g=g(n,k)=g_1(n)g_2(k/n)$ with $g_1(x)=x^{-\alpha}$ and $$g_2(x)=\begin{cases}x^{-\alpha} & \text{if } a<x,\\ Mx^{-\alpha} & \text{if } x\leq a.\end{cases} \tag{0.1}$$ In this case, with $1/2 < a < 1$ and $\alpha > 0$, we show that non-complete condensation occurs with maximum cluster size $an$. More precisely, for any $\epsilon > 0$, there exists $M^* > 0$ such that, for any $0 < M \leq M^*$, the maximum cluster size is between $(a-\epsilon)n$ and $(a+\epsilon)n$ for large $n$. This provides a simple example of non-complete condensation under perturbation of rates which are deep in the range of perfect condensation (e.g. $\alpha \gg 1$) and supports the instability of the condensation transition.
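
The jump rate in (0.1) can be evaluated directly. The following is a minimal sketch in Python, not code from the paper; the function name `jump_rate` and the sample parameter values are assumptions chosen for illustration.

```python
def jump_rate(n: int, k: int, alpha: float, a: float, M: float) -> float:
    """Evaluate g(n, k) = g1(n) * g2(k / n) from the abstract.

    n: total number of particles, k: particles at the site (both positive).
    alpha, a, M: the parameters named in the abstract.
    """
    if n <= 0 or k <= 0:
        raise ValueError("n and k must be positive")
    x = k / n
    g1 = n ** (-alpha)
    # g2 switches branch at the density threshold a.
    g2 = x ** (-alpha) if x > a else M * x ** (-alpha)
    return g1 * g2


# Hypothetical parameter values, for illustration only.
above = jump_rate(n=10, k=8, alpha=1.0, a=0.6, M=0.5)  # k/n > a
below = jump_rate(n=10, k=5, alpha=1.0, a=0.6, M=0.5)  # k/n <= a
```

Because $g_1(n)\,(k/n)^{-\alpha} = k^{-\alpha}$, the product reduces to $k^{-\alpha}$ above the threshold and $Mk^{-\alpha}$ at or below it, so $M$ only rescales the rate for sites holding at most $an$ particles.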

Improved Classification Algorithm using Extended Fuzzy Clustering and Maximum Likelihood Method

  • Jeon Young-Joon;Kim Jin-Il
    • The Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings
    • /
    • IEEK 2004 ICEIC: The International Conference on Electronics, Informations and Communications
    • /
    • pp.447-450
    • /
    • 2004
  • This paper proposes a classification method for remotely sensed images based on the fuzzy c-means clustering algorithm using the average intra-cluster distance. The average intra-cluster distance is the average over the set of vectors belonging to each cluster and is proportional to the cluster's size and density. We classify each pixel according to its membership grade with respect to the fuzzy c-means cluster centers, which are obtained from the mean values of the training data for each class. The fuzzy c-means algorithm takes into account the inter-cluster membership degree of each class. We then evaluate the degree of overlap between clusters. A pixel with a high degree of overlap is passed to the maximum likelihood classification method. Finally, we decide the category by comparing the fuzzy membership degree with the likelihood rate. The proposed method is applied to an IKONOS remote sensing satellite image for verification.
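
For orientation, the membership grade of a pixel with respect to fixed cluster centers can be sketched with the standard fuzzy c-means formula. This is the textbook membership update, not the extended variant proposed in the paper; the function name `fcm_memberships`, the fuzzifier default `m=2.0`, and the sample values are assumptions.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    u_i = d_i^(-2/(m-1)) / sum_k d_k^(-2/(m-1)), where d_i is the
    Euclidean distance from the point to center i and m > 1.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inv = [d ** (-power) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


# Hypothetical 2-D feature vector and two class centers.
u = fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
```

The memberships sum to 1, so a pixel lying between two class centers gets comparable grades for both. That is the kind of ambiguous case the abstract routes to the maximum likelihood classifier.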

Mass Spectrometric Study of Carbon Cluster Formation in Laser Ablation of Graphite at 355 nm

  • Koo, Young-Mi;Choi, Young-Ku;Lee, Kee-Hag;Jung, Kwang-Woo
    • Bulletin of the Korean Chemical Society
    • /
    • Vol. 23, No. 2
    • /
    • pp.309-314
    • /
    • 2002
  • The ablation dynamics and cluster formation of $C_n^+$ ions ejected from 355 nm laser ablation of a graphite target in vacuum are investigated using a reflectron time-of-flight (RTOF) mass spectrometer. At low laser fluence, odd-numbered cluster ions with $3{\leq}n{\leq}15$ are predominantly produced. Increasing the laser fluence shifts the maximum size distribution towards small cluster ions, implying the fragmentation of larger clusters within the hot plume. The temporal evolution of $C_n^+$ ions was measured by varying the delay time of the ion extraction pulse with respect to the laser irradiation, providing significant information on the characteristics of the ablated plume. Above a laser fluence of $0.2\,\mathrm{J/cm^2}$, large cluster ions ($n{\geq}30$) are produced at relatively long delay times, indicating that atoms or small carbon clusters aggregate during plume propagation. The dependence of the intensity of ablated $C_n^+$ ions on delay time after laser irradiation shows that the most probable velocity of each cluster ion decreases with cluster size.

Performance Factor of Distributed Processing of Machine Learning using Spark

  • 류우석
    • The Journal of the Korea Institute of Electronic Communication Sciences
    • /
    • Vol. 16, No. 1
    • /
    • pp.19-24
    • /
    • 2021
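  • In this paper, we analyze the performance factors of distributed machine learning with Apache Spark and present, through experiments, an execution environment for efficient distributed processing. First, we analyze the performance factors to consider when running machine learning in a distributed cluster environment, dividing them into cluster performance, data size, and Spark engine properties. We then measure performance while varying the node configuration and the Spark executor settings when running regression analysis with Spark MLlib on a Hadoop cluster. The experiments demonstrate that the optimal number of executors is affected by the number of data blocks, but that, depending on the cluster size, its maximum and minimum are bounded by the number of cores and the number of worker nodes, respectively.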
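
The reported bounds can be illustrated with a small clamp. This is a rough sketch under stated assumptions, not the paper's tuning procedure: using the block count itself as the unclamped starting point is an assumption, and the function and parameter names are hypothetical.

```python
def candidate_executors(num_blocks: int, worker_nodes: int, total_cores: int) -> int:
    """Clamp a block-driven executor count to the bounds in the abstract.

    Lower bound: number of worker nodes. Upper bound: number of cores.
    Starting from num_blocks is an illustrative assumption.
    """
    if worker_nodes <= 0 or total_cores < worker_nodes:
        raise ValueError("expected 0 < worker_nodes <= total_cores")
    return max(worker_nodes, min(num_blocks, total_cores))


# Hypothetical cluster: 4 worker nodes with 16 cores in total.
few_blocks = candidate_executors(num_blocks=2, worker_nodes=4, total_cores=16)
mid_blocks = candidate_executors(num_blocks=8, worker_nodes=4, total_cores=16)
many_blocks = candidate_executors(num_blocks=100, worker_nodes=4, total_cores=16)
```

The value returned is only a starting point for measurement; the abstract reports that the optimum was found experimentally.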

Two-stage Sampling for Estimation of Prevalence of Bovine Tuberculosis

  • 박선일
    • Journal of Veterinary Clinics
    • /
    • Vol. 28, No. 4
    • /
    • pp.422-426
    • /
    • 2011
  • For a national survey that targets a wide geographic region or an entire country, a multi-stage sampling approach is widely used to overcome the problems of simple random sampling, to consider both herd- and animal-level factors associated with disease occurrence, and to adjust for the clustering effect of disease in the population when calculating the sample size. The aim of this study was to establish the sample size for estimating the prevalence of bovine tuberculosis (TB) in Korea using a stratified two-stage sampling design. The sample size was determined by taking into account the possible clustering of TB-infected animals within individual herds, to increase the reliability of the survey results. In this study, the country was stratified into nine provinces (administrative units), and the herd, the primary sampling unit, was considered as a cluster. For all analyses, a design effect of 2, a between-cluster prevalence of 50% (to yield the maximum sample size), and a mean herd size of 65 were assumed because of the lack of available information. Using a two-stage sampling scheme, the number of cattle sampled per herd was 65, regardless of the confidence level, prevalence, and mean herd size examined. The number of clusters to be sampled at a 95% level of confidence was estimated to be 296, 74, 33, 19, 12, and 9 for a desired precision of 0.01, 0.02, 0.03, 0.04, 0.05, and 0.06, respectively. Therefore, the total sample size with a 95% confidence level was 172,872, 43,218, 19,224, 10,818, 6,930, and 4,806 for desired precision ranging from 0.01 to 0.06. The sample size increased with the desired precision and the design effect. In a situation where the number of cattle sampled per herd is fixed, ranging from 5 to 40 in 5-head intervals, the total sample size with a 95% confidence level was estimated to be 6,480, 10,080, 13,770, 17,280, 20,925, 24,570, 28,350, and 31,680, respectively. The percent increase in total sample size resulting from the use of an intra-cluster correlation coefficient of 0.3 was 22.2, 32.1, 36.3, 39.6, 41.9, 42.9, 42.2, and 44.3%, respectively, in comparison to the use of a coefficient of 0.2.
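
The cluster counts and totals for the 65-head scheme can be reproduced with a short calculation. The formula below is a reconstruction, not one stated in the abstract: it assumes a normal quantile of 1.96, rounds the simple-random-sample size up before applying the design effect, treats the result as a per-province figure, and multiplies by the nine provinces. Under those assumptions it matches the reported numbers. All names are hypothetical.

```python
from fractions import Fraction
from math import ceil

Z_95 = Fraction("1.96")          # assumed normal quantile for 95% confidence
PREVALENCE = Fraction("0.5")     # between-cluster prevalence from the abstract
DESIGN_EFFECT = 2                # from the abstract
HERD_SIZE = 65                   # cattle sampled per herd, from the abstract
N_PROVINCES = 9                  # strata, from the abstract


def plan(precision: str):
    """Return (clusters, total sample size) for a desired precision.

    Exact rational arithmetic avoids floating-point error before rounding up.
    """
    d = Fraction(precision)
    # Simple-random-sample size, rounded up to a whole animal.
    srs = ceil(Z_95 ** 2 * PREVALENCE * (1 - PREVALENCE) / d ** 2)
    per_province = DESIGN_EFFECT * srs
    clusters = ceil(Fraction(per_province, HERD_SIZE))
    total = per_province * N_PROVINCES
    return clusters, total


for p in ("0.01", "0.02", "0.03", "0.04", "0.05", "0.06"):
    print(p, plan(p))
```

The order of operations matters here: applying the design effect before rounding up gives slightly different totals at a precision of 0.03 and no longer matches the abstract.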

Maximizing Information Transmission for Energy Harvesting Sensor Networks by an Uneven Clustering Protocol and Energy Management

  • Ge, Yujia;Nan, Yurong;Chen, Yi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 4
    • /
    • pp.1419-1436
    • /
    • 2020
  • For an energy harvesting sensor network, when the network lifetime is not the only primary goal, maximizing the network performance under environmental energy harvesting becomes a more critical issue. However, clustering protocols that aim at providing maximum information throughput have not been thoroughly explored in Energy Harvesting Wireless Sensor Networks (EH-WSNs). In this paper, clustering protocols are studied for maximizing the data transmission in the whole network. Based on a long short-term memory (LSTM) energy predictor and node energy consumption and supplement models, an uneven clustering protocol is proposed where the cluster head selection and cluster size control are thoroughly designed for this purpose. Simulations and results verify that the proposed scheme can outperform some classic schemes by having more data packets received by the cluster heads (CHs) and the base station (BS) under these energy constraints. The outcomes of this paper also provide some insights for choosing clustering routing protocols in EH-WSNs, by exploiting the factors such as uneven clustering size, number of clusters, multiple CHs, multihop routing strategy, and energy supplementing period.

Classification and Identification of Korean Hand Shapes based on Anthropometric Hand Data Analysis

  • 김상호;기도형
    • Journal of the Korea Safety Management & Science
    • /
    • Vol. 14, No. 1
    • /
    • pp.75-85
    • /
    • 2012
  • In this study, the representative hand shapes of adult Koreans were analyzed by factor analysis and cluster analyses. The analyses were conducted on anthropometric data of 58 hand dimensions from 325 subjects with nonhomogeneous demographics. Maximum hand circumference, first phalanx length of the index finger, and the ratio between the two measures were the independent variables for the cluster analyses. The results showed that Korean hand shapes can be divided into 2 clusters, irrespective of their size, for each of the male and female groups. There were slight differences in the component ratio of hand shapes with respect to occupation and age, but the differences were not statistically significant. The representative Korean hand shapes and their anthropometric dimensions could be used to design and establish a proper sizing system for various hand-operated devices.

A Multistriped Checkpointing Scheme for the Fault-tolerant Cluster Computers

  • 장윤석
    • The KIPS Transactions: Part A
    • /
    • Vol. 13A, No. 7
    • /
    • pp.607-614
    • /
    • 2006
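  • In a distributed cluster system that uses checkpointing to periodically save the process execution state of the cluster nodes to global storage, reducing the cost of maintaining fault tolerance and improving overall process performance requires distributing the network load as evenly as possible across the nodes when checkpoint data is saved. This shortens the storage time and therefore the time during which the processes of the whole cluster are delayed while checkpoints are written. To this end, cluster systems that use a single I/O space based on distributed RAID employ various checkpointing schemes, and storage performance and fault recovery performance vary with the scheme. In this study, we improve on the striped checkpointing scheme and propose a multistriped checkpointing structure that minimizes network bottlenecks: when checkpoint data is stored in the distributed-RAID-based single I/O space, the size of the stripe groups is determined dynamically according to the network traffic at the time the checkpoint is saved, and the data is written to the distributed RAID over the network. To analyze the performance of the proposed structure, we evaluated it using MPI and the Linpack HPC benchmark on cluster systems of up to 512 virtual nodes. The results show that, as the checkpoint size and the cluster size increase, the proposed scheme outperforms existing checkpointing schemes in both checkpoint storage and fault recovery.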