• Title/Summary/Keyword: determining the number of clusters

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.1-23 / 2014
  • Market forecasting aims to estimate the sales volume of a product or service sold to consumers over a specific selling period. For an enterprise, accurate market forecasting helps determine the timing of new product introductions, guide product design, and establish production plans and marketing strategies, enabling more efficient decision-making. It also helps governments organize the national budget efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods from past time series data, categorize products showing similar growth patterns, understand markets in the industry, and forecast the future outlook of such products. It proposes a process (or methodology) for identifying market growth patterns with a quantitative growth model and a data mining algorithm. The process has the following stages. In the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as the sales volume and domestic consumption of a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. Collected data that cannot be analyzed as-is, because past data are lacking or code names have changed, must be pre-processed. In the second stage, an optimal model for market forecasting is selected. The model can vary with the characteristics of each categorized industry. Because this study focuses on the ICT industry, where new technologies appear frequently and change the market structure, the Logistic, Gompertz, and Bass models are selected. A hybrid model that combines different models can also be considered. The hybrid model considered in this study estimates the size of the market potential with the Logistic and Gompertz models and then uses those figures in the Bass model. The third stage evaluates which model explains the data most accurately. To do this, the parameters are estimated from the collected time series data, the models' predicted values are generated, and the root-mean-squared error (RMSE) is calculated. The model with the lowest average RMSE across all product types is considered the best model. In the fourth stage, a market growth pattern map is constructed with the self-organizing map algorithm from the parameter values estimated by the best model. The self-organizing map is trained with the market pattern parameters of all products or services as input data, and the products or services are organized onto an $N \times N$ map. The number of clusters is increased from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clustering that provides the most meaningful explanation is selected. Based on the final selection of clusters, the boundaries between the nodes are drawn and the market growth pattern map is completed. The last stage determines the final characteristics of the clusters and their market growth curves. The average of the market growth pattern parameters in each cluster is taken as a representative figure. Using this figure, a growth curve is drawn for each cluster and its characteristics are analyzed. The characteristics of each cluster can also be described qualitatively by considering the product types it contains. We expect that the proposed process and system can be used as a tool for forecasting demand in the ICT and other industries.
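The model-selection stage described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it uses common textbook forms of the cumulative Logistic, Gompertz, and Bass curves, assumes the parameters have already been estimated (the estimation step is not shown), and all product names, data, and parameter values are hypothetical.

```python
import math

def logistic(t, m, a, b):
    """Cumulative logistic curve with market potential m."""
    return m / (1.0 + a * math.exp(-b * t))

def gompertz(t, m, a, b):
    """Cumulative Gompertz curve with market potential m."""
    return m * math.exp(-a * math.exp(-b * t))

def bass(t, m, p, q):
    """Cumulative Bass curve: p = innovation, q = imitation coefficient."""
    decay = math.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)

def rmse(observed, predicted):
    """Root-mean-squared error between two equally long series."""
    squared = [(o - f) ** 2 for o, f in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def select_best_model(observed_by_product, fitted_by_model):
    """Return (best model name, average RMSE per model).

    fitted_by_model maps model name -> {product: curve(t)}. The curves are
    assumed to be fitted already; parameter estimation is not shown here.
    """
    average = {}
    for name, curves in fitted_by_model.items():
        errors = []
        for product, observed in observed_by_product.items():
            predicted = [curves[product](t) for t in range(len(observed))]
            errors.append(rmse(observed, predicted))
        average[name] = sum(errors) / len(errors)
    return min(average, key=average.get), average

# Hypothetical cumulative sales for two made-up products (Bass-shaped).
observed_by_product = {
    "product_a": [bass(t, 100.0, 0.03, 0.40) for t in range(10)],
    "product_b": [bass(t, 250.0, 0.01, 0.55) for t in range(10)],
}
# Hypothetical "already fitted" curves for each model and product.
fitted_by_model = {
    "logistic": {
        "product_a": lambda t: logistic(t, 100.0, 30.0, 0.45),
        "product_b": lambda t: logistic(t, 250.0, 90.0, 0.60),
    },
    "gompertz": {
        "product_a": lambda t: gompertz(t, 100.0, 5.0, 0.35),
        "product_b": lambda t: gompertz(t, 250.0, 6.0, 0.40),
    },
    "bass": {
        "product_a": lambda t: bass(t, 100.0, 0.03, 0.40),
        "product_b": lambda t: bass(t, 250.0, 0.01, 0.55),
    },
}
best, average = select_best_model(observed_by_product, fitted_by_model)
print(best, average)
```

Averaging the RMSE over products before comparing models mirrors the "lowest average RMSE" criterion in the abstract. In practice each curve would first be fitted to the observed series, for example with a nonlinear least-squares routine.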

Applying the L-index for Analyzing the Density of Point Features (점사상 밀도 분석을 위한 L-지표의 적용)

  • Lee, Byoung-Kil
    • Spatial Information Research / v.16 no.2 / pp.237-247 / 2008
  • Statistical analysis of coordinate information is regarded as one of the major GIS functions, and one of the most fundamental analyses is the density analysis of point features. To analyze density appropriately, determining the search radius (kernel radius) is critically important. In this study, the L-index, which previous research has found useful for choosing the kernel radius, is used to estimate the radius for density analysis of various point features, and the behavior of the L-index is studied based on the estimated results. The results show that the L-index is not suitable for determining the search radius for point features that are evenly distributed with small clusters, because the pattern of the L-index depends on the size of the study area. For point features with a small number of highly clustered areas, however, the L-index is suitable, because its pattern is not affected by the size of the study area.
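As a rough illustration, and assuming the L-index here refers to Ripley's L function (the abstract does not define it), a naive estimate without edge correction could look like the sketch below. The function name `ripley_l`, the sample points, and the study area are hypothetical.

```python
import math

def ripley_l(points, area, radius):
    """Naive Ripley's L estimate at one radius (no edge correction).

    K(r) = area / (n * (n - 1)) * (number of ordered pairs within r)
    L(r) = sqrt(K(r) / pi)
    """
    n = len(points)
    close_pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= radius
    )
    k = area * close_pairs / (n * (n - 1))
    return math.sqrt(k / math.pi)

# Hypothetical point features: four corners of a unit square
# inside an assumed 2 x 2 study area.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for r in (0.5, 1.0, 1.5):
    l_value = ripley_l(points, area=4.0, radius=r)
    # L(r) - r above zero suggests clustering at scale r, below zero dispersion.
    print(r, round(l_value - r, 3))
```

The `area` term is where the size of the study area enters the estimate, which is relevant to the sensitivity the abstract reports.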

A Dynamic Clustering Mechanism Considering Energy Efficiency in the Wireless Sensor Network (무선 센서 네트워크에서 에너지 효율성을 고려한 동적 클러스터링 기법)

  • Kim, Hwan;Ahn, Sanghyun
    • KIPS Transactions on Computer and Communication Systems / v.2 no.5 / pp.199-202 / 2013
  • In clustering mechanisms for wireless sensor networks, the network lifetime is affected by how cluster heads are selected. One of the representative clustering mechanisms, the low-energy adaptive clustering hierarchy (LEACH), selects cluster heads periodically, resulting in high energy consumption for cluster reconstruction. In contrast, the adaptive clustering algorithm via waiting timer (ACAWT) proposes a non-periodic re-clustering mechanism that reconstructs clusters when the remaining energy level of a cluster head reaches a given threshold. In this paper, we propose a re-clustering mechanism that uses multiple remaining-energy levels and performs re-clustering when the remaining energy of a cluster head drops to the next lower level. In addition, when determining cluster heads, both the number of neighbor nodes and the remaining energy level are considered so that cluster heads are placed more evenly. Through simulations with the QualNet simulator, we show that the proposed mechanism outperforms ACAWT in terms of network lifetime.
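The level-based re-clustering trigger and the head-selection criteria can be sketched as below. This is only an illustration under stated assumptions: the abstract gives neither the number of energy levels nor how neighbor count and remaining energy are combined, so the four equal-width levels, the weighted-sum score, and all names and values are hypothetical.

```python
def energy_level(remaining, initial, n_levels=4):
    """Quantize remaining energy into levels 0 (lowest) .. n_levels - 1."""
    fraction = max(0.0, min(1.0, remaining / initial))
    return min(n_levels - 1, int(fraction * n_levels))

def needs_reclustering(level_at_election, remaining, initial, n_levels=4):
    """True once the head's energy has dropped below its level at election."""
    return energy_level(remaining, initial, n_levels) < level_at_election

def head_score(remaining, initial, n_neighbors, max_neighbors, weight=0.5):
    """Assumed weighted sum of energy fraction and normalized neighbor count."""
    energy_part = remaining / initial
    neighbor_part = n_neighbors / max_neighbors if max_neighbors else 0.0
    return weight * energy_part + (1.0 - weight) * neighbor_part

def select_head(nodes, initial):
    """Pick the node with the highest score; nodes maps id -> (energy, neighbors)."""
    max_neighbors = max(count for _, count in nodes.values())
    return max(
        nodes,
        key=lambda node_id: head_score(
            nodes[node_id][0], initial, nodes[node_id][1], max_neighbors
        ),
    )

# Hypothetical nodes: id -> (remaining energy in joules, neighbor count).
nodes = {"a": (0.9, 3), "b": (0.5, 5), "c": (0.9, 5)}
head = select_head(nodes, initial=1.0)
elected_level = energy_level(nodes[head][0], initial=1.0)
print(head, elected_level)
print(needs_reclustering(elected_level, remaining=0.8, initial=1.0))
print(needs_reclustering(elected_level, remaining=0.7, initial=1.0))
```

Compared with a single fixed threshold, a ladder of levels lets a head hand over its role several times during its lifetime, each time its energy falls by one step.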

An Indexing System for Retrieving Similar Paths in XML Documents (XML 문서의 유사 경로 검색을 위한 인덱싱 시스템)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.15D no.2 / pp.171-178 / 2008
  • Since the W3C introduced the XML standard in 1998, the number of documents written in XML has been gradually increasing. Accordingly, several systems have been developed to manage and retrieve massive collections of XML documents efficiently. BitCube, a bitmap indexing system, is a representative system in this field of research. Building on the bitmap indexing technique, the path bitmap indexing system (LH06), which clusters similar paths, addressed a problem that the existing BitCube system could not solve, namely determining similar paths. The path bitmap indexing system offers higher retrieval speed not only for exactly matched path searches but also for similar path searches. However, its similarity calculation algorithm has a few problems: it sometimes cannot calculate the similarity even when two paths are extremely similar, and it increases the number of meaningless clusters. In this paper, we propose a novel method that clusters paths based on the similarity between them in order to solve these problems. The proposed system yields stable clustering results and achieves higher clustering precision than LH06 in a performance evaluation.

An Exploratory Methodology for Longitudinal Data Analysis Using SOM Clustering (자기조직화지도 클러스터링을 이용한 종단자료의 탐색적 분석방법론)

  • Cho, Yeong Bin
    • Journal of Convergence for Information Technology / v.12 no.5 / pp.100-106 / 2022
  • A longitudinal study is a research method based on longitudinal data measured repeatedly on the same subjects. Most longitudinal analysis methods are suited to prediction or inference and are often not suitable for exploratory study. This study presents an exploratory method for analyzing longitudinal data: the data are clustered with the self-organizing map technique, the best number of clusters is determined, and the longitudinal trajectory of each cluster is then identified. The proposed methodology was applied to longitudinal data from the Employment Information Service, and a total of 2,610 samples were analyzed. Applying the methodology to the actual data yielded time-series clustering results for each panel. This indicates that it is more effective to cluster longitudinal data in advance and then perform multilevel longitudinal analysis.
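A minimal sketch of clustering trajectories with a one-dimensional self-organizing map is shown below. It is not the author's procedure: the deterministic initialization, the learning-rate and neighborhood schedules, the use of quantization error to compare candidate node counts, and the toy trajectories are all assumptions made for illustration.

```python
import math

def init_weights(data, n_nodes):
    """Spread initial node weights linearly between per-dimension min and max."""
    dims = len(data[0])
    lo = [min(row[d] for row in data) for d in range(dims)]
    hi = [max(row[d] for row in data) for d in range(dims)]
    if n_nodes == 1:
        return [[(a + b) / 2.0 for a, b in zip(lo, hi)]]
    return [
        [a + (b - a) * k / (n_nodes - 1) for a, b in zip(lo, hi)]
        for k in range(n_nodes)
    ]

def best_matching_unit(weights, sample):
    """Index of the node whose weight vector is closest to the sample."""
    dists = [sum((w - x) ** 2 for w, x in zip(node, sample)) for node in weights]
    return dists.index(min(dists))

def train_som(data, n_nodes, epochs=50, lr0=0.5, sigma0=0.5):
    """Train a 1-D SOM with linearly decaying learning rate and neighborhood."""
    weights = init_weights(data, n_nodes)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.05)
        for sample in data:
            bmu = best_matching_unit(weights, sample)
            for k, node in enumerate(weights):
                # Gaussian neighborhood over the distance between node indices.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(node)):
                    node[d] += lr * h * (sample[d] - node[d])
    return weights

def quantization_error(weights, data):
    """Mean Euclidean distance from each sample to its best-matching node."""
    total = 0.0
    for sample in data:
        node = weights[best_matching_unit(weights, sample)]
        total += math.sqrt(sum((w - x) ** 2 for w, x in zip(node, sample)))
    return total / len(data)

# Hypothetical trajectories: four repeated measurements per subject.
trajectories = [
    [1.0, 1.1, 0.9, 1.0],  # flat group
    [0.9, 1.0, 1.0, 1.1],
    [1.1, 0.9, 1.0, 0.9],
    [1.0, 2.0, 3.0, 4.0],  # rising group
    [1.1, 2.1, 2.9, 4.1],
    [0.9, 1.9, 3.1, 3.9],
]
weights = train_som(trajectories, n_nodes=2)
labels = [best_matching_unit(weights, row) for row in trajectories]
errors = {k: quantization_error(train_som(trajectories, k), trajectories)
          for k in (1, 2, 3)}
print(labels, errors)
```

Each trained node's weight vector can be read as the representative trajectory of its cluster. Comparing the quantization error across candidate node counts is one possible way to choose the number of clusters, and a cluster validity index would serve equally well.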

Analysis of Genetic Characteristics and Probability of Individual Discrimination in Korean Indigenous Chicken Brands by Microsatellite Marker (MS 마커를 이용한 토종닭 브랜드의 유전적 특성 및 개체 식별력 분석)

  • Suh, Sangwon;Cho, Chang-Yeon;Kim, Jae-Hwan;Choi, Seong-Bok;Kim, Young-Sin;Kim, Hyun;Seong, Hwan-Hoo;Lim, Hyun-Tae;Cho, Jae-Hyeon;Ko, Yeoung-Gyu
    • Journal of Animal Science and Technology / v.55 no.3 / pp.185-194 / 2013
  • Microsatellite (MS) markers have been a useful genetic tool for studying diversity, relationships, and individual discrimination in livestock. The level of genetic diversity and the relationships among two Korean indigenous chicken brand populations (Woorimatdag: WR, Hanhyup3: HH) and two pure populations (White Leghorn: WL, Rhode Island Red: RIR) were analyzed based on 26 MS markers. A total of 191 distinct alleles were observed across the four chicken populations, and 47 (24.6%) of these alleles were unique to a single population. The mean $H_{Exp}$ and PIC were estimated as 0.667 and 0.630, respectively. Nei's $D_A$ genetic distance and factorial correspondence analysis (FCA) showed that the four populations represented four distinct groups. However, the genetic distance between each Korean indigenous chicken brand (WR, HH) and the pure populations (WL, RIR) was threefold that between WR and HH. In the STRUCTURE analyses, the most appropriate number of clusters for modeling the data was determined to be three. The expected probabilities of identity among genotypes of random individuals (PI) were calculated as $1.17 \times 10^{-49}$ (all 26 markers) and $1.14 \times 10^{-15}$ and $7.33 \times 10^{-20}$ (the 9 and 12 markers with the highest PI values, respectively). The results indicate that a traceability system for brand chicken breeds employing the 9 to 12 markers with the highest PI values might be applicable to individual identification of Korean indigenous chicken brands.
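The per-marker statistics mentioned in this abstract can be computed from allele frequencies roughly as follows. This sketch assumes the standard formulas for expected heterozygosity, polymorphism information content (PIC), and the probability of identity for random unrelated individuals, with markers treated as independent when combined. The abstract does not state which estimators were used, and the allele frequencies below are hypothetical.

```python
from itertools import combinations
from math import prod

def expected_heterozygosity(freqs):
    """H_exp = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - sum(p * p for p in freqs) - cross

def probability_of_identity(freqs):
    """PI = sum(p_i^4) + sum over i<j of (2 * p_i * p_j)^2 for one marker."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2.0 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes

def combined_pi(loci):
    """Multi-marker PI as the product of per-marker PI (independence assumed)."""
    return prod(probability_of_identity(freqs) for freqs in loci)

# Hypothetical allele frequencies for two markers.
marker_1 = [0.5, 0.5]
marker_2 = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(marker_1), pic(marker_1))
print(probability_of_identity(marker_1), probability_of_identity(marker_2))
print(combined_pi([marker_1, marker_2]))
```

Because the combined PI is a product of values below one, it shrinks quickly as markers are added, which is why a subset of 9 to 12 informative markers can already give very small values.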