• Title/Summary/Keyword: Classification of Clusters


Hangul Component Decomposition in Outline Fonts (한글 외곽선 폰트의 자소 분할)

  • Koo, Sang-Ok;Jung, Soon-Ki
    • Journal of the Korea Computer Graphics Society
    • /
    • v.17 no.4
    • /
    • pp.11-21
    • /
    • 2011
  • This paper proposes a method for decomposing a Hangul glyph in an outline font into its initial, medial, and final components using statistical-structural information. Within a font family, the positions of components are statistically consistent, and the stroke relationships of a Hangul character reflect its structure. First, we create component histograms that accumulate the shapes and positions of identical components. Second, we form pixel clusters from the character image based on pixel direction probabilities and extract candidate strokes using the position, direction, and size of the clusters and the adjacencies between them. Finally, we find the best structural match between the candidate strokes and a predefined character model by relaxation labeling. The proposed method can be used to study the formative characteristics of Hangul fonts and to build font classification and retrieval systems.

Exploratory Study on the Quality Grade of Korea Black Raspberry Wines by Using Consumer Preference Data (시판 복분자주의 기호도 분석을 통한 탐색적 등급 분류)

  • Lee, Seung-Joo
    • Korean Journal of Food Science and Technology
    • /
    • v.46 no.3
    • /
    • pp.352-357
    • /
    • 2014
  • In this study, 100 consumers (50 men and 50 women, aged 20-50 years) rated their overall preference for 24 Korean raspberry wines on a 9-point hedonic scale. Analysis of variance was conducted to evaluate the effects of gender, age, and sample on the preference scores. Significant differences in overall preference were observed among the 24 samples; however, no interactions with age or gender group were noted. Cluster analysis was performed to group the samples based on the frequencies in the preference data. Three clusters were obtained, and they were well separated by the mean overall preference scores of the samples. Discriminant analysis based on the three clusters confirmed the same grouping of samples with 100% accuracy.

THEORETICAL STUDY ON OBSERVED COLOR-MAGNITUDE DIAGRAMS

  • Lee, See-Woo
    • Journal of The Korean Astronomical Society
    • /
    • v.12 no.1
    • /
    • pp.41-70
    • /
    • 1979
  • From $B\ddot{o}hm$-Vitense's atmospheric model calculations, the relations, [$T_e$, (B-V)] and [B.C, (B-V)] with respect to heavy element abundance were obtained. Using these relations and evolutionary model calculations of Rood, and Sweigart and Gross, analytic expressions for some physical parameters relating to the C-M diagrams of globular clusters were derived, and they were applied to 21 globular clusters with observed transition periods of RR Lyrae variables. More than 20 different parameters were examined for each globular cluster. The derived ranges of some basic parameters are as follows; $Y=0.21{\sim}0.33,\;Z=1.5{\times}10^{-4}{\sim}4.5{\times}10^{-3},\;age,\;t=9.5{\sim}19{\times}10^9$ years, mass for red giants, $m_{RG}=0.74m_{\odot}{\sim}0.91m_{\odot}$, mass for RR Lyrae stars, $m_{RR}=0.59m_{\odot}{\sim}0.75m_{\odot}$, the visual magnitude difference between the turnoff point and the horizontal branch (HB), ${\Delta}V_{to}=3.1{\sim}3.4(<{\Delta}V_{to}>=3.32)$, the color of the blue edge of RR Lyrae gap, $(B-V)_{BE}=0.17{\sim}0.21=(<(B-V)_{BE}>=0.18),\;[\frac{m}{L}]_{RR}=-1.7{\sim}-1.9$, mass difference of $m_{RR}$ relative to $m_{RG},(m_{RG}-m_{RR})/m_{RG}=0.0{\sim}0.39$. It was found that the ranges of derived parameters agree reasonably well with the observed ones and those estimated by others. Some important results obtained herein can be summarized as follows; (i) There are considerable variations in the initial helium abundance and in age of globular clusters. (ii) The radial gradient of heavy element abundance does exist for globular clusters as shown by Janes for field stars and open clusters. (iii) The helium abundance seems to have been increased with age by massive star evolution after a considerable amount (Y>0.2) of helium had been attained by the Big-Bang nucleosynthesis, but there is not seen a radial gradient of helium abundance. 
(iv) A considerable amount of heavy elements ($Z{\sim}10^{-3}$) might have been formed in the inner halo ($r_{GC}$<10 kpc) from the earliest galactic collapse, and then the heavy element abundance has been slowly enriched towards the galactic center and disk, establishing the radial gradient of heavy element abundance. (v) The final galactic disk formation might have taken longer than the halo formation by about half of the galactic age, supporting the slow, inhomogeneous collapse model of Larson. (vi) Of the three principal parameters controlling the morphology of C-M diagrams, it was found that the first parameter is heavy element abundance, the second age, and the third helium abundance. (vii) The globular clusters can be divided into three different groups, AI, BI and CII, according to Z, Y and age as well as Dickens' HB types. BI group clusters of HB types 4 and 5, like M 3 and NGC 7006, are the oldest and have the lowest helium abundance of the three groups. They also appear in the inner halo. On the other hand, the youngest AI clusters have the highest Z and Y, and appear in the innermost halo region and in the disk. (viii) From the clean separation of the clusters into three groups, a three-dimensional classification with three parameters, Z, Y and age, is presented. (ix) The anomalous C-M diagrams can be explained in terms of the three principal parameters. That is, the anomaly of NGC 362 and NGC 7006 is accounted for by an age smaller by the order of $1{\sim}2{\times}10^9$ years rather than by a helium abundance difference, compared with M 3. (x) The difference between the two Oosterhoff types I and II can be explained in terms of the mean mass difference of RR Lyrae variables rather than in terms of the helium abundance difference suggested by Stobie. The mean mass of the variables in Oosterhoff type I clusters is smaller by $0.074m_{\odot}$, which is exactly consistent with Rood's estimate.
Since it was found that the mean mass of RR Lyrae stars increases with decreasing Z, the two Oosterhoff types can be explained substantially by the metal abundance difference; the type II has Z<$3.4{\times}10^{-4}$, and the type I has higher Z than the type II.


Magnifying Block Diagonal Structure for Spectral Clustering (스펙트럼 군집화에서 블록 대각 형태의 유사도 행렬 구성)

  • Heo, Gyeong-Yong;Kim, Kwang-Baek;Woo, Young-Woon
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.9
    • /
    • pp.1302-1309
    • /
    • 2008
  • Traditional clustering methods such as k-means and fuzzy clustering are prototype-based and are applicable only to convex clusters. Spectral clustering, in contrast, finds clusters using only local similarity information. Its ability to handle concave clusters has made it popular in recent years, along with the support vector machine (SVM), a kernel-based classification method. As with SVM, however, the kernel width plays an important role and strongly affects the result. Although several methods have been proposed to choose it automatically, it is still determined heuristically. In this paper, we propose an adaptive method that decides the kernel width from a distance histogram. The method is motivated by the fact that the affinity matrix should be block diagonal to produce the best result. We use the traditional Euclidean distance together with the random walk distance, which makes it possible to form a more apparent block diagonal affinity matrix. Experimental results show that the proposed method produces an affinity matrix with a clearer block structure than the existing method does.

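
The sensitivity to kernel width described in the abstract above can be seen in a minimal sketch. It assumes a plain Gaussian kernel on Euclidean distance with a hand-picked `sigma`; the paper's histogram-based width selection and random walk distance are not reproduced here. The helper names `gaussian_affinity` and `off_block_mass`, and the toy data, are hypothetical.

```python
import math

def gaussian_affinity(points, sigma):
    """Affinity matrix A[i][j] = exp(-d(i, j)^2 / (2 * sigma^2)), Euclidean d."""
    n = len(points)
    return [[math.exp(-math.dist(points[i], points[j]) ** 2 / (2.0 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def off_block_mass(affinity, labels):
    """Fraction of off-diagonal affinity that links points with different labels.

    A value near 0 means the matrix is close to block diagonal when rows and
    columns are ordered by label.
    """
    total = cross = 0.0
    for i, row in enumerate(affinity):
        for j, a in enumerate(row):
            if i != j:
                total += a
                if labels[i] != labels[j]:
                    cross += a
    return cross / total

# Toy data (assumed): two tight groups, far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
labels = [0, 0, 0, 1, 1, 1]

narrow = off_block_mass(gaussian_affinity(points, 0.5), labels)   # small kernel width
wide = off_block_mass(gaussian_affinity(points, 10.0), labels)    # large kernel width
```

With the narrow kernel almost no affinity crosses the two groups, while the wide kernel spreads more than half of it across them, which is why the width has to be chosen with care.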

Improvement of the PFCM(Possibilistic Fuzzy C-Means) Clustering Method (PFCM 클러스터링 기법의 개선)

  • Heo, Gyeong-Yong;Choe, Se-Woon;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.1
    • /
    • pp.177-185
    • /
    • 2009
  • Cluster analysis, or clustering, is an unsupervised learning method in which a set of data points is divided into a given number of homogeneous groups. Fuzzy clustering, one of the most popular clustering approaches, allows a point to belong to all clusters with different degrees of membership, and so produces more intuitive and natural clusters than hard clustering does. Moreover, some fuzzy clustering variants are robust to noise. In this paper, we improve Possibilistic Fuzzy C-Means (PFCM), which generates a typicality matrix as well as a membership matrix, by using the Gath-Geva (GG) method. The proposed method focuses on the boundaries of clusters, unlike most other methods, which focus on cluster centers. The generated membership values are suitable for classification-type applications. Because the typicality values generated by the algorithm are distributed similarly to the values of a Gaussian density function, they are useful for Gaussian-type density estimation. Furthermore, the GG method can handle clusters with different numbers of data points, which the other well-known method, by Gustafson and Kessel, cannot. All of these points are evident in the experimental results.
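
To make "belonging to all clusters with different degrees" concrete, here is a minimal sketch of the membership update from standard fuzzy c-means (FCM). This is the textbook FCM rule, not the PFCM or Gath-Geva variant that the paper improves; the function name `fcm_memberships` and the sample values are assumptions for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster.

    u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)), where d_i is the Euclidean
    distance from the point to center i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with one or more centers: share full membership
        # equally among those centers to avoid dividing by zero.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

# A point at distance 1 from the first center and 3 from the second.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

The memberships sum to one, and the nearer center receives the larger share; a hard clustering would instead assign the point entirely to the first cluster.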

The Characteristics of Korean Family Law - A Comparison with EU-Countries in Regard to Regime Classification - (한국 가족법의 특수성 - EU 국가와의 비교를 통한 유형 구분 -)

  • Chung, Yun Tag
    • Korean Journal of Social Welfare Studies
    • /
    • v.41 no.4
    • /
    • pp.161-187
    • /
    • 2010
  • This study begins with two research interests. Firstly, there seems to be a break of research in the field of family policy in Korea which exists especially in regard to family law. Family law was originally the core of state interventions in family life, but has been neglected because of the lack of literature with comparative research methods. This shortcoming needs to be addressed. Secondly, through inquiry into the definition of family or family policy with the lens of the law, the definition of family or family policy can be correctly extended. With these two interests combined, this research tries to derive an analytical tool - maintenance community - of the law and compare some important points of the family law of Korea with those of 16 EU-countries in terms of regime classification. The method used is, firstly, to describe the subjects of family law with a focus on partnering and parenting without subjective interpretation, and secondly, to classify the countries' family-law regimes with the criteria of privacy and autonomy using cluster analysis. The results show that the countries can be classified into three clusters: Nordic (Norway and Sweden), West-Northern (Denmark, France, England, Finland, and Belgium) and Middle South (Italy, Spain, Austria, Portugal, Netherlands, Greece, Ireland, Germany, and Korea). This result can be compared to a precedent research result which showed that 21 OECD countries can be classified in three clusters according to family policy. The number of the clusters is the same as this study, but some countries belong to other clusters; for example Denmark and Finland belong to the Nordic cluster according to family policy, while they belong to the West-Northern according to family law, and Austria, Germany, and Ireland belong to the Middle-South cluster according to family law, while they belong to the Continental according to family policy. 
From this result we can interpret Korean family law to be in the middle range according to both criteria of privacy and autonomy like other South-European countries including some Continental countries. We can make some theoretical suggestions. The fact that both family law and family policy regimes in countries can be classified into three clusters can be interpreted to mean that there exists parallelism between family law and family policy in a broad sense. But from the fact that some countries belong to different clusters according to family law and family policy, we can say that the family policy in a country is not always consistent with family law.

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.1-19
    • /
    • 2018
  • Because forecasting stock movements is an important issue both academically and practically, stock price prediction has been studied actively. Stock price forecasting research is classified by whether it uses structured or unstructured data, and is further divided into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research on stock price prediction that incorporates big data is actively underway, and work based on large amounts of data mainly relies on machine learning techniques. In particular, methods that incorporate media effects have recently attracted attention, and studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies that predict stock prices from online news mostly perform sentiment analysis of the news, building a separate corpus for each company and a dictionary that predicts stock prices by recording responses to past stock prices. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. In addition, methods that consider influences among highly relevant companies have also been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These previous studies examine the effects of news from a homogeneous industrial sector on the individual company, with homogeneous industries classified according to the Global Industry Classification Standard. In other words, the existing studies assume that industries grouped under the Global Industry Classification Standard are homogeneous.
However, existing studies are limited in that they do not take into account influential companies with high relevance or reflect heterogeneity within the same Global Industry Classification Standard sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations, our study proposes a methodology that reflects the heterogeneous effects of the industrial sector on the stock price by applying k-means clustering. Multiple Kernel Learning, which is mainly used to integrate data with various characteristics, has several kernels, each of which receives different data and makes predictions from it. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from the financial news variables of the target firm and of the industrial groups divided by k-means cluster analysis. To validate the proposed methodology, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information from the industrial sectors related to the target company is also meaningful for predicting the target company's stock movements, and that the machine learning algorithm has better predictive power when the news of the relevant companies is considered together with the target company's news. (2) It is important to predict stock movements with a number of clusters that varies according to the level of homogeneity in the industrial sector. In other words, when stock prices are homogeneous within an industrial sector, it is important to use the relational effect at the level of the industry group without clustering, or with a small number of clusters.
When stock prices are heterogeneous within an industry group, it is important to cluster the firms into groups. This study contributes by showing that firms classified under the Global Industry Classification Standard are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the Global Industry Classification Standard. It also contributes by demonstrating the efficiency of a prediction model that reflects heterogeneity.
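
The clustering step named in the abstract above, k-means, can be sketched as follows. This is plain Lloyd-style k-means with a deterministic initialization, shown only to illustrate how firms in one sector might be split into groups; the feature vectors are made-up placeholders, and the paper's actual features and its Multiple Kernel Learning stage are not reproduced. The function name `kmeans` is hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = [points[i] for i in range(k)]  # assumption: first k points as initial centers
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        assignment = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return assignment, centers

# Hypothetical per-firm feature vectors for firms in a single sector.
firms = [(0.1, 0.2), (5.0, 5.1), (0.2, 0.1), (5.1, 4.9)]
assignment, centers = kmeans(firms, k=2)
```

Firms that share a cluster label would then be treated as one relevant group, and the number of clusters `k` is the knob the abstract suggests varying with the homogeneity of the sector.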

Classification of Magnetic Resonance Imagery Using Deterministic Relaxation of Neural Network (신경망의 결정론적 이완에 의한 자기공명영상 분류)

  • 전준철;민경필;권수일
    • Investigative Magnetic Resonance Imaging
    • /
    • v.6 no.2
    • /
    • pp.137-146
    • /
    • 2002
  • Purpose: This paper introduces an improved classification approach that adopts a deterministic relaxation method and an agglomerative clustering technique for classifying MRI with a neural network. The proposed approach addresses two problems that arise when a neural network is used for image classification: convergence to local optima and the computational burden caused by a large number of input patterns. Materials and methods: Hopfield neural networks have been applied to various optimization problems. However, a major problem in mapping an image classification problem onto a neural network is that the network is apt to converge to local optima, and convergence toward the global solution with standard stochastic relaxation takes a long time. Therefore, to avoid local solutions and achieve fast convergence toward a global optimum, we apply mean field annealing (MFA) to a Hopfield network during classification. MFA replaces the stochastic nature of simulated annealing with a set of deterministic update rules that act on the average values of the variables. By minimizing averages, it is possible to converge to an equilibrium state considerably faster than with standard simulated annealing. Moreover, the proposed agglomerative clustering algorithm, which determines the underlying clusters of the image, provides the initial input values of the Hopfield neural network. Results: The proposed approach, which uses agglomerative clustering and deterministic relaxation, resolves the problem of local optima and achieves fast convergence toward a global optimum when a neural network is used for MRI classification. Conclusion: In this paper, we introduce a new paradigm that classifies MRI using cluster analysis and deterministic relaxation of a neural network to improve the classification results.


Design of Fuzzy System with Hierarchical Classifying Structures and its Application to Time Series Prediction (계층적 분류구조의 퍼지시스템 설계 및 시계열 예측 응용)

  • Bang, Young-Keun;Lee, Chul-Heui
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.5
    • /
    • pp.595-602
    • /
    • 2009
  • Fuzzy rules, which represent the behavior of a fuzzy system, are sensitive to the fuzzy clustering technique used to derive them. If the classification ability of the clustering technique is improved, the system can serve its purpose more accurately, because the clustering technique enhances the capabilities of the fuzzy rules and parameters. This paper therefore proposes a new hierarchically structured clustering algorithm that enhances classification ability. The proposed technique consists of two clusters based on the correlation and statistical characteristics of the data, which allows it to perform classification more accurately. In addition, this paper uses difference data sets to reflect the patterns and regularities of the original data clearly, and constructs multiple fuzzy systems to account for the various characteristics of the differences. To verify the effectiveness of the proposed techniques, this paper applies the constructed fuzzy systems to time series prediction and performs prediction on nonlinear time series examples.

Analysis of Land-cover Types Using Multistage Hierarchical Clustering Image Classification (다단계 계층군집 영상분류법을 이용한 토지 피복 분석)

  • 이상훈
    • Korean Journal of Remote Sensing
    • /
    • v.19 no.2
    • /
    • pp.135-147
    • /
    • 2003
  • This study used multistage hierarchical clustering image classification to analyze satellite images for the land-cover types of an area in the Korean peninsula. The multistage algorithm consists of two stages. The first stage performs region-growing segmentation by employing a hierarchical clustering procedure with the restriction that pixels in a cluster must be spatially contiguous, so that the whole image space is finally segmented into sub-regions in which adjacent regions have different physical properties. The second stage clusters the segments resulting from the first stage without spatial constraints on merging. Hierarchical clustering image classification, which merges two small groups into one larger group step by step based on the hierarchical structure of digital imagery, generates a hierarchical tree of the relations between the classified regions. The experimental results show that the hierarchical tree carries detailed information on the hierarchical structure of land use, and that more detailed spectral information is required for the correct analysis of land-cover types.
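
The two-stage procedure described in the abstract above can be sketched on a toy one-dimensional "image". This is a simplified illustration under stated assumptions: pixels lie on a single row, the dissimilarity between two groups is the absolute difference of their mean intensities, and stage 1 stops at a fixed threshold. None of these choices is given in the abstract, and the names `stage1_segments` and `stage2_classes` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def stage1_segments(pixels, threshold):
    """Stage 1: contiguity-constrained merging; only neighbouring segments may merge."""
    segments = [[i] for i in range(len(pixels))]  # each segment is a list of pixel indices
    while len(segments) > 1:
        # Dissimilarity between each pair of adjacent segments.
        gaps = [abs(mean([pixels[i] for i in segments[s]]) -
                    mean([pixels[i] for i in segments[s + 1]]))
                for s in range(len(segments) - 1)]
        s = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[s] > threshold:
            break  # remaining neighbours are all too different to merge
        segments[s:s + 2] = [segments[s] + segments[s + 1]]
    return segments

def stage2_classes(pixels, segments, n_classes):
    """Stage 2: merge the closest pair of segments, adjacent or not, until n_classes remain."""
    groups = [list(seg) for seg in segments]
    while len(groups) > n_classes:
        means = [mean([pixels[i] for i in g]) for g in groups]
        a, b = min(((x, y) for x in range(len(groups)) for y in range(x + 1, len(groups))),
                   key=lambda pair: abs(means[pair[0]] - means[pair[1]]))
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

pixels = [10, 11, 50, 52, 12, 10, 51]           # toy intensities along one row
segments = stage1_segments(pixels, threshold=5)  # spatially contiguous regions
classes = stage2_classes(pixels, segments, n_classes=2)
```

Stage 1 yields four contiguous regions, and stage 2 then joins the two dark regions and the two bright regions even though they are not neighbours, which is the role of the unconstrained second stage.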