• Title/Summary/Keyword: Density-independent clustering

Improvement on Density-Independent Clustering Method (밀도에 무관한 클러스터링 기법의 개선)

  • Kim, Seong-Hoon;Heo, Gyeongyong
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.5 / pp.967-973 / 2017
  • Clustering is one of the best-known unsupervised learning methods; it partitions data into homogeneous groups. It has been used in a wide range of applications, and Fuzzy C-Means (FCM) is one of the representative methods. In FCM, however, cluster centers tend to lean toward high-density areas, because the Euclidean distance measure makes high-density clusters contribute more to the clustering result. A density-independent clustering method was proposed previously, in which cluster centers are kept from lying close to each other, relieving this center deviation problem. That method has a limitation: it is difficult to specify the position of the cluster centers. This paper proposes an enhanced density-independent clustering method with an additional term that places cluster centers around dense regions. According to the experimental results, the proposed method converges closer to the real centers than FCM and the earlier density-independent clustering method.
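
For orientation, the FCM baseline that this abstract compares against can be sketched in a few lines. This is plain FCM on 1-D data, not the paper's density-independent variant, since the abstract does not give the additional objective terms. The function name `fcm_1d`, the evenly spaced initial centers, and the toy data (one dense and one sparse group) are assumptions made for illustration.

```python
def fcm_1d(xs, c=2, m=2.0, iters=50):
    """Plain Fuzzy C-Means on 1-D data (c >= 2). Returns (centers, memberships)."""
    lo, hi = min(xs), max(xs)
    # Assumed initialisation: centers evenly spaced over the data range.
    centers = [lo + (hi - lo) * k / (c - 1) for k in range(c)]
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1)).
        u = []
        for x in xs:
            d = [max(abs(x - v), 1e-12) for v in centers]  # avoid division by zero
            u.append([1.0 / sum((d[k] / d[j]) ** power for j in range(c))
                      for k in range(c)])
        # Center update: mean of the data weighted by u_ik^m.
        for k in range(c):
            w = [row[k] ** m for row in u]
            centers[k] = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
    return centers, u


# Hypothetical toy data: 200 points in [-1, 1] (dense), 10 points in [5, 7] (sparse).
dense = [-1 + 2 * i / 199 for i in range(200)]
sparse = [5 + 2 * i / 9 for i in range(10)]
centers, u = fcm_1d(dense + sparse)
print(centers)
```

Because every point contributes to every center through its membership weight, a group with many more points exerts more total pull on all centers; this is the density effect the abstract describes. Widening the imbalance or narrowing the gap between the groups in the toy data makes the effect easier to observe.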

Improved Density-Independent Fuzzy Clustering Using Regularization (레귤러라이제이션 기반 개선된 밀도 무관 퍼지 클러스터링)

  • Han, Soowhan;Heo, Gyeongyong
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.1 / pp.1-7 / 2020
  • Fuzzy clustering, represented by FCM (Fuzzy C-Means), is a simple and efficient clustering method. However, the objective function in FCM makes each cluster affect the clustering result in proportion to its density, so density differences between clusters can distort the result. One method to alleviate this density problem is EDI-FCM (Extended Density-Independent FCM), which adds terms to the objective function of FCM to compensate for the density difference. This paper proposes Regularized EDI-FCM, an enhanced EDI-FCM that uses regularization. Regularization is commonly used to make a solution space smooth and an algorithm insensitive to noise. In clustering, regularization can reduce the effect of a high-density cluster on the clustering result. According to the experimental results, the proposed method converges quickly and accurately to the real centers when compared with FCM and EDI-FCM.

Study on mapping of dark matter clustering from real space to redshift space

  • Zheng, Yi;Song, Yong-Seon
    • The Bulletin of The Korean Astronomical Society / v.41 no.1 / pp.38.2-38.2 / 2016
  • The mapping of dark matter clustering from real to redshift space introduces anisotropy into the density power spectrum measured in redshift space, known as the Redshift Space Distortion (hereafter RSD) effect. The mapping formula is intrinsically non-linear, and it is complicated by the higher-order polynomials arising from the indefinite cross-correlations between the density and velocity fields, and by the Finger-of-God (hereafter FoG) effect arising from the randomness of the peculiar velocity field. Furthermore, a rigorous test of this mapping formula is contaminated by the unknown non-linearity of the density and velocity fields, including their auto- and cross-correlations, for which our theoretical calculation breaks down beyond some scales. Whilst the full set of higher-order polynomials remains unknown, the other systematics can be controlled consistently within the same order of truncation in the expansion of the mapping formula, as shown in this paper. The systematic due to the unknown non-linear density and velocity fields is removed by separately measuring all terms in the expansion using simulations. The uncertainty caused by the velocity randomness is controlled by splitting the FoG term into two pieces: 1) the non-local FoG term, which is independent of the separation vector between two different points, and 2) the local FoG term, which appears as indefinite polynomials expanded to the same order as all other perturbative polynomials. Using 100 realizations of simulations, we find that the best-fitting non-local FoG function is Gaussian, with only one scale-independent free parameter, and that our new mapping formulation accurately reproduces the observed power spectrum in redshift space up to k ~ 0.3 h/Mpc, the smallest scales reached so far, considering the resolution of future experiments.

ESTIMATION OF THE POWER PEAKING FACTOR IN A NUCLEAR REACTOR USING SUPPORT VECTOR MACHINES AND UNCERTAINTY ANALYSIS

  • Bae, In-Ho;Na, Man-Gyun;Lee, Yoon-Joon;Park, Goon-Cherl
    • Nuclear Engineering and Technology / v.41 no.9 / pp.1181-1190 / 2009
  • Knowledge of the Local Power Density (LPD) at the hottest part of a nuclear reactor core is more important than knowledge of the LPD at any other position. The LPD at the hottest part needs to be estimated accurately in order to prevent the fuel rods from melting. Support Vector Machines (SVMs) have been applied successfully to classification and regression problems. In this paper, therefore, the power peaking factor, which is defined as the ratio of the highest LPD to the average power density in a reactor core, was estimated by SVMs that use numerous measured signals of the reactor coolant system. The SVM models were developed using a training data set and validated with an independent test data set. The uncertainty of the SVM models was analyzed using 100 sampled training data sets and verification data sets. The prediction intervals were very small, which means that the predicted values were very accurate. The models were then applied to the first fuel cycle of the Yonggwang Nuclear Power Plant Unit 3. The root mean squared error was approximately 0.15%, which is accurate enough for use in LPD monitoring and for core protection that uses LPD estimation.
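
The definition of the power peaking factor given above can be written out directly. The sketch below assumes the core is divided into equal-volume nodes, so that the core-average power density is a simple mean; the function name and the sample values are hypothetical, and the SVM estimation step of the paper is not reproduced here.

```python
def power_peaking_factor(lpd):
    """Ratio of the highest local power density to the core-average value.

    Assumes equal-volume nodes, so the core average is an unweighted mean.
    """
    if not lpd:
        raise ValueError("at least one local power density value is required")
    average = sum(lpd) / len(lpd)
    return max(lpd) / average


# Hypothetical local power densities in arbitrary units.
ppf = power_peaking_factor([1.0, 1.5, 2.0, 1.5])
print(round(ppf, 4))
```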

Social Network Analysis for the Effective Adoption of Recommender Systems (추천시스템의 효과적 도입을 위한 소셜네트워크 분석)

  • Park, Jong-Hak;Cho, Yoon-Ho
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.305-316 / 2011
  • A recommender system uses automated information filtering technology to recommend products or services to customers who are likely to be interested in them. Such systems are widely used by Web retailers such as Amazon.com, Netflix.com, and CDNow.com. Various recommender systems have been developed. Among them, Collaborative Filtering (CF) is known as the most successful and commonly used approach. CF identifies customers whose tastes are similar to those of a given customer, and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. However, the relative performance of CF algorithms is known to depend on the domain and the data. Implementing and launching a CF recommender system is time-consuming and expensive, and a system unsuited to the given domain gives customers poor-quality recommendations that easily annoy them. Predicting in advance whether the performance of a CF recommender system will be acceptable is therefore of practical importance. In this study, we propose a decision-making guideline that helps decide whether CF is suitable for a given application with certain transaction data characteristics. Several previous studies reported that sparsity, gray sheep, cold-start, coverage, and serendipity could affect the performance of CF, but theoretical and empirical justification of these factors is lacking. Recently, many studies have paid attention to Social Network Analysis (SNA) as a method for analyzing social relationships among people. SNA measures and visualizes linkage structure and status, focusing on the interaction among objects within a communication group. CF analyzes the similarity among the previous ratings or purchases of each customer, finds the relationships among customers who have similarities, and then uses those relationships for recommendations. 
Thus CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. Under the assumption that SNA can facilitate an exploration of the topological properties of the network structure that are implicit in transaction data for CF recommendations, we focus on density, clustering coefficient, and centralization, which are among the most commonly used measures of the topological properties of a social network structure. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. We explore how these SNA measures affect CF performance and how they interact with each other. Our experiments used sales transaction data from H department store, one of the well-known department stores in Korea. A total of 396 data sets were sampled to construct various types of social networks. The process of measuring the independent variables consists of three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used UCINET 6.0 for SNA. The experiments used a 3-way ANOVA with the three SNA measures as independent variables and the recommendation accuracy, measured by the F1-measure, as the dependent variable. The experiments show that 1) each of the three SNA measures affects the recommendation accuracy, 2) the effect of density on performance overrides those of the clustering coefficient and centralization (i.e., CF adoption is not a good decision if the density is low), and 3) even when the density is low, the performance of CF is comparatively good when the clustering coefficient is low. 
We expect these experimental results to help firms decide whether a CF recommender system is suitable for their business domain, given its transaction data characteristics.
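
The three network measures named in this abstract can be illustrated on a small undirected graph. The sketch below is not the UCINET procedure used in the study; it assumes the mean of the local clustering coefficients as the overall clustering coefficient and Freeman's degree centralization as the centralization measure. The function name `sna_measures` and the example graphs are hypothetical.

```python
from itertools import combinations


def sna_measures(n, edges):
    """Density, mean local clustering coefficient and Freeman degree
    centralization of an undirected graph with nodes 0..n-1 (n >= 3)."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)

    # Density: existing links as a proportion of the n(n-1)/2 possible links.
    n_edges = sum(len(nb) for nb in adj.values()) // 2
    density = 2 * n_edges / (n * (n - 1))

    # Local clustering: share of a node's neighbour pairs that are linked.
    local = []
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            local.append(0.0)  # convention: 0 for nodes with fewer than 2 neighbours
            continue
        closed = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        local.append(2 * closed / (k * (k - 1)))
    clustering = sum(local) / n

    # Degree centralization: 1 for a star, 0 when all degrees are equal.
    degrees = [len(nb) for nb in adj.values()]
    top = max(degrees)
    centralization = sum(top - d for d in degrees) / ((n - 1) * (n - 2))
    return density, clustering, centralization


star = sna_measures(4, [(0, 1), (0, 2), (0, 3)])
triangle = sna_measures(3, [(0, 1), (1, 2), (0, 2)])
print(star)      # (0.5, 0.0, 1.0)
print(triangle)  # (1.0, 1.0, 0.0)
```

The star and the triangle sit at opposite ends of the scale: the star is fully centralized with no local clustering, while the triangle is fully dense and clustered with no centralization.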

Skin Pigmentation Detection Using Projection Transformed Block Coefficient (투영 변환 블록 계수를 이용한 피부 색소 침착 검출)

  • Liu, Yang;Lee, Suk-Hwan;Kwon, Seong-Geun;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.16 no.9 / pp.1044-1056 / 2013
  • This paper presents an approach for detecting and measuring human skin pigmentation. In the proposed scheme, we extract a skin area using a skin color model based on GMM-EM clustering, which is estimated from a statistical analysis of training images, and remove small noise through morphological processing. The skin area is decomposed into two components, hemoglobin and melanin, by an independent component analysis (ICA) algorithm. We then calculate the intensities of hemoglobin and melanin using the projection transformed block coefficient and determine whether skin pigmentation exists according to the global and local distributions of the two intensities. Furthermore, we measure the area and density of the detected skin pigmentation. Experimental results verified that our scheme can both detect skin pigmentation and measure its quantity, and that it takes less time because of the location histogram.

A Study on the Influence of Commercial Facility Diversity on the Formation of Consumption Centre: Application of Spatial Regression Models (상업시설의 다양성이 소비중심지 형성에 미치는 영향에 관한 연구: 공간회귀모형의 적용)

  • Sul-Hee Kim;Heung-Soon Kim
    • Land and Housing Review / v.15 no.1 / pp.57-75 / 2024
  • To create dynamic and bustling urban environments, a diverse array of commercial facilities is indispensable. These facilities are recognised as pivotal in attracting and accommodating a larger floating population, thereby suggesting that a greater diversity of commercial establishments fosters heightened consumer expenditure. With this premise, our study endeavours to explore the influence of commercial facility diversity on the Consumer Centre Index. Focused on the temporal context of 2021 and the spatial context of Seoul, our analysis utilizes the Consumer Centre Index, derived from Kernel Density analysis, as the dependent variable. Independent variables encompass factors reflecting commercial attributes and urban characteristics. Employing spatial regression analysis at the administrative district level, we discern that the clustering of similar industries exerts a more pronounced positive effect on consumer activation compared to the clustering of disparate industries. Additionally, the findings underscore the importance of concentrating industries that bolster consumer activation. Anticipated outcomes of this study include insights beneficial for optimizing commercial facility location policies within the consumer market.

Application of Bioinformatics for the Functional Genomics Analysis of Prostate Cancer Therapy

  • Mousses, Spyro
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.74-82 / 2000
  • Prostate cancer initially regresses in response to androgen depletion therapy, but most human prostate cancers eventually recur and re-grow as androgen-independent tumors. Once these tumors become hormone refractory, they are usually incurable and lead to the death of the patient. Little is known about the molecular details of how prostate cancer cells regress following androgen ablation, or about which genes are involved in the androgen-independent growth that follows the development of resistance to therapy. Such knowledge would reveal putative drug targets useful in rational therapeutic design to prevent therapy resistance and control androgen-independent growth. The application of genome-scale technologies has permitted new insights into the molecular mechanisms associated with these processes. Specifically, we have applied functional genomics using high-density cDNA microarray analysis for parallel gene expression analysis of prostate cancer in an experimental xenograft system during androgen withdrawal therapy and following therapy resistance. The large amount of expression data generated posed a formidable bioinformatics challenge. A novel template-based gene clustering algorithm was developed and applied to the data to discover the genes that respond to androgen ablation. The data show restoration of expression of androgen-dependent genes and other signaling genes in the recurrent tumors. Together, the discovered genes appear to be involved in prostate cancer cell growth and therapy resistance in this system. We have also developed and applied tissue microarray (TMA) technology for high-throughput molecular analysis of hundreds to thousands of clinical specimens simultaneously. TMA analysis was used for rapid clinical translation of candidate genes discovered by cDNA microarray analysis, to determine their clinical utility as diagnostic, prognostic, and therapeutic targets. 
Finally, we have developed a bioinformatic approach to combine pharmacogenomic data on the efficacy and specificity of various drugs to target the discovered prostate cancer growth associated candidate genes in an attempt to improve current therapeutics.

Effect of Tempering Treatment on Mechanical Properties of Ausformed Martensite in Fe-30% Ni-0.35%C Alloy (Fe-30%Ni-0.35%C 합금에서 Ausformed Martensite의 기계적 성질에 미치는 Tempering처리의 영향)

  • Lee, E.K.;Lee, K.B.;Kim, H.S.
    • Journal of the Korean Society for Heat Treatment / v.7 no.1 / pp.44-52 / 1994
  • To investigate the effect of tempering treatment on the mechanical properties of ausformed martensite in Fe-30%Ni-0.35%C alloy, the hardness, yield strength and elongation were examined by tensile testing. 1. The strength of deformed austenite in Fe-30%Ni-0.35%C alloy increased owing to work hardening caused by the increase in dislocation density during deformation. The strength of ausformed martensite increased because of defects inherited from the deformed austenite through the martensitic transformation. 2. The ductility of ausformed martensite remained nearly constant regardless of the degree of deformation, because of the interaction of multiple factors such as increased retained austenite, void formation and a decrease in twins in the ausformed martensite. 3. The strength of ausformed martensite decreased only slightly on tempering up to $340^{\circ}C$, and the resistance to softening was especially marked at higher degrees of deformation. 4. Virgin martensite and ausformed martensite both showed a maximum yield strength, attributed to clustering, on tempering at $100^{\circ}C$; above $100^{\circ}C$ the yield strength decreased very slightly because solute carbon decreased as the clusters broke up. 5. In ausformed martensite, retained austenite did not decompose on tempering up to $450^{\circ}C$, and the matrix softened rapidly on tempering above $400^{\circ}C$ because of the decomposition of martensite and the formation of reversed austenite.

Association between hemoglobin glycation index and cardiometabolic risk factors in Korean pediatric nondiabetic population

  • Lee, Bora;Heo, You Jung;Lee, Young Ah;Lee, Jieun;Kim, Jae Hyun;Lee, Seong Yong;Shin, Choong Ho;Yang, Sei Won
    • Annals of Pediatric Endocrinology and Metabolism / v.23 no.4 / pp.196-203 / 2018
  • Purpose: The hemoglobin glycation index (HGI) represents the degree of nonenzymatic glycation and has been positively associated with cardiometabolic risk factors (CMRFs) and cardiovascular disease in adults. This study aimed to investigate the association between HGI, components of metabolic syndrome (MS), and alanine aminotransferase (ALT) in a pediatric nondiabetic population. Methods: Data from 3,885 subjects aged 10-18 years from the Korea National Health and Nutrition Examination Survey (2011-2016) were included. HGI was defined as measured glycated hemoglobin (HbA$_{1c}$) minus predicted HbA$_{1c}$. Participants were divided into 3 groups according to HGI tertile. Components of MS (abdominal obesity, fasting glucose, triglycerides, high-density lipoprotein cholesterol, and blood pressure), and the proportions of MS, CMRF clustering (${\geq}2$ MS components), and elevated ALT were compared among the groups. Results: Body mass index (BMI) z-score, obesity, total cholesterol, ALT, abdominal obesity, elevated triglycerides, and CMRF clustering showed increasing trends from the lower to the higher HGI tertiles. Multiple logistic regression analysis showed that the upper HGI tertile was associated with elevated triglycerides (odds ratio, 1.65; 95% confidence interval, 1.18-2.30). Multiple linear regression analysis showed that HGI level was significantly associated with BMI z-score, HbA$_{1c}$, triglycerides, and ALT. When stratified by sex, age group, and BMI category, overweight/obese subjects showed linear HGI trends for the presence of CMRF clustering and ALT elevation. Conclusion: HGI was associated with CMRFs in a Korean pediatric population. High HGI might be an independent risk factor for CMRF clustering and ALT elevation in overweight/obese youth. Further studies are required to establish the clinical relevance of HGI for cardiometabolic health in youth.
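
The HGI definition and the tertile grouping described in the Methods can be sketched as follows. The abstract does not state how the predicted HbA1c is obtained; this sketch assumes the common approach of a simple linear regression of HbA1c on fasting glucose within the study population. The function names and the sample values are hypothetical.

```python
def hgi_values(glucose, hba1c):
    """HGI = measured HbA1c minus HbA1c predicted from fasting glucose.

    Assumption: the prediction is an ordinary least-squares line fitted
    to the same population.
    """
    n = len(glucose)
    mean_x = sum(glucose) / n
    mean_y = sum(hba1c) / n
    sxx = sum((x - mean_x) ** 2 for x in glucose)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(glucose, hba1c))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(glucose, hba1c)]


def tertile_groups(values):
    """Assign each value to group 1 (lowest third), 2 or 3 (highest third)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, i in enumerate(order):
        groups[i] = 1 + (3 * rank) // len(values)
    return groups


# Hypothetical fasting glucose (mg/dL) and measured HbA1c (%) for six subjects.
glucose = [85, 90, 95, 100, 88, 92]
hba1c = [5.2, 5.4, 5.3, 5.7, 5.5, 5.1]
hgi = hgi_values(glucose, hba1c)
groups = tertile_groups(hgi)
print(groups)
```

Because HGI is a regression residual, it averages to zero across the population; a positive HGI means a subject's measured HbA1c is higher than their fasting glucose alone would predict.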