• Title/Summary/Keyword: index clustering

Community Analysis of Urban Forest around city of Seoul (서울시 근교에 위치한 도시숲 군집구조 분석)

  • Ro, Yu-Mi;Kang, Heejun;Lee, Sang-don
    • Korean Journal of Environment and Ecology / v.29 no.4 / pp.599-604 / 2015
  • This study conducted a vegetation clustering analysis of three mountains, Mt. Bulam, Mt. Daemo, and Mt. Bonghwa, which are urban forests of Seoul. The analysis found that the vegetation of Mt. Bulam clustered into pine (Pinus densiflora)-Mongolian oak (Quercus mongolica), hornbeam (Carpinus laxiflora)-pitch pine (P. rigida), and oriental oak (Q. variabilis)-wild pear tree (Sorbus alnifolia) communities, while that of Mt. Daemo clustered into pitch pine-Japanese larch (Larix leptolepis) and poplar (Populus tomentiglandulosa)-black birch (Betula davurica Pall.) communities. The vegetation of Mt. Bonghwa clustered into pine-wild pear tree and sawtooth oak (Q. acutissima)-cherry blossom (Prunus serrulata) communities. Among the three regions, the similarity index between Mt. Bulam and Mt. Daemo was as high as 0.634, suggesting a distribution of similar vegetation, and the dominance index of the Mt. Daemo region was 0.166, which suggests the dominance of many species compared to other regions. In addition, the species diversity results showed that Mt. Daemo had the highest stability, and the species diversity, maximum species diversity, and evenness indices were highest in Mt. Bulam, followed by Mt. Bonghwa and Mt. Daemo. The dominance index was lowest in Mt. Bulam, followed by Mt. Bonghwa and Mt. Daemo.
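
The abstract reports similarity, species diversity, maximum species diversity, evenness, and dominance indices without stating their formulas. The sketch below shows one common set of definitions and should be read as an assumption, not as the authors' method: Shannon diversity H', maximum diversity ln S, Pielou evenness H'/ln S, dominance as 1 minus evenness, and the Sørensen coefficient for similarity. All function names are hypothetical.

```python
import math

def shannon_diversity(counts):
    """Shannon diversity H' from per-species individual counts (assumed formula)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def max_diversity(counts):
    """Maximum diversity ln(S), where S is the number of species present."""
    return math.log(sum(1 for c in counts if c > 0))

def evenness(counts):
    """Pielou evenness H' / ln(S); undefined when only one species is present."""
    return shannon_diversity(counts) / max_diversity(counts)

def dominance(counts):
    """Dominance taken here as 1 - evenness (an assumption)."""
    return 1.0 - evenness(counts)

def sorensen_similarity(species_a, species_b):
    """Sorensen coefficient 2c / (a + b) on two species lists."""
    a, b = set(species_a), set(species_b)
    return 2 * len(a & b) / (len(a) + len(b))
```

Under these definitions a site whose individuals are spread evenly across species has high evenness and low dominance, which matches the inverse ordering of the two indices reported in the abstract.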

2D-THI: Two-Dimensional Type Hierarchy Index for XML Databases (2D-THI: XML 데이테베이스를 위한 이차원 타입상속 계층색인)

  • Lee Jong-Hak
    • Journal of Korea Multimedia Society / v.9 no.3 / pp.265-278 / 2006
  • This paper presents a two-dimensional type inheritance hierarchy index (2D-THI) for XML databases. XML Schema is one of the schema models for XML documents that supports type inheritance. Conventional indexing techniques for XML databases cannot support XML queries on type inheritance hierarchies. We construct a two-dimensional index structure using multidimensional file organizations to support type inheritance hierarchies in XML queries. This indexing technique deals with the problem of clustering index entries in a two-dimensional domain space that consists of a key element domain and a type identifier domain, based on the user query pattern. The index enhances query performance by adjusting the degree of clustering between the two domains. For performance evaluation, we compared the proposed 2D-THI, through a cost model, with conventional class hierarchy indexing techniques for object-oriented databases such as the CH-index and the CG-tree. The evaluation verified that the proposed two-dimensional type inheritance indexing technique can efficiently support query processing in XML databases according to the query type.

Clustering of Metabolic Risk Factors and Its Related Risk Factors in Young Schoolchildren (초등학교 저학년 어린이에서의 대사위험요인 군집의 분포와 관련 위험요인)

  • Kong, Kyoung-Ae;Park, Bo-Hyun;Min, Jung-Won;Hong, Ju-Hee;Hong, Young-Sun;Lee, Bo-Eun;Chang, Nam-Soo;Lee, Sun-Hwa;Ha, Eun-Hee;Park, Hye-Sook
    • Journal of Preventive Medicine and Public Health / v.39 no.3 / pp.235-242 / 2006
  • Objectives: We aimed to determine the distribution of clustering of metabolic risk factors and to evaluate the related factors in young schoolchildren. Methods: A cross-sectional study of metabolic syndrome was conducted in an elementary school in Seoul, Korea. We evaluated fasting glucose, triglyceride, HDL cholesterol, blood pressure, and body mass index, and we used parent-reported questionnaires to assess the potential risk factors in 261 children (136 boys, 125 girls). We defined the metabolic risk factors as obesity or being at risk for obesity ($\geq$ 85th percentile for age and gender), a systolic or diastolic blood pressure at the $\geq$ 90th percentile for age and gender, fasting glucose $\geq 110$ mg/dl, triglyceride $\geq 110$ mg/dl, and HDL cholesterol $\leq 40$ mg/dl. Results: Clustering of two or more metabolic risk factors was found in 15.7% of the subjects, of three or more in 2.3%, and of four or more in 0.8%. A multivariate analysis revealed that a father smoking more than 20 cigarettes per day, a mother with a body mass index of $\geq 25$ kg/m$^2$, and the child eating precooked or frozen food more than once per day were associated with clustering of two or more components, with odds ratios of 3.61 (95% CI=1.24-10.48), 5.50 (95% CI=1.39-21.73) and 8.04 (95% CI=1.67-38.81), respectively. Conclusions: This study shows that clustering of metabolic risk factors is present in young schoolchildren in Korea and is associated with parental smoking and obesity as well as the child's eating behavior. These results suggest that evaluation of metabolic risk factors and intervention for lifestyle factors may be needed in both young Korean children and their parents.
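
The risk-factor definitions in the Methods can be read as five threshold checks per child, with "clustering" meaning that several checks are positive at once. The following is a minimal sketch of that counting step only. The `Child` record and its field names are hypothetical, and the percentile values are assumed to have been looked up beforehand from age- and gender-specific reference tables.

```python
from dataclasses import dataclass

@dataclass
class Child:
    # Hypothetical record; percentiles are for age and gender.
    bmi_percentile: float
    sbp_percentile: float
    dbp_percentile: float
    fasting_glucose: float  # mg/dl
    triglyceride: float     # mg/dl
    hdl: float              # mg/dl

def count_risk_factors(child):
    """Count how many of the five abstract-defined risk factors are present."""
    flags = [
        child.bmi_percentile >= 85,                              # obesity or at risk
        child.sbp_percentile >= 90 or child.dbp_percentile >= 90,  # high blood pressure
        child.fasting_glucose >= 110,
        child.triglyceride >= 110,
        child.hdl <= 40,
    ]
    return sum(flags)

def clustering_prevalence(children, min_factors):
    """Percentage of children with at least min_factors risk factors."""
    hits = sum(1 for c in children if count_risk_factors(c) >= min_factors)
    return 100.0 * hits / len(children)
```

Calling `clustering_prevalence(children, 2)` on a cohort gives the kind of figure the Results report for two or more factors.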

A Dimensionality Assessment for Polytomously Scored Items Using DETECT

  • Kim, Hae-Rim
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.597-603 / 2000
  • DETECT, a versatile dimensionality assessment index, was developed for binary item response data by Kim (1994). The present paper extends the use of DETECT to polytomously scored item data. A simulation study shows that DETECT performs well in differentiating multidimensional data from unidimensional data, yielding a greater DETECT value in the multidimensional case. Further investigation is needed into dimensionally meaningful clustering methods, such as HAC for binary data, which are particularly sensitive to polytomous data.

Document Clustering using Generic Algorithm and Cluster Measurement (클러스터 측정과 유전자 알고리즘을 이용한 문서 클러스터링)

  • Choi, Lim Cheon;Park, Soon Cheol
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.490-493 / 2010
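  • This paper proposes a document clustering algorithm that uses cluster measurement and a genetic algorithm. Document clustering is implemented by mapping the elements of the genetic algorithm onto the clustering problem and using a cluster measurement as the fitness function. For performance evaluation, we used the data sets of the Hankook Ilbo-20000/Hankook Ilbo-40075 (한국일보-20000/한국일보-40075) document categorization test collections. The evaluation shows that the AS Index performs better than the DB Index and the RS Index. The proposed algorithm also shows consistently better performance than the K-means clustering algorithm.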
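
As an illustration of using a cluster measurement as a fitness function, the sketch below computes a Davies-Bouldin index over a candidate partition and turns it into a value a genetic algorithm could maximize. It assumes that the abstract's "DB Index" means the Davies-Bouldin index. The `fitness` mapping and all function names are hypothetical, and the AS and RS indices are not shown.

```python
import math

def centroid(points):
    """Mean vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def scatter(points, center):
    """Average Euclidean distance from the cluster's points to its centroid."""
    return sum(math.dist(p, center) for p in points) / len(points)

def davies_bouldin(clusters):
    """Davies-Bouldin index; lower means more compact, better separated clusters.

    Requires at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    scatters = [scatter(c, m) for c, m in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k

def fitness(clusters):
    """Hypothetical fitness: larger is better, so invert the index."""
    return 1.0 / (1.0 + davies_bouldin(clusters))
```

In a genetic algorithm each chromosome would encode one assignment of documents to clusters, and `fitness` would be evaluated on the document vectors grouped by that assignment.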

A Study On Optimization Of Fuzzy-Neural Network Using Clustering Method And Genetic Algorithm (클러스터링 기법 및 유전자 알고리즘을 이용한 퍼지 뉴럴 네트워크 모델의 최적화에 관한 연구)

  • Park, Chun-Seong;Yoon, Ki-Chan;Park, Byoung-Jun;Oh, Sung-Kwun
    • Proceedings of the KIEE Conference / 1998.07b / pp.566-568 / 1998
  • In this paper, we suggest an optimal design method for Fuzzy-Neural Network (FNN) models of complex and nonlinear systems. FNNs have a structure that fuses fuzzy inference using linguistic variables with neural networks. The network structure uses simplified inference as the fuzzy inference system and the BP algorithm as the learning procedure. We use a clustering algorithm to find the initial parameters of the membership functions. Parameters such as the membership functions, learning rates, and momentum coefficients are adjusted using genetic algorithms. In addition, a performance index with a weighting value is introduced to achieve a meaningful balance between the approximation and generalization abilities of the model. To evaluate the performance index, we use time series data for a gas furnace and a sewage treatment process.

A Study on Intellectual Structure of Library and Information Science in Korea (문헌정보학의 지식 구조에 관한 연구)

  • Yoo, Yeong-Jun
    • Journal of the Korean Society for Information Management / v.20 no.3 / pp.277-297 / 2003
  • This study was conducted on the premise that index terms display the intellectual structure of a specific subject field. An attempt was made to grasp the intellectual structure of Library and Information Science by clustering the index terms assigned at the National Assembly Library to the journals of the related academic societies: the Journal of the Korean Society for Information Management, the Journal of the Korean Library and Information Science Society, and the Journal of the Korean Society for Library and Information Science. Index term clusters were generated based on the linkage of the index terms and their frequency of co-occurrence. In addition, a time-period analysis was conducted along with a study of first-appearing terms in order to clarify the trend and development process of Library and Information Science. The study also analysed the difference between the two intellectual structures by comparing the structure generated by the index term clusters with the existing structure of traditional classification systems.
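
The abstract says clusters were built from the linkage of index terms and their co-occurrence frequency, without giving the procedure. One simple reading, offered only as an assumption, is to count how often two terms are assigned to the same article and then group terms connected by sufficiently frequent pairs. The function names and the `min_count` threshold below are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(articles):
    """Count, for each pair of terms, the number of articles indexed with both."""
    counts = Counter()
    for terms in articles:
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts

def term_clusters(articles, min_count=2):
    """Group terms linked by pairs that co-occur at least min_count times.

    Uses union-find to take connected components of the thresholded
    co-occurrence graph (a single-link style grouping, assumed here).
    """
    parent = {}

    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (a, b), n in cooccurrence_counts(articles).items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for t in list(parent):
        groups.setdefault(find(t), set()).add(t)
    return sorted(sorted(g) for g in groups.values())
```

Terms that never reach the threshold with any partner are left out of the result, so raising `min_count` yields fewer and tighter clusters.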

A Neuro-Fuzzy Modeling using the Hierarchical Clustering and Gaussian Mixture Model (계층적 클러스터링과 Gaussian Mixture Model을 이용한 뉴로-퍼지 모델링)

  • Kim, Sung-Suk;Kwak, Keun-Chang;Ryu, Jeong-Woong;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.5 / pp.512-519 / 2003
  • In this paper, we propose a neuro-fuzzy modeling method that improves performance using hierarchical clustering and a Gaussian Mixture Model (GMM). The hierarchical clustering algorithm produces unique parameters for the given data because it does not use an objective function to perform the clustering. After optimizing the obtained parameters using the GMM, we apply them as the initial parameters of an Adaptive Network-based Fuzzy Inference System. Here, the number of fuzzy rules equals the number of clusters. In this way, we can improve the performance index and reduce the number of rules simultaneously. The proposed method is verified by applying it to neuro-fuzzy modeling of Box-Jenkins's gas furnace data and Sugeno's nonlinear system, where it yields better results than previous ones.

k-Bitmap Clustering Method for XML Data based on Relational DBMS (관계형 DBMS 기반의 XML 데이터를 위한 k-비트맵 클러스터링 기법)

  • Lee, Bum-Suk;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.16D no.6 / pp.845-850 / 2009
  • The use of XML data has increased with the growth of the Web 2.0 environment. XML's advantages are recognized through its use as the base technology of RSS and ATOM for transferring information from blogs and news feeds. Bitmap clustering is a method that keeps an index in main memory on top of a relational DBMS, and it performed better than other XML indexing methods in evaluation. The existing method, however, generates too many clusters, which degrades search quality. This paper proposes a k-Bitmap clustering method that generates a user-defined number k of clusters to solve this problem. The proposed method also keeps an additional inverted index for searching terms excluded from the representative bits of the k-Bitmap. Our evaluation shows that users can control the number of clusters. Our method also has high recall in single-term search, and by keeping two indices it guarantees that the search result includes all documents related to the query.

Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.111-125 / 2011
  • Clustering is a process of grouping similar or relevant documents into a cluster and assigning a meaningful concept to the cluster. By this process, clustering facilitates fast and correct search for the relevant documents by narrowing down the range of searching only to the collection of documents belonging to related clusters. For effective clustering, techniques are required for identifying similar documents and grouping them into a cluster, and discovering a concept that is most relevant to the cluster. One of the problems often appearing in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and also could not validate the semantic hierarchical relationship between a complex concept and each of simple concepts. In order to solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm that modified the traditional Agglomerative Hierarchical Clustering algorithm to allow overlapped clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not by a tree but by a lattice to detect complex concepts. We developed a system that employs the HOC algorithm to carry out the goal of complex concept detection. This system operates in three phases; 1) the preprocessing of documents, 2) the clustering using the HOC algorithm, and 3) the validation of semantic hierarchical relationships among the concepts in the lattice obtained as a result of clustering. The preprocessing phase represents the documents as x-y coordinate values in a 2-dimensional space by considering the weights of terms appearing in the documents. 
First, it goes through a refinement process that applies stopword removal and stemming to extract index terms. Then, each index term is assigned a TF-IDF weight, and the x-y coordinate value of each document is determined by combining the TF-IDF values of the terms in it. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated using the Euclidean distance. Initially, a cluster is generated for each document by grouping the documents that are closest to it. Then, the distance between any two clusters is measured, and the closest clusters are grouped into a new cluster. This process is repeated until the root cluster is generated. In the validation phase, feature selection is applied to validate the appropriateness of the cluster concepts built by the HOC algorithm and to see whether they have meaningful hierarchical relationships. Feature selection is a method of extracting key features from a document by identifying important and representative terms in the document and assigning weight values to them. In order to select key features correctly, a method is needed to determine how each term contributes to the class of the document. Among several methods for achieving this goal, this paper adopted the $\chi^2$ statistic, which measures the degree of dependency of a term t on a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations was carried out using the well-known Reuters-21578 news collection. The results showed that the HOC algorithm contributes greatly to detecting and producing complex concepts by generating the concept hierarchy in a lattice structure.
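
The chi-square statistic used in the validation phase can be illustrated with the standard 2x2 contingency form common in text categorization. This is a sketch of that general formula, not the paper's implementation. The helper names and the `(terms, label)` document layout are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square dependency between a term t and a class c.

    a: documents in the class that contain t
    b: documents outside the class that contain t
    c: documents in the class that do not contain t
    d: documents outside the class that do not contain t
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: term or class is constant across documents
    return n * (a * d - c * b) ** 2 / denom

def term_class_chi_square(docs, term, label):
    """Build the 2x2 table from (set_of_terms, class_label) pairs and score it."""
    a = sum(1 for terms, lab in docs if term in terms and lab == label)
    b = sum(1 for terms, lab in docs if term in terms and lab != label)
    c = sum(1 for terms, lab in docs if term not in terms and lab == label)
    d = sum(1 for terms, lab in docs if term not in terms and lab != label)
    return chi_square(a, b, c, d)
```

A value of zero means the term and the class are independent in the sample, and larger values indicate stronger dependency, so terms can be ranked per class by this score.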