• Title/Summary/Keyword: cluster sets

Ammonia half-saturation constants of sludge with different community compositions of ammonia-oxidizing bacteria

  • Kayee, Pantip;Rongsayamanont, Chaiwat;Kunapongkiti, Pattaraporn;Limpiyakorn, Tawan
    • Environmental Engineering Research / v.21 no.2 / pp.140-144 / 2016
  • Owing to the kinetic differences in ammonia oxidation among ammonia-oxidizing microorganisms (AOM), there is no standard set of kinetic values that can be used as a representative set for nitrifying wastewater treatment plant (WWTP) design. This study therefore clarified the link between the half-saturation constants for ammonia oxidation (Ks) and the dominant ammonia-oxidizing bacterial (AOB) groups in sludge from full-scale WWTPs and laboratory-scale nitrifying reactors. Quantitative polymerase chain reaction analyses revealed that AOB affiliated with the Nitrosomonas oligotropha cluster were the dominant AOM group in the sludge taken from the low-ammonia-level WWTPs, while AOB associated with the Nitrosomonas europaea cluster comprised the majority of AOM in the sludge taken from the high-ammonia-level WWTPs and nitrifying reactors. A respirometric assay demonstrated that the ammonia Ks values for the high-ammonia-level WWTPs and nitrifying reactors were higher than those of the low-ammonia-level plants. Comparison with the Ks values of available AOM cultures indicated that the Ks values of the analyzed sludge were mainly influenced by the dominant AOB species. These findings imply that different sets of kinetic values may be required for WWTPs with different dominant AOM species for more accurate WWTP design and operation.
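
For orientation, a half-saturation constant is conventionally defined through Monod-type kinetics. The abstract does not state which rate law the authors fitted, so the form below is an assumption; the symbols r (ammonia oxidation rate), r_max (maximum rate), and S (ammonia concentration) are introduced here for illustration only.

```latex
\[
  r = r_{\max}\,\frac{S}{K_s + S},
  \qquad
  r\big|_{S = K_s} = \frac{r_{\max}}{2}
\]
```

Under this form, Ks is the ammonia concentration at which the rate reaches half its maximum, so a lower Ks means a higher affinity for ammonia. That reading is consistent with the reported result that sludge from low-ammonia-level plants showed lower Ks values.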

Improved TI-FCM Clustering Algorithm in Big Data (빅데이터에서 개선된 TI-FCM 클러스터링 알고리즘)

  • Lee, Kwang-Kyug
    • Journal of IKEEE / v.23 no.2 / pp.419-424 / 2019
  • The FCM algorithm finds an optimal solution through an iterative optimization technique. Its execution time varies with the initial cluster centers, the location of noise, and the location and number of dense regions. However, because the method updates the center points gradually, the initial cluster centers can be shifted to one side. In this paper, we propose a TI-FCM (Triangular Inequality-Fuzzy C-Means) clustering algorithm that determines the cluster center density by maximizing the distance between clusters using the triangle inequality. The proposed method converges to the real clusters more effectively than FCM, even on large data sets. Experiments show that execution time is reduced compared with the existing FCM.
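
The iterative optimization that the abstract refers to can be sketched as follows. This is the standard baseline FCM only, not the proposed TI-FCM: the abstract does not describe the triangle-inequality step, so it is not reproduced here. The function name `fcm`, its parameters, and the toy points are assumptions made for illustration.

```python
import math

def fcm(points, centers, m=2.0, iters=100, tol=1e-6):
    """Baseline fuzzy c-means on a list of equal-length tuples.

    centers: caller-supplied initial centers (the result depends on them).
    Returns (centers, u), where u[i][j] is the membership of point i in cluster j.
    """
    centers = [tuple(c) for c in centers]
    exp = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        u = []
        for p in points:
            d = [math.dist(p, c) for c in centers]
            if any(x == 0.0 for x in d):
                # Point coincides with a center: share full membership there.
                hits = [1.0 if x == 0.0 else 0.0 for x in d]
                u.append([h / sum(hits) for h in hits])
            else:
                u.append([1.0 / sum((dj / dk) ** exp for dk in d) for dj in d])
        # Center update: mean of the points weighted by u_ij^m
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[k] for wi, p in zip(w, points)) / total
                for k in range(len(points[0]))
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers no longer move
            break
    return centers, u

# Toy data: two well-separated groups of three points each (made up).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, u = fcm(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

Because the initial centers are passed in explicitly, the sketch makes the sensitivity to initialization mentioned in the abstract easy to try: a different `centers` argument can change how many iterations the loop needs.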

Assessing Density Functional Theories to Compute the OH Stretching Frequencies of Water Molecules in Condensed Phases (응축상 물 분자의 OH 수축 진동수 계산을 위한 전자밀도 범함수 비교)

  • Kiyoung, Jeon;Mino, Yang
    • Journal of the Korean Chemical Society / v.67 no.1 / pp.13-18 / 2023
  • We evaluate electron density functional theories for computing the 0-1 and 1-2 transition energies of the local OH stretching motion of water molecules in condensed phases. By examining thirteen density functionals and nine sets of basis functions, we found that the optimal combination, whose predicted transition energies correlate highly with those calculated by coupled cluster theory, CCSD(T), is the hybrid density functional developed by the Head-Gordon group, ωB97X(D)/6-31+G*.

An Approach of Scalable SHIF Ontology Reasoning using Spark Framework (Spark 프레임워크를 적용한 대용량 SHIF 온톨로지 추론 기법)

  • Kim, Je-Min;Park, Young-Tack
    • Journal of KIISE / v.42 no.10 / pp.1195-1206 / 2015
  • For the management of a knowledge system, systems that automatically infer and manage scalable knowledge are required. Most of these systems use ontologies to exchange knowledge between machines and infer new knowledge. Approaches that infer new knowledge from scalable ontologies are therefore needed. In this paper, we propose an approach to rule-based reasoning over scalable SHIF ontologies in the Spark framework, which works similarly to MapReduce in distributed memory on a cluster. To reason efficiently in distributed memory, we focus on three areas. First, we define a data structure for splitting scalable ontology triples into small sets according to each reasoning rule and loading these triple sets into distributed memory. Second, we define a rule execution order and iteration conditions based on the dependencies and correlations among the SHIF rules. Finally, we explain the operations adapted to execute the rules, which are based on reasoning algorithms. To evaluate the proposed methods, we run an experiment against WebPie, a representative cluster-based ontology reasoner, using the LUBM data set, a benchmark used to evaluate ontology inference and search speed. The results show that the proposed approach improves throughput by 28,400% (157k/sec) over WebPie (553/sec) on LUBM.

Automatic Clustering on Trained Self-organizing Feature Maps via Graph Cuts (그래프 컷을 이용한 학습된 자기 조직화 맵의 자동 군집화)

  • Park, An-Jin;Jung, Kee-Chul
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.572-587 / 2008
  • The self-organizing feature map (SOFM), an unsupervised neural network, is a very powerful tool for clustering and visualizing high-dimensional data sets. Although the SOFM has been applied to many engineering problems, it requires a post-processing step that clusters similar weights on the trained SOFM into one class, which is performed manually in many cases. Traditional clustering algorithms, such as k-means, do not yield satisfactory results on the trained SOFM, especially when clusters have arbitrary shapes. This paper proposes automatic clustering on the trained SOFM that can deal with arbitrary cluster shapes and is globally optimized by graph cuts. When graph cuts are used, the graph must have two additional vertices, called terminals, and the weights between the terminals and the vertices of the graph are generally set based on data obtained manually by users. The proposed method sets these weights automatically based on mode-seeking on a distance matrix. Experimental results demonstrated the effectiveness of the proposed method in texture segmentation: it improved precision rates compared with previous traditional clustering algorithms, as it can deal with arbitrary cluster shapes based on graph-theoretic clustering.

Statistical Analysis on the Quality of Surface Water in Jinhae Bay during Winter and Spring (동계와 춘계 진해만 표층수질에 대한 통계분석)

  • Kim, Dong-Seon;Choi, Hyun-Woo;Kim, Kyung-Hee;Jeong, Jin-Hyun;Baek, Seung-Ho;Kim, Yong-Ok
    • Ocean and Polar Research / v.33 no.3 / pp.291-301 / 2011
  • To investigate the major factors controlling variations in water quality, principal component analysis and cluster analysis were used to analyze data sets of 12 parameters measured at 23 sampling stations in Jinhae Bay during winter and spring. Principal component analysis extracted three major factors controlling variations in water quality in each season. In winter, the major factors were freshwater input, polluted material input, and biological activity, whereas in spring they were polluted material input, freshwater input, and suspended material input. The most distinct difference between the two seasons was that freshwater input was more important than polluted material input in winter, while polluted material input was more important than freshwater input in spring. Cluster analysis grouped the 23 sampling stations into four clusters in winter and five clusters in spring. In winter, the four clusters were A (station 5), B (stations 1, 2), C (station 4), and D (the remaining stations). In spring, the five clusters were A (station 5), B (station 1), C (station 3), D (station 6), and E (the remaining stations). Intensive management of the water quality of Masan and Hangam bays could improve the water quality of Jinhae Bay, since the polluted materials were mainly introduced into Jinhae Bay through Masan and Hangam bays.

Evaluation of Water Quality for the Han River Tributaries Using Multivariate Analysis (다변량 통계 분석기법을 이용한 한강수계 지천의 수질 평가)

  • Kim, Yo-Yong;Lee, Si-Jin
    • Journal of Korean Society of Environmental Engineers / v.33 no.7 / pp.501-510 / 2011
  • In this study, the water pollution sources of 14 major tributaries of the Han River and the water quality characteristics of each target stream were evaluated based on water quality data from January 2007 to December 2009 (14 data sets) using a statistical package, SPSS-17.0. Spatial cluster analysis of the streams produced 4 groups, with the type and density of pollution sources in the basins having the greatest impact on the grouping. Temporal cluster analysis, to which rainfall, temperature, and eutrophication contributed, produced 2 groups: summer to fall (July-Oct.) and winter to early summer (Nov.-June). Four factors were found to be responsible for the data structure, explaining 71-90% of the total variance of the data set depending on the stream; they included organic matter, nutrients, and bacterial contamination. Factor analysis showed that the main factors (water pollutants) changed with the season, with a different pattern for each stream. This study demonstrated that water quality evaluation of each stream produces useful outcomes when the factors and the pollution sources of the basin are evaluated together.

Load-Balancing Rendezvous Approach for Mobility-Enabled Adaptive Energy-Efficient Data Collection in WSNs

  • Zhang, Jian;Tang, Jian;Wang, Zhonghui;Wang, Feng;Yu, Gang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.3 / pp.1204-1227 / 2020
  • The tradeoff between energy conservation and traffic balancing is a dilemma in Wireless Sensor Networks (WSNs). By analyzing the intrinsic relationship between cluster properties and long-distance transmission energy consumption, we characterize three node sets of the cluster as a theoretical foundation for enhancing the performance of WSNs, and propose optimal solutions that introduce rendezvous and Mobile Elements (MEs) to optimize energy consumption and prolong the lifetime of WSNs. First, we use an approximate method based on the transmission distance from each node to an ME to select a suboptimal Rendezvous Point (RP) on the trajectory at which the ME collects data. Then, we define the data transmission routing sequence and model rendezvous planning for the cluster. To optimize energy consumption, we apply the economic principle called the Diminishing Marginal Utility Rule (DMUR) and create an energy utility function to develop an adaptive energy consumption optimization framework that achieves energy efficiency in data collection. Finally, the Rendezvous Transmission Algorithm (RTA) is proposed to achieve a better tradeoff between energy conservation and traffic balancing. Furthermore, through collaboration among multiple MEs, we design the Two-Orbit Back-Propagation Algorithm (TOBPA), which concurrently handles load imbalance to improve the efficiency of data collection. The simulation results show that our solutions can improve the energy efficiency of the whole network and reduce the energy consumption of sensor nodes, which in turn prolongs the lifetime of WSNs.

A Study on the Relationship between Skill and Competition Score Factors of KLPGA Players Using Canonical Correlation Biplot and Cluster Analysis (정준상관 행렬도와 군집분석을 응용한 KLPGA 선수의 기술과 경기성적요인에 대한 연관성 분석)

  • Choi, Tae-Hoon;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.429-439 / 2008
  • A canonical correlation biplot is a two-dimensional plot for graphically investigating the relationship between two sets of variables and the relationship between observations and variables in canonical correlation analysis. In general, a biplot is useful for giving a graphical description of the data. However, neither the general biplot nor the canonical correlation biplot gives a concise interpretation of the relationship between variables and observations when the number of observations is large. To overcome this problem, Choi and Kim (2008) recently suggested a method for interpreting the biplot analysis by applying K-means clustering analysis. In this study, we apply their method to investigate the relationship between skill and competition score factors of KLPGA players using the canonical correlation biplot and cluster analysis.
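
The idea of summarizing a crowded biplot by clustering its observation points can be sketched with plain K-means on two-dimensional coordinates. This is not the Choi and Kim (2008) procedure, whose details the abstract does not give. The function name `kmeans` and the coordinates below are made up for illustration and are not KLPGA data.

```python
import math

def kmeans(points, centers, iters=100):
    """Lloyd's K-means on tuples, starting from caller-supplied centers.

    Returns (centers, labels), where labels[i] is the cluster index of point i.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:  # assignments are stable, so stop
            break
        centers = new_centers
    return centers, labels

# Hypothetical biplot coordinates of six observations (illustrative only).
scores = [(-2.0, 0.1), (-1.8, -0.2), (-2.2, 0.0),
          (1.9, 0.3), (2.1, -0.1), (2.0, 0.2)]
centers, labels = kmeans(scores, centers=[scores[0], scores[3]])
print(labels)
```

Plotting only the cluster centers, or coloring the observations by `labels`, is one way a biplot with many observations could be made easier to read.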

A Feature Selection Method Based on Fuzzy Cluster Analysis (퍼지 클러스터 분석 기반 특징 선택 방법)

  • Rhee, Hyun-Sook
    • The KIPS Transactions:PartB / v.14B no.2 / pp.135-140 / 2007
  • Feature selection is a preprocessing technique commonly used on high-dimensional data. It studies how to select a subset or list of attributes that are used to construct models describing data. Feature selection methods attempt to explore the intrinsic properties of data by employing statistics or information theory. Recent developments have involved approaches such as correlation methods, dimensionality reduction, and mutual information techniques. Feature selection has become the focus of much research in application areas with massive and complex data sets. In this paper, we provide a feature selection method that considers data characteristics and generalization capability. It provides a computational approach to feature selection based on fuzzy cluster analysis of attribute values and its performance measures. We apply it to a system for classifying computer viruses and compare it with a heuristic method using the contrast concept. Experimental results show that the proposed approach can give a feature ranking, select the features, and improve the system performance.