• Title/Summary/Keyword: cluster sets

Development of novel microsatellite markers to analyze the genetic structure of dog populations in Taiwan

  • Lai, Fang-Yu;Lin, Yu-Chen;Ding, Shih-Torng;Chang, Chi-Sheng;Chao, Wi-Lin;Wang, Pei-Hwa
    • Animal Bioscience
    • /
    • v.35 no.9
    • /
    • pp.1314-1326
    • /
    • 2022
  • Objective: Alongside the rise of animal-protection awareness in Taiwan, the public has been paying more attention to dog genetic deficiencies due to inbreeding in the pet market. The goal of this study was to isolate novel microsatellite markers for monitoring the genetic structure of domestic dog populations in Taiwan. Methods: A total of 113 DNA samples from three dog breeds, beagles (BEs), bichons (BIs), and schnauzers (SCs), were used in subsequent polymorphic tests applying the 14 novel microsatellite markers that were isolated in this study. Results: The results showed that the high level of genetic diversity observed in these novel microsatellite markers provided strong discriminatory power. The estimated probability of identity (P(ID)) and the probability of identity among sibs (P(ID)sib) for the 14 novel microsatellite markers were $1.7\times10^{-12}$ and $1.6\times10^{-5}$, respectively. Furthermore, the power of exclusion for the 14 novel microsatellite markers was 99.98%. The neighbor-joining trees constructed among the three breeds indicated that the 14 sets of novel microsatellite markers were sufficient to correctly cluster the BEs, BIs, and SCs. The principal coordinate analysis plot showed that the dogs could be accurately separated by breed using these 14 loci; moreover, the beagles from different sources were also distinguished. The first, second, and third principal coordinates explained 44.15%, 26.35%, and 19.97% of the genetic variation, respectively. Conclusion: The results of this study could enable powerful monitoring of the genetic structure of domestic dog populations in Taiwan.
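
The abstract does not say how P(ID) and P(ID)sib were estimated. As a minimal sketch, assuming the commonly used per-locus estimators based on allele frequencies and independence between loci, the multi-locus values can be computed as below. The allele frequencies are invented for illustration and are not data from the study.

```python
from itertools import combinations
from math import prod


def p_id(freqs):
    """Per-locus probability that two unrelated individuals share a genotype."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def p_id_sib(freqs):
    """Per-locus probability that two full sibs share a genotype."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def multilocus(per_locus_values):
    """Combine loci by multiplication (assumes the loci are independent)."""
    return prod(per_locus_values)


# Hypothetical allele frequencies for two loci (illustration only).
loci = [[0.5, 0.5], [0.5, 0.5]]
overall_p_id = multilocus([p_id(f) for f in loci])
overall_p_id_sib = multilocus([p_id_sib(f) for f in loci])
```

Each added locus multiplies the overall value by a per-locus factor below 1. P(ID)sib falls more slowly than P(ID) because its per-locus value cannot drop below 0.25.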

Morphological characteristics, chemical and genetic diversity of kenaf (Hibiscus cannabinus L.) genotypes

  • Ryu, Jaihyunk;Kwon, Soon-Jae;Kim, Dong-Gun;Lee, Min-Kyu;Kim, Jung Min;Jo, Yeong Deuk;Kim, Sang Hoon;Jeong, Sang Wook;Kang, Kyung-Yun;Kim, Se Won;Kim, Jin-Baek;Kang, Si-Yong
    • Journal of Plant Biotechnology
    • /
    • v.44 no.4
    • /
    • pp.416-430
    • /
    • 2017
  • The kenaf plant is used widely as food and in traditional folk medicine. This study evaluated the morphological characteristics, functional compounds, and genetic diversity of 32 kenaf cultivars from a worldwide collection. We found significant differences among cultivars in the functional compounds of leaves, including levels of chlorogenic acid isomer (CAI), chlorogenic acid (CA), kaempferol glucosyl rhamnoside isomer (KGRI), kaempferol rhamnosyl xyloside (KRX), kaempferitrin (KAPT), and total phenols (TPC). The highest TPC, KAPT, CA, and KRX contents were observed in the C22 cultivars. A significant correlation was observed between flowering time and dry matter (DM) yield, seed yield, and four phenolic compounds (KGRI, KRX, CAI, and TPC) (P < 0.01). To assess genetic diversity, we used 80 simple sequence repeat (SSR) primer sets and identified 225 polymorphic loci in the kenaf cultivars. The polymorphism information content and genetic diversity values ranged from 0.11 to 0.79 and from 0.12 to 0.83, with average values of 0.39 and 0.43, respectively. Cluster analysis of the SSR markers showed that the kenaf genotypes could be clearly divided into three clusters based on flowering time. Correlation analysis of the 80 SSR markers against morphological, chemical, and growth traits found significant marker-trait correlations for 15 marker traits (corolla, vein, petal, leaf, stem color, leaf shape, and KGRI content). These results could be used for the selection of kenaf cultivars with improved yield and functional compounds.

Variable Selection for Multi-Purpose Multivariate Data Analysis (다목적 다변량 자료분석을 위한 변수선택)

  • Huh, Myung-Hoe;Lim, Yong-Bin;Lee, Yong-Goo
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.1
    • /
    • pp.141-149
    • /
    • 2008
  • Multivariate data with a very large number of variables are now analyzed frequently. In such data sets, virtually duplicated variables may coexist even though they are conceptually distinguishable. Duplicate variables may cause problems such as distortion of the principal axes in principal component analysis and factor analysis, and distortion of the distances between observations, i.e. the input for cluster analysis. In supervised learning or regression analysis, duplicated explanatory variables also often make fitted models unstable. Since real data analyses often serve multiple purposes, it is necessary to reduce the number of variables to a parsimonious level. The aim of this paper is to propose a practical algorithm for selecting a subset of variables from a given set of p input variables, using as the criterion the minimum trace of the partial variances of the unselected variables left unexplained by the selected variables. The usefulness of the proposed method is demonstrated in visualizing the relationship between selected and unselected variables, in building a predictive model with a very large number of independent variables, and in reducing the number of variables and purging/merging categories in categorical data.
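
The abstract names the criterion but not the search procedure. A minimal sketch, assuming a greedy forward search over a covariance matrix (an assumption, not necessarily the authors' algorithm): at each step, add the variable that most reduces the trace of the partial covariance of the remaining variables. The function name `select_variables` and the example matrix are hypothetical.

```python
def select_variables(cov, q):
    """Greedily pick q variables from a p x p covariance matrix.

    Returns the selected indices and the trace of the partial covariance
    left unexplained by the selected variables.
    """
    p = len(cov)
    resid = [row[:] for row in cov]  # partial covariance given the selected set
    selected = []
    for _ in range(q):
        best_j, best_gain = None, -1.0
        for j in range(p):
            if j in selected or resid[j][j] <= 1e-12:
                continue  # already chosen, or fully explained by earlier choices
            # Total variance removed across all variables if j is selected.
            gain = sum(resid[k][j] ** 2 for k in range(p)) / resid[j][j]
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break
        pivot = resid[best_j][best_j]
        column = [resid[k][best_j] for k in range(p)]
        # Partial out the chosen variable from every remaining covariance.
        for a in range(p):
            for b in range(p):
                resid[a][b] -= column[a] * column[b] / pivot
        selected.append(best_j)
    return selected, sum(resid[k][k] for k in range(p))


# Variables 0 and 1 are near-duplicates (correlation 0.99); variable 2 is unrelated.
cov = [[1.0, 0.99, 0.0],
       [0.99, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
selected, leftover = select_variables(cov, 2)
```

In this example the search keeps one of the two near-duplicate variables plus the unrelated one, and the leftover trace is the small variance of the dropped duplicate that the kept one does not explain.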

Near infrared spectroscopy for classification of apples using K-mean neural network algorism

  • Muramatsu, Masahiro;Takefuji, Yoshiyasu;Kawano, Sumio
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1131-1131
    • /
    • 2001
  • To develop a nondestructive technique for evaluating fruit quality, a K-mean algorithm is applied to near infrared (NIR) spectroscopy of apples. The K-mean algorithm is one of the neural network partition methods; its goal is to partition a set of objects O into K disjoint clusters, where K is assumed to be known a priori. The algorithm introduced by MacQueen draws an initial partition of the objects at random. It then computes the cluster centroids, assigns each object to the closest centroid, and iterates until a local minimum is reached. The advantage of using a neural network is that the spectra at wavelengths absorbed by chemical bonds, including C-H and O-H types, can be selected directly as input data. In conventional multiple regression approaches, the first wavelength is selected manually near the absorption wavelengths as the one showing a high correlation coefficient between the NIR second-derivative spectrum and the Brix value in a single regression. The second and following wavelengths are then selected statistically so that the calibration equation shows a high correlation; they are therefore chosen on statistical rather than NIR spectroscopic grounds. In this research, the spectra at six wavelengths (900, 904, 914, 990, 1000, and 1016 nm) are selected as input data for K-mean analysis. The 904 nm wavelength is selected because it shows the highest correlation coefficient and is regarded as the absorption wavelength. The others are selected because they show relatively high correlation coefficients and were identified by B. G. Osborne as absorption wavelengths of the chemical structures. The experiment was performed in two phases. In the first phase, reflectance was acquired using fiber optics. The reflectance was calculated by comparison with the near infrared energy reflected from a Teflon sphere used as a standard reference, and the second-derivative spectra were used for K-mean analysis. The samples were 67 intact apples of the Fuji cultivar grown in Aomori Prefecture, Japan. In the second phase, the Brix values were measured with a commercially available refractometer in order to evaluate the result of the K-mean approach. The result shows a partition of the spectral data of the 67 samples into eight clusters, and the apples are classified into samples with high Brix values and samples with low Brix values. Consequently, the K-mean analysis classified the apples on the basis of their Brix values.
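
The iteration described in the abstract (random initial partition, centroid computation, reassignment to the closest centroid, repeat until nothing changes) can be sketched as follows. This is a plain batch K-means in pure Python under stated assumptions, not the authors' implementation; the function names and the toy two-dimensional points are hypothetical stand-ins for the six-wavelength spectra.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids_of(points, labels, k, rng):
    """Mean of each cluster; an empty cluster is re-seeded with a random point."""
    dim = len(points[0])
    result = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            members = [points[rng.randrange(len(points))]]
        result.append([sum(m[d] for m in members) / len(members) for d in range(dim)])
    return result


def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    labels = [0] * len(points)
    for rank, idx in enumerate(order):
        labels[idx] = rank % k  # random initial partition with no empty cluster
    for _ in range(max_iter):
        centroids = centroids_of(points, labels, k, rng)
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable: a local minimum has been reached
        labels = new_labels
    return labels, centroids_of(points, labels, k, rng)


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
labels, centroids = kmeans(points, 2, seed=1)
```

The cluster numbers themselves are arbitrary; only the grouping is meaningful. K has to be supplied by the caller, in line with the abstract's assumption that K is known a priori.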

Phylogenetic Relationship of Ligularia Species Based on RAPD and ITS Sequences Analyses (RAPD 및 ITS 염기서열 분석을 이용한 곰취 속(Ligularia) 식물의 유연관계 분석)

  • Ahn, Soon-Young;Cho, Kwang-Soo;Yoo, Ki-Oug;Suh, Jong-Taek
    • Horticultural Science & Technology
    • /
    • v.28 no.4
    • /
    • pp.638-647
    • /
    • 2010
  • The genetic relationships among 5 species of Ligularia were investigated using RAPD (Randomly Amplified Polymorphic DNA) and ITS (Internal Transcribed Spacer) sequence analyses. In the RAPD analysis, 63 of 196 arbitrary primers showed polymorphism. The amplified fragments ranged from 0.2 to 1.6 kb in size. A dendrogram was constructed by the UPGMA clustering algorithm based on the genetic similarity of RAPD markers. A total of 16 accessions were classified into 5 major groups, each corresponding to a species, at a similarity coefficient of 0.77. In the ITS sequence analysis, ITS 1 varied in size from 248 to 256 bp, while ITS 2 varied from 220 to 222 bp. The 5.8S coding region was 164 bp in length. Forty-nine sites (10.2%) of the 478 nucleotides were variable, and the G+C content of the ITS region ranged from 49.4 to 53.5%. In the ITS tree, the five species of Ligularia were monophyletic, and L. taquetii was the first to branch within the clade. Ligularia intermedia formed a clade with L. fischeri var. spiciformis (BS=79), and L. stenocephala and L. fischeri also formed a clade. The two data sets were congruent, except for the position of L. fischeri var. spiciformis.

Pattern Recognition of the Herbal Drug, Magnoliae Flos According to their Essential Oil Components

  • Jeong, Eun-Sook;Choi, Kyu-Yeol;Kim, Sun-Chun;Son, In-Seop;Cho, Hwang-Eui;Ahn, Su-Youn;Woo, Mi-Hee;Hong, Jin-Tae;Moon, Dong-Cheul
    • Bulletin of the Korean Chemical Society
    • /
    • v.30 no.5
    • /
    • pp.1121-1126
    • /
    • 2009
  • This paper describes a pattern recognition method for Magnoliae flos based on gas chromatographic/mass spectrometric (GC/MS) analysis of its essential oil components. In Korea the botanical drug mainly comprises four magnolia species (M. denudata, M. biondii, M. kobus, and M. liliflora), although some other species are also traded as the drug. The volatile components, extracted by simultaneous distillation and extraction (SDE), were separated by GC/MS on a carbowax column (supelcowax 10; 30 m $\times$ 0.25 mm $\times$ 0.25 $\mu$m) using temperature programming. The variance in retention times for all peaks of interest was within RSD 2% for repeated analyses (n = 9). Of the 74 essential oil components identified from the magnolia species, approximately 10 major components ($\alpha$-pinene, $\beta$-pinene, sabinene, myrcene, d-limonene, eucalyptol (1,8-cineol), $\gamma$-terpinene, p-cymene, linalool, and $\alpha$-terpineol) were present in all four species. For statistical analysis, the original dataset was reduced to 13 variables by the Fisher criterion and factor analysis (FA). The essential oil patterns were processed by multivariate statistical analysis, including hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA). All samples were divided into four groups with three principal components by PCA and according to plant origin by HCA. Thirty-three samples (23 training sets and 10 test samples to be assessed) were correctly classified into the four groups predicted by PCA. This method would provide a practical strategy for assessing the authenticity or quality of the well-known herbal drug Magnoliae flos.

Syntaxonomy and Synecology of the Robinia pseudoacacia Forests (아까시나무림의 군락분류와 군락생태)

  • Cho, Kwang-Jin;Kim, Jong-Won
    • The Korean Journal of Ecology
    • /
    • v.28 no.1
    • /
    • pp.15-23
    • /
    • 2005
  • The black locust (Robinia pseudoacacia L.) forests were studied by a phytosociological approach. Particular attention was given to the vegetation classification, distribution pattern, and ecological flora of the syntaxa classified. A total of 38 releves were analyzed using the correlation coefficient, UPGMA as the clustering method, and Principal Coordinates Analysis for ordination. The ecological flora was analyzed by plant character sets such as scramblers, annual and biennial plants, forest elements, and the actual urbanization index. The analyzed data are based on a site-releve matrix with the relative net contribution degree (r-NCD) of species. A total of 77 families, 193 genera, and 323 species of vascular plants were recorded. Camellino-Robinietum pseudoacaciae ass. nov. and the Phragmites-Robinia pseudoacacia community were described. The main clusters and ordination separated four types: 1) urban type, 2) rural type, 3) riparian type, and 4) combined type. The Robinietum is defined as a representative unit of black locust afforestation, the Phragmites-Robinia community as a unit of the lentic zone in the river ecosystem, and Cameliino-Robinietum ailanthetosum altissimae as an urban forest type. The Robinietum was considered a perpetual community.

A Study on the Improvement Direction of Selection Evaluation Indicators for the Land Transport Technology Commercialization Support Project: Focusing on the Follow-up Project Linkage Plan (국토교통기술사업화지원사업 선정평가 지표 개선방안 연구: 후속사업 연계 방안을 중심으로)

  • Hyung-Wook Shim;Seok-Ki Cha;Seung-Hee Back
    • Journal of Industrial Convergence
    • /
    • v.20 no.12
    • /
    • pp.87-96
    • /
    • 2022
  • The Ministry of Land, Infrastructure and Transport has been promoting land transport technology commercialization in order to bring to market the technologies owned by small and medium-sized venture companies and to support the transfer and commercialization of public technologies. To improve the investment effect of subsequent new projects and to select excellent research institutes, it is necessary to establish a valid evaluation index system suited to the purpose of the project. The evaluation index system for subsequent new projects should be linked to the objectives and goals of the preceding project, and should be selected with existing evaluation indicators in mind so that research results are not interrupted. This paper therefore arranges the evaluation index system into multiple scenarios through hierarchical cluster analysis, using the evaluation result data of each evaluation committee for small and medium-sized venture companies participating in the land transport technology commercialization support project, and then analyzes each scenario with a structural equation model. In the scenario analysis, considering the measurement effect of each path representing a causal relationship between evaluation indicators and the effect of each evaluation index on the evaluation items, the scenario with the highest impact on the evaluation result was selected as the improvement plan.

Personalized Recommendation System for IPTV using Ontology and K-medoids (IPTV환경에서 온톨로지와 k-medoids기법을 이용한 개인화 시스템)

  • Yun, Byeong-Dae;Kim, Jong-Woo;Cho, Yong-Seok;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.147-161
    • /
    • 2010
  • With the recent convergence of broadcasting and communication, communication services have been joined to TV, and TV viewing has changed in many ways. IPTV (Internet Protocol Television) delivers information services, movie content, and broadcasts over the internet, combining live programs with VOD (video on demand). Because it runs over a communication network, it has become a new business area. It has also raised new technical issues, such as imaging technology for the service, networking technology that avoids video interruptions, and security technology to protect copyright. Through the IPTV network, users can watch the programs they want when they want. However, finding programs on IPTV is difficult, whether by menu or by search. The menu approach takes a long time to reach the desired program. The search approach fails when the title, genre, or names of actors are not known, and entering letters with a remote control is cumbersome. The bigger problem is that users are often not aware of the services available to them. To resolve the difficulty of selecting VOD services on IPTV, a personalized recommendation service is proposed, which improves user satisfaction and makes efficient use of the user's time. This paper addresses IPTV's shortcomings with a filtering and recommendation system that offers programs suited to each individual and so saves time. The proposed recommendation system collects TV program information, the user's preferred program genres and detailed genres, channels, watched programs, and viewing times from each individual's IPTV viewing records. To find similar programs, similarity is compared using an ontology for TV programs, because the comparison yields a distance between programs. The TV program ontology used here is extracted from TV-Anytime metadata, which represents semantic properties. The ontology also expresses content and features numerically. Vocabulary similarity is determined through WordNet: all the words describing a program are expanded into upper and lower classes to decide word similarity, and the similarity of the descriptive keywords is averaged. Using the resulting distance criterion, similar programs are grouped by the K-medoids partitioning method, which divides objects into groups with similar characteristics. The K-medoids method sets K representative objects. Each object is provisionally assigned to a cluster according to its distance from the representative objects. To divide the initial n objects into K groups, representative objects are selected provisionally and the optimal representatives are found through repeated trials. In this way similar programs are clustered. When programs are selected through cluster analysis, weights are applied to the recommendation as follows. When each group recommends programs, programs close to the representative object are recommended to the user. The distance is calculated with the same formula used to measure similarity, and it is the basic figure that determines the ranking of recommended programs. A weight is calculated from the number of programs in the watching list: the more programs, the higher the weight. This is defined as the cluster weight. With it, the TV programs that represent each group are selected and a final ranking of TV programs is determined. However, the group-representative TV programs include errors, so weights based on TV program viewing preference are added, and these determine the final ranks. Content is recommended to users on this basis. The proposed method was tested experimentally in a controlled environment. The experiment shows that the proposed method is superior to existing methods.
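
The K-medoids step in the abstract (pick K provisional representative objects, assign each object to the nearest one, then improve the representatives through repeated trials) can be sketched as a simple swap search over a precomputed distance matrix. This is a minimal illustration, not the paper's system: starting from the first K objects is an assumption, and the one-dimensional positions stand in for ontology-based program distances.

```python
def total_cost(dist, medoids):
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def k_medoids(dist, k):
    """Swap-based K-medoids on an n x n distance matrix."""
    n = len(dist)
    medoids = list(range(k))  # provisional representatives (assumption: first k)
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best:  # keep a swap only if it lowers the total cost
                    medoids, best, improved = trial, cost, True
    # Label each object with the index of its nearest medoid.
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return medoids, labels


# Hypothetical program positions; in the paper, distances come from the ontology.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
medoids, labels = k_medoids(dist, 2)
```

Unlike a centroid, each medoid is an actual object, which suits the recommendation setting: programs can be ranked by their distance to a real representative program.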

The aplication of fuzzy classification methods to spatial analysis (공간분석을 위한 퍼지분류의 이론적 배경과 적용에 관한 연구 - 경상남도 邑級以上 도시의 기능분류를 중심으로 -)

  • Jung, In-Chul
    • Journal of the Korean Geographical Society
    • /
    • v.30 no.3
    • /
    • pp.296-310
    • /
    • 1995
  • Classification of spatial units into meaningful sets is an important procedure in spatial analysis. It is crucial in characterizing and identifying spatial structures. But traditional classification methods such as cluster analysis require an exact database and impose a clear-cut boundary between classes. Scrutiny of realistic classification problems, however, reveals that the available information may be vague and that the boundary may be ambiguous. The weakness of conventional methods is that they fail to capture fuzzy data and the transition between classes. Fuzzy subset theory is useful for solving these problems. This paper aims to clarify the theoretical foundations of fuzzy spatial analysis and to identify the characteristics of fuzzy classification methods. It does so through a literature review and a case study of the urban classification of the Cities and Eups of Kyung-Nam Province. The main findings are summarized as follows: 1. Following Dubois and Prade, fuzzy information has an imprecise and/or uncertain evaluation. In geography, fuzzy information about spatial organization, perception of geographical space, and human behavior is frequent. But researchers tend to limit their work to numerical data processing and do not consider the spatial fringe. Fuzzy spatial analysis makes it possible to include the interface between groups in classification. 2. The fuzzy numerical taxonomic method was established by Deloche, Tranquis, Ponsard and Leung. Depending on the data and the method employed, the groups derived may be mutually exclusive or may overlap to a certain degree. A classification pattern can be derived for each degree of similarity/distance $\alpha$. By taking the values of $\alpha$ in ascending or descending order, a hierarchical classification is obtained. 3. Kyung-Nam Cities and Eups were classified by fuzzy discrete classification, fuzzy conjoint classification, and cluster analysis according to the ratio of the number of persons employed in industries. As a result, they were divided into several groups with homogeneous characteristics. Fuzzy discrete classification and cluster analysis give clear-cut boundaries, whereas fuzzy conjoint classification delimits the edges and cores of the urban classes. 4. The results of the different methods vary, but each method contributes to revealing the spatial structure. Across the results of the three kinds of classification, two features are common: Chung-mu city, which has special characteristics, and the group of industrial cities composed of Changwon, Ulsan, Masan, Chinhai, Kimhai, Yangsan, Ungsang, Changsungpo and Shinhyun. Although the fuzzy classification methods need further appraisal, this framework appears to be more realistic and flexible in preserving information pertinent to urban classification.
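
Finding 2 says that a classification can be read off at each level $\alpha$ and that sweeping $\alpha$ yields a hierarchy. One standard way to obtain mutually exclusive classes is sketched below: take the max-min transitive closure of a fuzzy similarity relation and cut it at $\alpha$. Whether the paper uses exactly this construction is an assumption, and the similarity matrix is invented for illustration.

```python
def max_min_closure(relation):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    n = len(relation)
    closure = [row[:] for row in relation]
    while True:
        composed = [
            [max(closure[i][j],
                 max(min(closure[i][k], closure[k][j]) for k in range(n)))
             for j in range(n)]
            for i in range(n)
        ]
        if composed == closure:
            return closure
        closure = composed


def alpha_cut_classes(closure, alpha):
    """Classes of units whose closed similarity is at least alpha."""
    classes, seen = [], set()
    for i in range(len(closure)):
        if i in seen:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= alpha]
        seen.update(group)
        classes.append(group)
    return classes


# Hypothetical similarities among three spatial units.
similarity = [[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.5],
              [0.3, 0.5, 1.0]]
closure = max_min_closure(similarity)
hierarchy = {alpha: alpha_cut_classes(closure, alpha) for alpha in (0.9, 0.8, 0.5)}
```

Lowering $\alpha$ merges classes step by step, which gives the hierarchical classification. Cutting the original relation without the closure can instead produce overlapping groups, closer in spirit to the conjoint classification the abstract mentions.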
