• Title/Summary/Keyword: higher-order clustering

Domain Analysis on the Field of Open Access by Co-Word Analysis: Based on Published Journals of Library and Information Science during 2013 to 2018 (동시출현단어 분석을 활용한 오픈액세스 분야의 지적구조 분석: 2013년부터 2018년까지 출판된 문헌정보학 저널을 기반으로)

  • Kim, Sun-Kyum;Kim, Wan-Jong;Seo, Tae-Sul;Choi, Hyun-Jin
    • Journal of Korean Library and Information Science Society
    • /
    • v.50 no.1
    • /
    • pp.333-356
    • /
    • 2019
  • Open access has emerged as an alternative for overcoming the crisis in scholarly communication caused by dependence on commercial publishers. The purpose of this study is to present an intellectual structure that reflects the newest research trends in the field of open access, to identify how the subject area is structured by using co-word analysis, and to compare the results with the existing study. To this end, a dataset of 761 information science papers was collected from Web of Science for the period from January 2012 to November 2018, and 2,321 keywords in the form of noun phrases were extracted from titles and abstracts. To analyze the intellectual structure of open access, 13 topic clusters were extracted by network analysis, and the keywords with higher centrality were identified by visualizing the intellectual relationships. In addition, after cluster analysis, the relationships were analyzed by plotting the results on a multidimensional scaling map. We expect this research to help guide the future direction of open access research.

Centroid Neural Network with Bhattacharyya Kernel (Bhattacharyya 커널을 적용한 Centroid Neural Network)

  • Lee, Song-Jae;Park, Dong-Chul
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.9C
    • /
    • pp.861-866
    • /
    • 2007
  • A clustering algorithm for Gaussian Probability Distribution Function (GPDF) data, called Centroid Neural Network with a Bhattacharyya Kernel (BK-CNN), is proposed in this paper. The proposed BK-CNN is based on the unsupervised competitive Centroid Neural Network (CNN) and employs a kernel method for data projection. The kernel method adopted in BK-CNN projects data from the low-dimensional input feature space into a higher-dimensional feature space so that nonlinear problems in the input space can be solved linearly in the feature space. In order to cluster GPDF data, the Bhattacharyya kernel is used to measure the distance between two probability distributions for data projection. With the incorporation of the kernel method, the proposed BK-CNN is capable of dealing with nonlinear separation boundaries and can successfully allocate more code vectors in regions where GPDF data are densely distributed. When applied to GPDF data in an image classification problem, the experimental results show that the proposed BK-CNN algorithm gives 1.7%-4.3% improvements in average classification accuracy over other conventional algorithms such as k-means, Self-Organizing Map (SOM), and CNN with a Bhattacharyya distance, referred to as Bk-Means, B-SOM, and B-CNN, respectively.
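
The distance measure mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming Gaussians with diagonal covariance; the function names and the use of exp(-distance) as the kernel value are assumptions made here for illustration, not the formulation given in the paper.

```python
import math


def bhattacharyya_distance(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two Gaussians with diagonal covariance.

    mean*/var* are equal-length sequences of per-dimension means and variances.
    """
    dist = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)                              # averaged variance
        dist += 0.125 * (m1 - m2) ** 2 / v               # mean-separation term
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))   # variance-mismatch term
    return dist


def bhattacharyya_kernel(mean1, var1, mean2, var2):
    """Hypothetical kernel value: exp(-distance), i.e. the Bhattacharyya coefficient."""
    return math.exp(-bhattacharyya_distance(mean1, var1, mean2, var2))
```

Identical distributions give a distance of 0 and a kernel value of 1; the kernel value decreases toward 0 as the distributions move apart.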

Encapsulation and optical properties of Er3+ ions for planar optical amplifiers via sol-gel process (졸-겔법을 이용한 광증폭기의 Er 이온 캡슐화 및 광학적 특성)

  • Kim, Joo-Hyeun;Seok, Sang-Il;Ahn, Bok-Yeop
    • Proceedings of the Materials Research Society of Korea Conference
    • /
    • 2003.11a
    • /
    • pp.135-135
    • /
    • 2003
  • The fast evolution in the field of optical communication systems demands powerful optical information treatment. These functions can be performed by integrated optical systems. A key component of such systems is the erbium-doped waveguide amplifier (EDWA). The intra-4f radiative transition of Er at 1.5 $\mu\textrm{m}$ is particularly interesting because this wavelength is standard in optical telecommunications. The fabrication of waveguide amplifiers for integrated optics using the sol-gel process has received increasing attention. The potential advantages of lower cost, owing to less capital equipment, and of easy processing make this process an attractive alternative to conventional technologies such as flame hydrolysis deposition, ion exchange, and chemical vapor deposition. In addition, the sol-gel process has been found to be extremely suitable for controlling composition and refractive index, which are directly related to optical properties. The main drawback of such an amplifier with respect to the EDWA is the need for a much higher Er3+ concentration to compensate for the smaller interaction length. However, high Er doping may result in non-radiative relaxation through clustering of Er ions and co-operative upconversion. In order to solve this problem, we investigate the possibility of avoiding short Er-Er distances by encapsulating Er3+ ions in hosts such as organic-inorganic hybrid materials. For the inorganic-organic hybrid sols, methacryloxypropyltrimethoxysilane (MPTS), zirconyl chloride octahydrate, and erbium(III) chloride hexahydrate were used as starting materials, followed by a conventional sol-gel process. TEM showed that nano sols with a core/shell topology were formed, depending on the Zr/Er mole ratio. The surface roughness of the coatings on Si substrates was investigated by AFM as a function of the Zr/Er ratio. The local environment and vibrational properties of Er3+ ions were studied using Near-IR, FT-IR, and UV/Vis spectroscopy. Nano hybrid coatings derived from polymer and encapsulated Er dopant gave good luminescence at 1.55 $\mu\textrm{m}$.

Spatial Clustering Analysis of Fire in Gangwon-Do (강원도 화재의 공간적 군집 특성 분석)

  • BAE, Sun-Hak
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.21 no.3
    • /
    • pp.93-103
    • /
    • 2018
  • The purpose of this study is to analyze the spatial clustering characteristics of fires using long-term fire data. To this end, data on fires that broke out over the last 40 years were converted into GIS data, and spatial analysis was performed at the level of the smallest administrative district in Gangwon-do province. To understand the spatial distribution of fires, Moran's I, Geary's Ci, and Getis-Ord's Gi*, which are local indicators of spatial association (LISA), were used. By integrating the results obtained from each analysis, the advantages of the individual methods were reflected in the study results. As a result, fire hotspot areas in Gangwon-do were identified. Among the hotspot areas, some areas where the fire frequency is higher than in adjacent areas were also identified. The results of this study can be used as information for predicting fire hazard areas and relocating fire-fighting facilities in the study area.
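
As a rough illustration of the spatial autocorrelation statistics named in the abstract above, the sketch below computes the global form of Moran's I. The study itself applies local indicators, so this is a simplified stand-in; the function name, the binary contiguity weights, and the toy values in the usage note are assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix.

    weights[i][j] is the spatial weight between regions i and j (0 if unrelated).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]

    # Weighted sum of cross-products of deviations over all region pairs.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total_weight = sum(sum(row) for row in weights)
    sum_sq = sum(d * d for d in dev)

    return (n / total_weight) * (cross / sum_sq)
```

For four districts arranged in a line with binary neighbor weights, steadily increasing fire counts such as [1, 2, 3, 4] give a positive value (similar values sit next to each other), while alternating counts give a negative value.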

Genetic Distances of Crucian Carp Populations analyzed by PCR Approach

  • Jeon, Jun-Hyub;Yoon, Jong-Man
    • Development and Reproduction
    • /
    • v.20 no.2
    • /
    • pp.135-140
    • /
    • 2016
  • Genomic DNA isolated from crucian carp (family Cyprinidae) from four rivers was amplified with seven oligonucleotide primers. In the present study, we employed a hierarchical clustering method in order to reveal genetic distances and variations. Crucian carp were acquired from the Hangang river (CAH), Geumgang river (CAG), Nakdonggang river (CAN), and Yeongsangang river (CAY). The primer BION-12 generated the most loci (a total of 50), with an average of 10, in the CAY population. The primer BION-10 generated the fewest loci (a total of 19), with an average of 3.8, in the CAG population, in comparison to the other primers used. The seven oligonucleotide primers generated an average of 16.7 specific loci per primer in the CAH population, 7.4 in the CAG population, 8.6 in the CAN population, and 0.9 in the CAY population. The specific loci generated by the oligonucleotide primers revealed inter-individual-specific characteristics, thus disclosing DNA polymorphisms. The dendrogram obtained with the seven oligonucleotide primers indicates four genetic clusters. The genetic distance that displayed significant molecular differences was between individuals no.06 and no.08 from the CAG population (genetic distance = 0.036), while the genetic distance among the five individuals that displayed significant molecular differences was between individuals no.08 and no.09 from the CAG population (genetic distance = 0.088). With regard to average bandsharing value (BS) results, individuals from the CAY population ($0.985{\pm}0.009$) exhibited higher bandsharing values than did individuals from the CAH population ($0.779{\pm}0.049$) (P<0.05). Individuals of the CAY population were fairly closely related to those of the CAN population (genetic distance between the two populations < 0.016).
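
The abstract above does not state how the bandsharing value is computed. One commonly used definition is BS = 2 * N_ab / (N_a + N_b), where N_ab is the number of bands shared by two individuals and N_a and N_b are their total band counts. The sketch below assumes that definition; the function name and the fragment sizes in the usage note are hypothetical.

```python
def bandsharing(bands_a, bands_b):
    """Bandsharing value between two individuals.

    bands_a, bands_b: collections of scored band positions (e.g. fragment sizes in bp).
    Assumes BS = 2 * shared / (count_a + count_b); returns a value in [0, 1].
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("at least one individual must have scored bands")
    shared = len(a & b)
    return 2.0 * shared / (len(a) + len(b))
```

For example, an individual with bands at 100, 150, and 200 bp and another with bands at 100, 150, 300, and 400 bp share two bands, giving BS = 4/7. Averaging this value over all pairs within a population yields a population-level figure comparable to those reported.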

Genetic Distances of Three Mollusk Species Investigated by PCR Analysis

  • Oh, Hyun;Yoon, Jong-Man
    • Development and Reproduction
    • /
    • v.18 no.1
    • /
    • pp.43-49
    • /
    • 2014
  • Three mollusk species, Nortamea concinua (NC) and Haliotis discus hannai (HDH) from Tongyeong and Sulculus diversicolor supertexta (SDS), are widely distributed along the coasts of the Yellow Sea, the southern sea, and Jeju Island on the Korean Peninsula under the natural ecosystem. There is a need to understand the genetic traits and composition of the three mollusk species in order to evaluate the patent genetic effect precisely. PCR analysis was performed on DNA samples extracted from a total of 21 individuals using seven decamer oligonucleotide primers. The seven primers generated loci unique to each species and loci shared by the three species, all of which could be clearly scored. A hierarchical clustering tree was constructed from similarity matrices to generate a dendrogram, using Systat version 10. In the NC species, 236 specific loci, with an average of 56.3 per primer, were identified. In the HDH species, 142 specific loci, with an average of 44.7 per primer, were identified. Notably, 126 loci shared by the three species, with an average of 18 per primer, were observed. In particular, the decamer primer BION-75 generated 7 loci unique to each species, which identified each species, at 700 bp in the NC species. Interestingly, the primer BION-50 detected 42 loci shared by the three species, major and/or minor fragments of sizes 100 bp and 150 bp, respectively, which were identical in all samples. As regards average bandsharing value (BS) results, individuals from the HDH species (0.772) exhibited higher bandsharing values than did individuals from the NC species (0.655). In this study, the dendrogram obtained with the seven decamer primers indicates three genetic clusters: cluster 1 (CONCINNA 01~CONCINNA 07), cluster 2 (HANNAI 08~HANNAI 14), and cluster 3 (SUPERTEXTA 15~SUPERTEXTA 21). Comparatively, individuals of the HDH species were fairly closely related to those of the SDS species, as shown in the hierarchical dendrogram of genetic distances.

A Methodology of Customer Churn Prediction based on Two-Dimensional Loyalty Segmentation (이차원 고객충성도 세그먼트 기반의 고객이탈예측 방법론)

  • Kim, Hyung Su;Hong, Seung Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.111-126
    • /
    • 2020
  • Most industries have recently become aware of the importance of customer lifetime value as they are exposed to a competitive environment. As a result, preventing customer churn is becoming a more important business issue than securing new customers. This is because retaining customers at risk of churn is far more economical than securing new customers; in fact, the acquisition cost of a new customer is known to be five to six times higher than the cost of retaining an existing one. Also, companies that effectively prevent customer churn and improve customer retention rates are known not only to increase their profitability but also to improve their brand image through higher customer satisfaction. Customer churn prediction, which had been conducted as a sub-area of CRM research, has recently become more important as a big data-based performance marketing theme due to the development of business machine learning technology. Until now, research on customer churn prediction has been carried out actively in sectors such as the mobile telecommunication, financial, distribution, and game industries, which are highly competitive and where churn management is urgent. In addition, these churn prediction studies focused on improving the performance of the churn prediction model itself, for example by simply comparing the performance of various models, exploring features that are effective in forecasting churn, or developing new ensemble techniques, and they were limited in practical use because most studies treated the entire customer base as a single group when developing a predictive model. As such, the main purpose of existing research was to improve the performance of the predictive model itself, and there was a relative lack of research on improving the overall customer churn prediction process. 
In practice, customers have different behavioral characteristics due to heterogeneous transaction patterns, and the resulting churn rates differ, so it is unreasonable to treat the entire customer base as a single group. Therefore, in order to predict customer churn effectively in heterogeneous industries, it is desirable to segment customers according to classification criteria such as loyalty and to operate an appropriate churn prediction model for each segment. Some studies have indeed subdivided customers using clustering techniques and applied a churn prediction model to each customer group. Although this process can produce better predictions than a single prediction model for the entire customer population, there is still room for improvement, in that clustering is a mechanical, exploratory grouping technique that calculates distances based on inputs and does not reflect an organization's strategic intent, such as loyalty. This study proposes a segment-based customer churn prediction process (CCP/2DL: Customer Churn Prediction based on Two-Dimensional Loyalty segmentation) based on two-dimensional customer loyalty, on the assumption that successful customer churn management is better achieved by improving the overall process than by improving the performance of the model itself. CCP/2DL is a churn prediction process that segments customers along two dimensions of loyalty, quantitative and qualitative, performs a secondary grouping of the customer segments according to churn patterns, and then independently applies heterogeneous churn prediction models to each churn pattern group. To assess the relative merit of the proposed process, its performance was compared with the most commonly applied General churn prediction process and the Clustering-based churn prediction process. 
The General churn prediction process used in this study is the most commonly used churn prediction method, in which a single machine learning model is applied to the entire group of customers to be predicted. The Clustering-based churn prediction process first uses clustering techniques to segment customers and then implements a churn prediction model for each individual group. In an experiment conducted in cooperation with a global NGO, the proposed CCP/2DL outperformed the other churn prediction methodologies. This churn prediction process is not only effective in predicting churn but can also serve as a strategic basis for obtaining a variety of customer insights and carrying out other related performance marketing activities.

Keyword Network Analysis for Technology Forecasting (기술예측을 위한 특허 키워드 네트워크 분석)

  • Choi, Jin-Ho;Kim, Hee-Su;Im, Nam-Gyu
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.227-240
    • /
    • 2011
  • New concepts and ideas often result from extensive recombination of existing concepts or ideas. Both researchers and developers build on existing concepts and ideas in published papers or registered patents to develop new theories and technologies that in turn serve as a basis for further development. As the importance of patents increases, so does that of patent analysis. Patent analysis is largely divided into network-based and keyword-based analyses. The former lacks the ability to analyze technology information in detail, while the latter is unable to identify the relationships between such technologies. In order to overcome the limitations of network-based and keyword-based analyses, this study blends the two methods and suggests a keyword network-based analysis methodology. In this study, we collected significant technology information from each patent related to Light Emitting Diodes (LED) through text mining, built a keyword network, and then executed a community network analysis on the collected data. The results of the analysis are as follows. First, the patent keyword network showed very low density and an exceptionally high clustering coefficient. Technically, density is obtained by dividing the number of ties in a network by the number of all possible ties. The value ranges between 0 and 1, with higher values indicating denser networks and lower values indicating sparser networks. In real-world networks, the density varies depending on the size of the network; increasing the size of a network generally leads to a decrease in density. The clustering coefficient is a network-level measure that illustrates the tendency of nodes to cluster in densely interconnected modules. This measure shows the small-world property, in which a network can be highly clustered even though it has a small average distance between nodes in spite of the large number of nodes. 
Therefore, the low density of the patent keyword network means that its nodes are connected sporadically, and the high clustering coefficient shows that nodes in the network are closely connected to one another. Second, the cumulative degree distribution of the patent keyword network, like that of other knowledge networks such as citation or collaboration networks, followed a clear power-law distribution. A well-known mechanism behind this pattern is preferential attachment, whereby a node with more links is likely to attain further new links as the network evolves. Unlike normal distributions, the power-law distribution does not have a representative scale. This means that one cannot pick a representative or average value because there is always a considerable probability of finding much larger values. Networks with power-law distributions are therefore often referred to as scale-free networks. The presence of a heavy-tailed scale-free distribution represents the fundamental signature of an emergent collective behavior of the actors who contribute to forming the network. In our context, the more frequently a patent keyword is used, the more often it is selected by researchers and associated with other keywords or concepts to constitute and convey new patents or technologies. The evidence of a power-law distribution implies that preferential attachment explains the origin of heavy-tailed distributions in a wide range of growing patent keyword networks. Third, we found that among keywords that flowed into a particular field, the vast majority of keywords with new links join existing keywords in the associated community in forming the concept of a new patent. This finding held for both the short-term (4-year) and long-term (10-year) analyses. 
Furthermore, using the keyword combination information that was derived from the methodology suggested by our study enables one to forecast which concepts combine to form a new patent dimension and refer to those concepts when developing a new patent.
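
The two network measures defined in the abstract above can be sketched as follows for an undirected keyword network without self-loops. The function names are illustrative, and the clustering coefficient is taken here as the average of local clustering coefficients, which is one common definition and an assumption about what the study computed.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Number of ties divided by the number of all possible ties (0 to 1)."""
    n = len(adj)
    if n < 2:
        return 0.0
    ties = sum(len(nbrs) for nbrs in adj.values()) / 2
    return ties / (n * (n - 1) / 2)


def average_clustering(adj):
    """Mean over nodes of the fraction of neighbor pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(adj)
```

For a toy network of three mutually linked keywords plus a fourth keyword attached to one of them, the density is 4/6 and the average clustering coefficient is (1 + 1 + 1/3 + 0) / 4.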

Utilization of Information from International Observation Trials for the Introduction of New Crops: An Introduction of Azuki Bean Varieties from China to Thailand

  • Xin, Chen;Volkaert, Hugo;Chatwachirawong, Prasert;Srinives, Peerasak
    • Journal of Crop Science and Biotechnology
    • /
    • v.11 no.1
    • /
    • pp.51-56
    • /
    • 2008
  • Azuki bean has never been commercially grown in Thailand, due in part to a lack of suitable varieties. A core collection of 114 azuki bean accessions, originally from different parts of China (northern, central, southern) and representing the germplasm of Chinese land races, was evaluated in the experimental field of the Institute of Vegetable Crops, Jiangsu Academy of Agricultural Sciences, China, from June to October 2004. The same experiment was repeated at the Kamphaeng Saen campus of Kasetsart University, Thailand, from February to May 2005. Yield, yield components, and agronomic traits were recorded for all accessions in order to identify certain genotypes for further investigation. The statistical parameters used as indicators of phenotypic variation were the mean, coefficient of variability (CV), correlation coefficient (r), range, mean difference, and phenotypic clustering of the accessions. The results indicated that the azuki bean varieties planted in Kamphaeng Saen had a shorter growing duration and were lower in plant height, seed yield per plant, 100-seed weight, and pods per plant than when they were grown in China. This discrepancy was caused largely by the combined effect of temperature, rainfall, and day length. The traits that were rather stable in both locations were branches per plant and seeds per pod. Azuki bean varieties from northern China showed a higher response to the changing environments than those from central and southern China. Some agronomic traits showed high correlation coefficients between the environments in Thailand and China. The CVs of agronomic traits in both locations were ranked in descending order as follows: seed yield per plant, pods per plant, branches per plant, plant height, 100-seed weight, seeds per pod, and growing duration. The CVs of seeds per pod and branches per plant were almost the same in both locations. Yield per plant in China correlated well (r=0.75) with pods per plant, but not with the other traits. Based on their response to both environments, the azuki bean accessions can be broadly divided into four groups, viz. northern 1, northern 2, central, and southern. This implied that there was more diversity, but probably less stability, among the accessions originating from northern China.
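
Two of the statistics listed in the abstract above, the coefficient of variability and the correlation coefficient, can be sketched as below. The function names are illustrative, and the use of the sample standard deviation for the CV is an assumption; the abstract does not specify the estimator.

```python
import math
import statistics


def coefficient_of_variability(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Applied per trait, the first function gives the kind of CV ranking described, and the second could be used, for instance, to relate seed yield per plant to pods per plant across accessions.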

Evaluation of the Risk of Metabolic Syndrome for the Young Adults in Korean Students of a University (한국인 대학생군 대상의 청.장년층 대사이상증후군 위험성 평가)

  • Chung, Jae-Hun;Lee, Bo-Reum;Lim, Sung-Jin;Jang, Je-Kwan;Lee, Myung-Koo;Lee, Chong-Kil;Lim, Sung-Cil
    • YAKHAK HOEJI
    • /
    • v.53 no.1
    • /
    • pp.19-24
    • /
    • 2009
  • Metabolic syndrome, defined as the clustering of several metabolic disorders including obesity (waist circumference $\geq 90$ cm if male or $\geq 80$ cm if female), dyslipidemia (TG $\geq 150$ mg/dl, or HDL-C $< 40$ mg/dl if male or $< 50$ mg/dl if female), hypertension (BP $\geq 130/85$ mmHg), and hyperglycemia (fasting plasma glucose $\geq 110$ mg/dl), increases the cardiovascular risk of the general population. Recently, the risk of this syndrome has been rising among young adults worldwide. Therefore, to find out how serious this phenomenon is among young adults in Korea, we randomly selected a total of 43 people (group I: 22, group II: 21) over 2 years and evaluated their risk of metabolic syndrome. Group I consisted of 22 people (15 males, 7 females) aged 22 to 35 years (average 28 years), and group II consisted of 21 people (19 males, 2 females) aged 22 to 32 years (average 24 years); the groups were surveyed in the Cheongju area from March 1 to 30, 2008 and from September 1 to 30, 2007. Of the total, 13.95% (n=7) had metabolic syndrome by the NCEP/ATP III definition (group I: 6, group II: 1). Six of them had three or more risk factors for metabolic syndrome, such as obesity, hypertension, elevated fasting blood glucose, and hypertriglyceridemia, at the same time (group I: 5, group II: 1). Group I had more risk factors because of its higher age than group II. Therefore, we need to actively monitor young adults and provide early diagnosis, educational programs, and assistance with lifestyle changes in order to prevent metabolic syndrome.
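
The component thresholds listed in the abstract above can be expressed as a simple screening check. This is a sketch only: the function and parameter names are hypothetical, the blood pressure criterion is interpreted as systolic >= 130 or diastolic >= 85, and the cutoff of three or more components is an assumption based on the abstract's wording rather than a full NCEP/ATP III implementation.

```python
def metabolic_risk_factors(sex, waist_cm, tg, hdl, systolic, diastolic, glucose):
    """Return the list of components met, using the thresholds quoted in the abstract.

    sex: "male" or "female"; tg, hdl, glucose in mg/dl; blood pressure in mmHg.
    """
    male = sex == "male"
    factors = []
    if waist_cm >= (90 if male else 80):
        factors.append("obesity")
    if tg >= 150 or hdl < (40 if male else 50):
        factors.append("dyslipidemia")
    if systolic >= 130 or diastolic >= 85:  # assumed reading of "BP >= 130/85"
        factors.append("hypertension")
    if glucose >= 110:
        factors.append("hyperglycemia")
    return factors


def meets_cutoff(factors, minimum=3):
    """Assumed cutoff: three or more components present at the same time."""
    return len(factors) >= minimum
```

For instance, a male participant with a 95 cm waist, TG of 160 mg/dl, HDL-C of 45 mg/dl, blood pressure of 135/80 mmHg, and fasting glucose of 100 mg/dl meets three components (obesity, dyslipidemia, hypertension).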