• Title/Summary/Keyword: Cluster analysis(CF)

Recommender Systems using Structural Hole and Collaborative Filtering (구조적 공백과 협업필터링을 이용한 추천시스템)

  • Kim, Mingun;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.107-120
    • /
    • 2014
  • This study proposes a novel recommender system that uses structural hole analysis to reflect qualitative and emotional information in the recommendation process. Although collaborative filtering (CF) is known as the most popular recommendation algorithm, it has some limitations, including scalability and sparsity problems. The scalability problem arises when the number of users and items becomes very large: CF cannot scale up because the computation time for finding neighbors in the user-item matrix grows as the number of users and items increases in real-world e-commerce sites. Sparsity is a common problem of most recommender systems because users generally evaluate only a small portion of all items. The cold-start problem is a special case of the sparsity problem that occurs when users or items are newly added to the system with no ratings at all. When users' preference data is sparse, two users or items are unlikely to have common ratings, so CF ends up predicting ratings from a very limited number of similar users. Moreover, it may produce biased recommendations because similarity weights may be estimated from only a small portion of the rating data. In this study, we point out a novel limitation of conventional CF: it does not consider qualitative and emotional information about users in the recommendation process because it uses only the users' preference scores in the user-item matrix. To address this limitation, this study proposes a cluster-indexing CF model with structural hole analysis for recommendations. In general, a structural hole is a location that connects two separate actors without any redundant connections in the network. The actor who occupies the structural hole can easily access non-redundant, varied, and fresh information. Therefore, this actor may be an important person in the focal network and may be the representative person of the focal subgroup in the network. Thus, his or her characteristics may represent the general characteristics of the users in the focal subgroup. In this sense, we can distinguish friends and strangers of the focal user using structural hole analysis. This study uses structural hole analysis to select structural holes in subgroups as initial seeds for a cluster analysis. First, we gather data about users' preference ratings for items and their social network information, using a data collection system developed for this purpose. Then, we perform structural hole analysis and find the structural holes of the social network. Next, we use these structural holes as cluster centroids for the clustering algorithm. Finally, this study makes recommendations using CF within each user's cluster and compares the recommendation performance of the comparative models. The experimental results are compiled from two experiments. The first is the structural hole analysis, for which this study employs UCINET version 6, a software package for the analysis of social network data. The second performs the modified clustering and then CF using the result of the cluster analysis; for it, we develop an experimental system using VBA (Visual Basic for Applications) in Microsoft Excel 2007. For the modified clustering experiment, this study bases the clustering on a novel similarity measure, the Pearson correlation between user preference rating vectors. In addition, this study uses the 'all-but-one' approach for the CF experiment. To validate the effectiveness of the proposed model, we apply three comparative CF models to the same dataset. The experimental results show that the proposed model outperforms the other comparative models. In particular, statistical significance tests show that the proposed model performs significantly better than the two comparative models that use cluster analysis. However, the difference between the proposed model and the naive model is not statistically significant.
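
The pipeline described in this abstract (seeded clustering on Pearson similarity, then CF inside the cluster, checked with all-but-one) can be illustrated with a minimal Python sketch. It is not the authors' implementation, which used UCINET and Excel VBA. The toy ratings, the user names, the choice of `ann` and `cat` as stand-ins for structural-hole users, the single-pass nearest-seed assignment, and the mean-centered prediction formula are all assumptions made for illustration.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users rated; 0.0 if undefined."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    ma = sum(a[i] for i in common) / len(common)
    mb = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - ma) * (b[i] - mb) for i in common)
    den = sqrt(sum((a[i] - ma) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0

def assign_clusters(ratings, seeds):
    """Attach every non-seed user to the seed with the highest Pearson similarity."""
    clusters = {s: [s] for s in seeds}
    for user in ratings:
        if user in seeds:
            continue
        best = max(seeds, key=lambda s: pearson(ratings[user], ratings[s]))
        clusters[best].append(user)
    return clusters

def predict(ratings, cluster, user, item):
    """All-but-one style prediction: hide `item` from `user`, then use a
    mean-centered, similarity-weighted average over cluster neighbors."""
    profile = {i: r for i, r in ratings[user].items() if i != item}
    base = sum(profile.values()) / len(profile)
    num = den = 0.0
    for other in cluster:
        if other == user or item not in ratings[other]:
            continue
        w = pearson(profile, ratings[other])
        mean_other = sum(ratings[other].values()) / len(ratings[other])
        num += w * (ratings[other][item] - mean_other)
        den += abs(w)
    return base + num / den if den else base

# Hypothetical toy data: two groups with opposite tastes.
ratings = {
    "ann": {"a": 5, "b": 4, "c": 1, "d": 2},
    "bob": {"a": 4, "b": 5, "c": 2, "d": 1},
    "cat": {"a": 1, "b": 2, "c": 5, "d": 4},
    "dan": {"a": 2, "b": 1, "c": 4, "d": 5},
}
seeds = ["ann", "cat"]  # assumed stand-ins for structural-hole users
clusters = assign_clusters(ratings, seeds)
predicted = predict(ratings, clusters["ann"], "bob", "d")
```

Restricting the neighbor search to one cluster is what reduces the cost of finding neighbors; the trade-off is that a poorly chosen seed can cut a user off from good neighbors in another cluster.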

Development of Web-based Intelligent Recommender Systems using Advanced Data Mining Techniques (개선된 데이터 마이닝 기술에 의한 웹 기반 지능형 추천시스템 구축)

  • Kim Kyoung-Jae;Ahn Hyunchul
    • Journal of Information Technology Applications and Management
    • /
    • v.12 no.3
    • /
    • pp.41-56
    • /
    • 2005
  • Product recommender systems are among the most popular techniques for customer relationship management, and collaborative filtering (CF) is known to be one of the most successful recommendation techniques in such systems. However, CF has limitations such as the sparsity and scalability problems. This study proposes a hybrid of cluster analysis and case-based reasoning (CBR) to address these problems. CBR may relieve the sparsity problem because it recommends products using customer profile and transaction data, but it may still suffer from the scalability problem. Thus, this study applies cluster analysis before CBR to reduce the search space and ease the scalability problem. For the cluster analysis, this study employs a hybrid of genetic and K-Means algorithms to reduce the risk of converging to local minima, a known weakness of typical cluster analyses. This study also develops a Web-based prototype system to test the superiority of the proposed model.

Contact oxide etching using $CHF_3/CF_4$ ($CHF_3/CF_4$를 사용한 콘택 산화막 식각)

  • 김창일;김태형;장의구
    • Electrical & Electronic Materials
    • /
    • v.8 no.6
    • /
    • pp.774-779
    • /
    • 1995
  • Process optimization experiments based on the Taguchi method were performed to establish the optimal process conditions for a contact oxide etching process module built for attachment to a multi-processing cluster system. For comparison with the Taguchi method, the contact oxide etching process was also carried out with different process parameters ($CHF_3/CF_4$ gas flow rate, chamber pressure, RF power, and magnetic field intensity). Optimal etching characteristics were evaluated in terms of etch rate, selectivity, uniformity, and etched profile. As a final analysis of the experimental results, the optimal etching characteristics were obtained at the process conditions of $CHF_3/CF_4$ gas flow rate = 72/8 sccm, chamber pressure = 50 mTorr, RF power = 500 watts, and magnetic field intensity = 90 gauss.

Characterization of Cytophaga-Flavobacteria Community Structure in the Bering Sea by Cluster-specific 16S rRNA Gene Amplification Analysis

  • Chen, Xihan;Zeng, Yonghui;Jiao, Nianzhi
    • Journal of Microbiology and Biotechnology
    • /
    • v.18 no.2
    • /
    • pp.194-198
    • /
    • 2008
  • A newly designed Cytophaga-Flavobacteria-specific 16S rRNA gene primer pair was employed to investigate the CF community structure in the Bering Sea, revealing a previously unknown and unexpectedly high CF diversity in this high-latitude cold sea. In total, 56 clones were sequenced and 50 unique CF 16S rRNA gene fragments were obtained, clustering into 16 CF subgroups, including nine cosmopolitan subgroups, five psychrophilic subgroups, and two putatively autochthonous subgroups. The majority of sequences (82%) were closely related to uncultured CF species and could not be classified into known CF genera, indicating the presence of a large number of so-far uncultivated CF species in the Bering Sea.

Insights into Systems for Iron-Sulfur Cluster Biosynthesis in Acidophilic Microorganisms

  • Myriam, Perez;Braulio, Paillavil;Javiera, Rivera-Araya;Claudia, Munoz-Villagran;Omar, Orellana;Renato, Chavez;Gloria, Levican
    • Journal of Microbiology and Biotechnology
    • /
    • v.32 no.9
    • /
    • pp.1110-1119
    • /
    • 2022
  • Fe-S clusters are versatile and essential cofactors that participate in multiple fundamental biological processes. In Escherichia coli, the biogenesis of these cofactors requires either the housekeeping Isc pathway or the stress-induced Suf pathway, which plays a general role under conditions of oxidative stress or iron limitation. In the present work, the Fe-S cluster assembly Isc and Suf systems of acidophilic Bacteria and Archaea, which thrive in highly oxidative environments, were studied. This analysis revealed that acidophilic microorganisms have a complete set of genes encoding a single system (either Suf or Isc). In acidophilic Proteobacteria and Nitrospirae, a complete set of isc genes (iscRSUAX-hscBA-fdx), but not genes coding for the Suf system, was detected. The activity of the Isc system was studied in Leptospirillum sp. CF-1 (Nitrospirae). RT-PCR experiments showed that eight candidate genes were co-transcribed and form the isc operon in this strain. Additionally, RT-qPCR assays showed that the expression of the iscS gene was significantly upregulated in cells exposed to oxidative stress imposed by 260 mM Fe2(SO4)3 for 1 h or iron starvation for 3 h. The activity of cysteine desulfurase (IscS) in CF-1 cell extracts was also upregulated under such conditions. Thus, the Isc system from Leptospirillum sp. CF-1 seems to play an active role in stressful environments. These results contribute to a better understanding of the distribution and role of Fe-S cluster protein biogenesis systems in organisms that thrive in extreme environmental conditions.

Statistical Assessment on the Heavy Metal Variation in the Soils around Abandoned Mine(Case Study for the Samgwang Mine) (폐광산지역 토양 중금속원소들에 대한 통계학적 환경오염 특성평가)

  • Cho, Il-Hyoung;Chun, Suk-Young;Chang, Soon-Woong
    • Journal of Environmental Science International
    • /
    • v.16 no.12
    • /
    • pp.1451-1462
    • /
    • 2007
  • Heavy metal concentrations in the soil were investigated for the abandoned Samkwang metal mine, Cheongyang-Gun, Chungnam Province, Korea. The concentrations of heavy metals (As, Cd, Cu, Ni, Pb, Zn) were determined in mine soils collected at the abandoned mine sites to obtain a general classification and specification of the pollution in this highly polluted region. Normality tests and basic statistics on central tendency and variation showed that the distribution of heavy metal concentrations differed significantly across all locations. In the spatial distribution relating heavy metal concentration to pH, pH ranged from 4.8 to 8.8. By type of land use, heavy metal concentrations were highest in forest land, and Ni and Zn also showed high concentrations in farm and rice fields. By soil depth, metal concentrations in subsoil were higher than those in surface soil, while the concentrations of Cu and Ni showed no significant difference with depth. Correlation analysis of the data, excluding extreme and unusual values, revealed that Zn-Cd (r=0.867), Zn-As (r=0.797), Zn-Pb (r=0.764), Cu-Cd (r=0.673), Cu-As (r=0.614), and Zn-Ni (r=0.605) were the most important parameters in assessing variations of heavy metals in soil. To discriminate pattern differences and similarities among samples, principal factor analysis (PFA) and cluster analysis (CF) were performed using a correlation matrix. This study suggests that PFA and CF techniques are useful tools for identifying important heavy metals and parameters. It also demonstrates the necessity and usefulness of multivariate statistical assessment of complex databases for obtaining better information about soil quality, and provides basic information for cleaning up the abandoned mine sites.

Studies on the Optimization of Contact Oxide Etching Process Using Taguchi Method (Taguchi 방법을 사용한 콘택 산호막 식각 공정 최적화 연구)

  • Jeon, Yeong-Jin;Kim, Chang-Il;Gu, Jin-Geun;Yu, Hyeong-Jun
    • Korean Journal of Materials Research
    • /
    • v.5 no.1
    • /
    • pp.63-74
    • /
    • 1995
  • Process optimization experiments based on the Taguchi method were performed to establish the optimal process conditions for a contact oxide etching process module built for attachment to a multi-processing cluster system. Two rounds of Taguchi experiments were run: the first revealed the overall behavior of the etching characteristics as a function of the equipment parameters, and the second yielded the detailed, optimal process conditions. As a final analysis of the experimental results, the optimal etching characteristics were obtained at the process conditions of $CHF_{3}/CF_{4}$ gas flow rate = 72/8 sccm, chamber pressure = 50 mTorr, RF power = 300 watts, and magnetic field intensity = 90 gauss.

Movie Recommendation Algorithm Using Social Network Analysis to Alleviate Cold-Start Problem

  • Xinchang, Khamphaphone;Vilakone, Phonexay;Park, Doo-Soon
    • Journal of Information Processing Systems
    • /
    • v.15 no.3
    • /
    • pp.616-631
    • /
    • 2019
  • With the rapid increase of information on the World Wide Web, finding useful information on the internet has become a major problem. Recommendation systems help users make decisions in complex data areas where the amount of available data is large. Many methods have been proposed for recommender systems, and collaborative filtering is a popular and widely used one. However, collaborative filtering methods still have some problems, notably the cold-start problem. In this paper, we propose a movie recommendation system that uses social network analysis and collaborative filtering to solve this problem. We use personal propensities of users such as age, gender, and occupation to build a relationship matrix between users, and the relationship matrix is used to cluster users through community detection based on edge betweenness centrality. The recommender system then suggests to new users the movies that users in their group were previously interested in. Using mean absolute error, we show that the proposed method is very efficient.
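
The evaluation metric named in this abstract, mean absolute error (MAE), is the average absolute gap between predicted and actual ratings. A minimal Python sketch follows; the function name and the sample ratings are illustrative assumptions, not data from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired ratings; lower is better."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical held-out ratings and the ratings a recommender predicted for them.
actual = [4, 3, 5]
predicted = [3.5, 3, 4]
mae = mean_absolute_error(actual, predicted)  # (0.5 + 0 + 1) / 3
```

Because MAE is expressed in rating units, a value of 0.5 on a 1-to-5 scale means predictions miss by half a star on average.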

The Implementation and Performance Analysis of a OpenCFS Cluster File system (OpenCFS 클러스터 파일 시스템의 구현 및 성능 평가)

  • Jeon, Seung-Hyub;Cha, Gyu-Il;Kim, Jin-Mi;Yoo, Chuck
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.04a
    • /
    • pp.645-647
    • /
    • 2000
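  • In this paper, we design, implement, and evaluate OpenCFS, a cluster file system that runs in a clustering environment connected by a high-speed network, in order to efficiently support large-volume I/O for workloads such as multimedia and databases. To overcome the limits of I/O devices, the implemented cluster file system performs parallel I/O through striping. It also applies the Open Implementation methodology, which makes it possible to actively change internal system policies, and thereby gives application programs a way to access those policies. An experimental analysis of the implemented cluster file system shows that users can obtain I/O service with improved performance by changing internal system policies while keeping their existing programming environment.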

Simulation Study on E-commerce Recommender System by Use of LSI Method (LSI 기법을 이용한 전자상거래 추천자 시스템의 시뮬레이션 분석)

  • Kwon, Chi-Myung
    • Journal of the Korea Society for Simulation
    • /
    • v.15 no.3
    • /
    • pp.23-30
    • /
    • 2006
  • A recommender system for an e-commerce site receives information from customers about which products they are interested in and recommends products that are likely to fit their needs. In this paper, we investigate several methods for processing large-scale product purchase data to produce useful recommendations for customers. We apply the traditional data mining techniques of cluster analysis and collaborative filtering (CF), as well as CF with product-dimensionality reduction by latent semantic indexing (LSI). If the reduced product dimensions obtained from LSI show a latent trend in customers' purchases similar to that in the original customer-product purchase data, we expect that the reduced computational effort of finding the nearest neighbors of a target customer may improve the efficiency of recommendation. In simulation experiments on synthetic customer-product purchase data, the CF-based method with product-dimensionality reduction performs better than the traditional CF methods with respect to recall, precision, and the F1 measure. In general, the recommendation quality increases as the size of the neighborhood increases. However, our simulation results show that, after a certain point, the improvement diminishes. We also find that, as the number of recommended products increases, precision becomes worse, while the gain in recall is relatively small after a certain point. We consider that this information may be useful in applying recommender systems.
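
The three quality measures named in this abstract (precision, recall, and F1) can be computed for a single customer's recommendation list as in the Python sketch below. The function name and the product IDs are hypothetical, and the sketch covers only the metrics, not the LSI-based method itself.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall, and F1 of a recommendation list against the set of
    products the customer actually bought (the relevant set)."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical example: four recommended products, three actually purchased.
recommended = ["p1", "p2", "p3", "p4"]
purchased = ["p2", "p4", "p7"]
precision, recall, f1 = precision_recall_f1(recommended, purchased)
```

Lengthening the recommendation list can only add hits, so recall never falls, but each extra miss lowers precision. This is consistent with the trade-off the abstract reports as the number of recommended products grows.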
