• Title/Summary/Keyword: Index of Performance Evaluation

Enhancing Predictive Accuracy of Collaborative Filtering Algorithms using the Network Analysis of Trust Relationship among Users (사용자 간 신뢰관계 네트워크 분석을 활용한 협업 필터링 알고리즘의 예측 정확도 개선)

  • Choi, Seulbi;Kwahk, Kee-Young;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.113-127 / 2016
  • Among recommendation techniques, collaborative filtering (CF) is commonly recognized as the most effective for implementing recommender systems, and it has been widely studied and adopted in both academic and real-world applications. The basic idea of CF is to generate recommendations by finding correlations between users of a recommendation system. A CF system compares users based on how similar they are and recommends products to a user by drawing on the evaluations that like-minded users gave each product. Computing evaluation similarities among users is therefore critical in CF, because recommendation quality depends on it. Typical CF uses users' explicit numeric ratings of items (i.e., quantitative information) when computing these similarities. In other words, numeric ratings have been the sole source of user preference information in traditional CF. However, ratings do not always fully reflect users' actual preferences. According to several studies, users may more readily accept recommendations from people they consider reliable when purchasing goods. Thus, trust relationships can be regarded as an informative source for identifying user preferences accurately. Against this background, we propose a new hybrid recommender system that fuses CF and social network analysis (SNA). The proposed system adopts a recommendation algorithm that additionally reflects the results of SNA. In detail, the system is based on conventional memory-based CF, but it is designed to use both users' numeric ratings and the trust relationships between users when calculating user similarities. To this end, the system creates and uses not only a user-item rating matrix but also a user-to-user trust network. 
We propose two alternatives for calculating similarity between users. The first calculates the degree of similarity between users by using in-degree and out-degree centrality, which are indices representing central location in the social network; we named these approaches 'Trust CF - All' and 'Trust CF - Conditional'. The second is an algorithm that weights a neighbor's score more heavily when the target user trusts that neighbor directly or indirectly. Direct or indirect trust relationships can be identified by searching the users' trust network. In this study, we call this approach 'Trust CF - Search'. To validate the applicability of the proposed system, we used experimental data provided by LibRec, which were crawled from the entire FilmTrust website. The data consist of movie ratings and a trust network indicating which users trust which others. The experimental system was implemented using Microsoft Visual Basic for Applications (VBA) and UCINET 6. To examine the effectiveness of the proposed system, we compared its performance with that of a conventional CF system. Performance was evaluated using average MAE (mean absolute error). The results confirmed that when the in-degree centrality index of the users' trust network was applied unconditionally (i.e., Trust CF - All), the accuracy (MAE = 0.565134) was lower than that of conventional CF (MAE = 0.564966). When the in-degree centrality index was applied only to users with out-degree centrality above a certain threshold (i.e., Trust CF - Conditional), the proposed system improved accuracy slightly (MAE = 0.564909) compared to traditional CF. The algorithm that searches the users' trust network (i.e., Trust CF - Search) showed the best performance (MAE = 0.564846). 
A paired-samples t-test showed that Trust CF - Search outperformed conventional CF at the 10% statistical significance level. Our study sheds light on applying users' trust relationship network information to facilitate electronic commerce by recommending proper items to users.
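
The trust search described in this abstract (finding whether a target user trusts a neighbor directly or indirectly) can be illustrated with a breadth-first search over a directed trust graph. This is only a minimal sketch, not the authors' implementation: the function names `trust_hops` and `boosted_weight`, the `max_depth` cutoff, and the `boost` factor are all hypothetical, since the abstract does not state how far the search goes or how much a trusted neighbor's score is raised.

```python
from collections import deque


def trust_hops(trust, source, target, max_depth=2):
    """Return the number of trust edges from source to target, or None.

    trust maps each user to the users they trust (directed edges).
    max_depth is a hypothetical cutoff on how indirect the trust may be.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the cutoff
        for nxt in trust.get(user, ()):
            if nxt == target:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def boosted_weight(similarity, hops, boost=1.5):
    """Scale a neighbor's similarity weight when a trust path exists.

    The multiplicative boost is an assumption made for illustration only.
    """
    if hops is not None and hops > 0:
        return similarity * boost
    return similarity


# Toy trust network: u1 trusts u2, and u2 trusts u3.
trust = {"u1": ["u2"], "u2": ["u3"], "u3": []}
direct = trust_hops(trust, "u1", "u2")      # direct trust
indirect = trust_hops(trust, "u1", "u3")    # indirect trust via u2
missing = trust_hops(trust, "u3", "u1")     # no trust path
```

In a memory-based CF prediction, a weight like the one returned by `boosted_weight` would stand in for the plain similarity when averaging neighbors' ratings.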

Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.111-125 / 2011
  • Clustering is a process of grouping similar or relevant documents into a cluster and assigning a meaningful concept to the cluster. By this process, clustering facilitates fast and correct search for the relevant documents by narrowing down the range of searching only to the collection of documents belonging to related clusters. For effective clustering, techniques are required for identifying similar documents and grouping them into a cluster, and discovering a concept that is most relevant to the cluster. One of the problems often appearing in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and also could not validate the semantic hierarchical relationship between a complex concept and each of simple concepts. In order to solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm that modified the traditional Agglomerative Hierarchical Clustering algorithm to allow overlapped clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not by a tree but by a lattice to detect complex concepts. We developed a system that employs the HOC algorithm to carry out the goal of complex concept detection. This system operates in three phases; 1) the preprocessing of documents, 2) the clustering using the HOC algorithm, and 3) the validation of semantic hierarchical relationships among the concepts in the lattice obtained as a result of clustering. The preprocessing phase represents the documents as x-y coordinate values in a 2-dimensional space by considering the weights of terms appearing in the documents. 
First, the documents are refined by applying stopword removal and stemming to extract index terms. Then, each index term is assigned a TF-IDF weight, and the x-y coordinate value for each document is determined by combining the TF-IDF values of the terms in it. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated using the Euclidean distance. Initially, a cluster is generated for each document by grouping the documents that are closest to it. Then, the distance between every pair of clusters is measured, and the closest clusters are grouped into a new cluster. This process is repeated until the root cluster is generated. In the validation phase, a feature selection method is applied to check whether the cluster concepts built by the HOC algorithm have meaningful hierarchical relationships. Feature selection is a method of extracting key features from a document by identifying important and representative terms in the document and assigning weight values to them. To select key features correctly, a method is needed to determine how much each term contributes to the class of the document. Among several methods for this purpose, this paper adopted the $\chi^2$ statistic, which measures the degree of dependency of a term t on a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations was carried out using the well-known Reuters-21578 news collection. The results showed that the HOC algorithm greatly contributes to detecting and producing complex concepts by generating the concept hierarchy in a lattice structure.
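
For readers unfamiliar with the $\chi^2$ statistic mentioned in this abstract, the sketch below computes it from a 2x2 term/class contingency table using the form commonly used for text feature selection, $\chi^2(t,c) = N(AD - CB)^2 / ((A+C)(B+D)(A+B)(C+D))$. The abstract does not give the exact variant the authors used, so the function name `chi_square` and the count names `a`, `b`, `c`, `d` are assumptions for illustration.

```python
def chi_square(a, b, c, d):
    """Chi-square dependency between a term t and a class c.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    d: documents outside the class that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the statistic is undefined, report no dependency
    return n * (a * d - c * b) ** 2 / denom


# Term and class are independent: the term appears equally often inside and outside the class.
independent = chi_square(10, 10, 10, 10)
# Term appears in every document of the class and nowhere else.
dependent = chi_square(20, 0, 0, 20)
```

A higher value indicates that the term is more strongly tied to the class, which is why such scores can be used to rank candidate key features.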

How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores (평점과 리뷰 텍스트 감성분석을 결합한 추천시스템 향상 방안 연구)

  • Hyun, Jiyeon;Ryu, Sangyi;Lee, Sang-Yong Tom
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.219-239 / 2019
  • As providing customized services to individuals becomes more important, research on personalized recommendation systems is constantly being carried out. Collaborative filtering is one of the most popular approaches in academia and industry. However, it has a limitation in that recommendations have mostly been based on quantitative information such as users' ratings, which lowers accuracy. To solve this problem, many studies have attempted to improve the performance of recommendation systems by using information other than quantitative ratings. A good example is the use of sentiment analysis on customer review text. Nevertheless, existing research has not directly combined the results of sentiment analysis with quantitative rating scores in the recommendation system. Therefore, this study aims to reflect the sentiments expressed in reviews in the rating scores. In other words, we propose a new algorithm that converts a user's own review into quantitative information and reflects it directly in the recommendation system. To do this, we needed to quantify users' reviews, which are originally qualitative information. In this study, sentiment scores were calculated through the sentiment analysis technique of text mining. The data consisted of movie reviews. Based on the data, a domain-specific sentiment dictionary was constructed for movie reviews. Regression analysis was used to construct the sentiment dictionary: positive and negative dictionaries were each constructed using Lasso regression, Ridge regression, and ElasticNet. The accuracy of each constructed dictionary was verified through a confusion matrix. The accuracy of the Lasso-based dictionary was 70%, that of the Ridge-based dictionary was 79%, and that of the ElasticNet (${\alpha}=0.3$) dictionary was 83%. 
Therefore, in this study, the sentiment score of each review is calculated using the ElasticNet-based dictionary and combined with the rating to create a new rating. We show that collaborative filtering that reflects the sentiment scores of user reviews is superior to the traditional method that considers only the existing ratings. To show this, the proposed approach is applied to memory-based user-based collaborative filtering (UBCF), item-based collaborative filtering (IBCF), and the model-based matrix factorization methods SVD and SVD++. For each algorithm, the mean absolute error (MAE) and the root mean square error (RMSE) are calculated to compare a recommendation system that uses ratings combined with sentiment scores against a system that considers ratings only. In terms of MAE, performance improved by 0.059 for UBCF, 0.0862 for IBCF, 0.1012 for SVD and 0.188 for SVD++. In terms of RMSE, it improved by 0.0431 for UBCF, 0.0882 for IBCF, 0.1103 for SVD, and 0.1756 for SVD++. These results confirm that collaborative filtering that reflects the sentiment scores of user reviews is more accurate than conventional collaborative filtering that considers only quantitative scores. A paired t-test then confirmed that the proposed model is the better approach. In this study, to overcome the limitation of previous research that judges user sentiment only by quantitative rating scores, reviews were quantified so that users' opinions could be reflected in the recommendation system in a more refined way to improve accuracy. 
The findings of this study are expected to have managerial implications for recommendation system developers who need to consider both quantitative and qualitative information. The way of constructing the combined system in this paper might be used directly by developers.
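
The two evaluation indices named in this abstract, MAE and RMSE, are standard error measures over pairs of actual and predicted ratings. The sketch below shows how they are computed; the function names and the toy ratings are illustrative assumptions and are not taken from the paper's data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Toy example: three held-out ratings and a model's predictions for them.
actual = [4, 3, 5]
predicted = [3, 3, 3]
toy_mae = mae(actual, predicted)
toy_rmse = rmse(actual, predicted)
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is one reason the two indices are often reported together.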

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news for queries separated into "subject-predicate" form, and classify the proper documents. 2) Determine whether each sentence is suitable for information extraction and derive a confidence score. 3) Based on the predicate feature, extract the information from the proper sentences and derive the overall confidence of the extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from SK Telecom's artificial intelligence speaker. The proposed system showed a higher performance index than the existing baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it, we developed a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information from various types of unstructured documents more effectively than the baseline model. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data. 
In addition, by predicting the suitability of documents and sentences for information extraction before the extraction step, this study can prevent unnecessary extraction attempts on documents that do not contain the answer. It is meaningful that we provide a method by which precision can be maintained even in an actual web environment. Information extraction for knowledge base expansion targets unstructured documents on the real web, so it cannot be guaranteed that a document contains the correct answer. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from documents that contain no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open-source KoNLPy Python package, and information extraction can fail when the morphological analysis is not performed properly. To improve extraction results, a more advanced morpheme analyzer needs to be developed. The second is entity ambiguity. The information extraction system in this study cannot distinguish between identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the person the query intended. 
Future research needs to take measures to distinguish people with the same name. The third concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed an evaluation data set using 800 documents (400 questions * 7 articles per question (1 Wikipedia, 3 Naver encyclopedia, 3 Naver news)) by judging whether a correct answer is included or not. To ensure the external validity of the study, it is desirable to use more queries to determine the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, so that results can be evaluated more objectively.

Prognostic Evaluation of Categorical Platelet-based Indices Using Clustering Methods Based on the Monte Carlo Comparison for Hepatocellular Carcinoma

  • Guo, Pi;Shen, Shun-Li;Zhang, Qin;Zeng, Fang-Fang;Zhang, Wang-Jian;Hu, Xiao-Min;Zhang, Ding-Mei;Peng, Bao-Gang;Hao, Yuan-Tao
    • Asian Pacific Journal of Cancer Prevention / v.15 no.14 / pp.5721-5727 / 2014
  • Objectives: To evaluate the performance of clustering methods used in the prognostic assessment of categorical clinical data for hepatocellular carcinoma (HCC) patients in China, and to establish a predictive prognostic nomogram for clinical decisions. Materials and Methods: A total of 332 newly diagnosed HCC patients treated with hepatic resection during 2006-2009 were enrolled. Patients were regularly followed up at outpatient clinics. Clustering methods including average linkage, k-modes, fuzzy k-modes, PAM, CLARA, protocluster, and ROCK were compared by Monte Carlo simulation, and the optimal method was applied to investigate the clustering pattern of the indices, including platelet count, platelet/lymphocyte ratio (PLR) and serum aspartate aminotransferase activity/platelet count ratio index (APRI). Then the clustering variable, age group, tumor size, number of tumors and vascular invasion were studied in a multivariable Cox regression model. A prognostic nomogram was constructed for clinical decisions. Results: ROCK performed best in both the overlapping and non-overlapping cases and was used to assess the prognostic value of platelet-based indices. Patients with categorical platelet-based indices split significantly across two clusters, and those with high values had a high risk of HCC recurrence (hazard ratio [HR] 1.42, 95% CI 1.09-1.86; p<0.01). Tumor size, number of tumors and blood vessel invasion were also associated with a high risk of HCC recurrence (all p<0.01). The nomogram predicted HCC patient survival well at 3 and 5 years. Conclusions: A cluster of platelet-based indices combined with other clinical covariates could be used for prognosis evaluation in HCC.

A Study on the Improvement of Geriatric Sarcopenia by Non-face-to-face Intervention Method (비대면 중재 방법에 따른 노인성 근감소증의 개선에 대한 연구)

  • Myung-Chul Kim;Ju-Hyung Park;Min-Ji Kwon;Beom-Seok Kim;Min-Kyung Park;Seo-Yoon Park;Sung-Jin Park;Si-Yeon Park;Jung-Hu Park;Joon-Woo Song;Jong-Hyun Yu;Jung-Hyun Lee;Ji-Hyung Lee;Hae-In Kim
    • Journal of The Korean Society of Integrative Medicine / v.12 no.1 / pp.49-62 / 2024
  • Purpose: This study compared two non-face-to-face exercise interventions, which differed in whether mobile applications and wearable exercise aids were used, to find out which is more effective in improving senile sarcopenia. Ultimately, it was conducted to provide basic data for developing non-face-to-face intervention methods to improve sarcopenia. Method: In this study, 18 adults aged 65 or older with sarcopenia or possible sarcopenia were randomly assigned to a digital exercise intervention group or a self-exercise intervention group. The digital exercise intervention group performed eight exercise programs with mobile applications and wearable exercise aids that recorded and managed the participants' performance of the programs in real time. The self-exercise intervention group performed the same programs on its own. The intervention was applied for 8 weeks, and sarcopenia evaluation and physical function evaluation were performed before and after the intervention. Results: In the digital exercise intervention group, arm muscle mass, skeletal muscle index, SPPB, 5TSTS, and BBS improved, and in the self-exercise intervention group, grip strength, SPPB, 5TSTS, and BBS improved. Conclusion: Both interventions were effective in improving physical performance and physical function; the digital exercise intervention was effective in improving muscle mass, and the self-exercise intervention was effective in improving muscle strength. Therefore, this study proposes applying the intervention methods separately according to the indicator to be improved in order to improve and prevent sarcopenia, simplifying the instructions of the applications used to improve sarcopenia, and creating an environment in which users can be trained regularly on how to use them. 
In the future, studies on the development of devices designed to support non-face-to-face exercise interventions, and studies on the differences between face-to-face and non-face-to-face exercise interventions in terms of their effect on sarcopenia, should be conducted.

Studies on Boil-off Loss Ratio in the Cocoon Shells of Multivoltine${\times}$Bivoltine Hybrids of Silkworm, Bombyx mori L.

  • Rao, D.Raghavendra;Singh, Ravindra;Premalatha, V.;Sudha, V.N.;Kariappa, B.K.;Dandin, S.B.
    • International Journal of Industrial Entomology and Biomaterials / v.8 no.1 / pp.101-106 / 2004
  • The removal of the gummy proteinaceous material sericin from silk is commonly referred to as degumming loss or boil-off loss ratio. In the present study, the boil-off loss ratio in the cocoon shells of twelve multivoltine${\times}$bivoltine hybrids and their parents was analysed. The inheritance pattern of boil-off loss ratio was analysed in crosses involving high and low boil-off loss parents, F$_1$s, F$_2$s and back-crosses by parent-offspring regression analysis. Heterosis and heterobeltiosis were also analysed for this character. Highly significant (P>0.01) variations were observed in eight out of ten multivoltine and two out of five bivoltine parents, indicating the presence of genetic variation in the expression of boil-off loss ratio. Among F$_1$ hybrids, ten hybrids expressed significant (P>0.01) variations when compared with the control hybrid PM${\times}$NB$_4$D$_2$. Significant negative heterosis, which is desirable for this character, was expressed in three multi ${\times}$ bi hybrids viz., BL67${\times}$CSR$_{101}$, 96A${\times}$CSR$_{19}$ and 96C${\times}$CSR$_{19}$, whereas heterobeltiosis was significant in the desired direction in only one hybrid, 96C${\times}$CSR$_{18}$. Studies on the inheritance pattern showed that the character is heritable, with female and male contributions in the ratio of 50.9:49.1, and it appears that both parents influence the expression of boil-off loss ratio in silkworm. Based on overall performance, evaluation by the multiple trait evaluation index, and the expression of the boil-off loss ratio, three hybrids viz., BL67 ${\times}$ CSR$_{101}$, 96A${\times}$CSR$_{19}$ and 96C${\times}$CSR$_{18}$ were found superior and are recommended for commercial exploitation.

A Study on the Efficiency of the Export Support Policy for the SME in Korea (한국 중소기업수출지원정책의 효율화 방안)

  • Choi, Jae-Han
    • The Journal of the Korea Contents Association / v.18 no.3 / pp.114-123 / 2018
  • Korea's export support policy has shifted its focus from conglomerates to SMEs since the 1998 IMF financial crisis. As a result, SME exports in 2011 exceeded US$ 100 billion for the first time. However, the trend has remained stagnant since 2013. This stagnation is judged to stem in part from export support policies that fail to significantly enhance export competitiveness. Therefore, in order to expand the base of SMEs' export capabilities and enhance their export competitiveness, the researcher analyzed the problems of the export support policy identified in major prior studies since 2010 and derived ways to improve its efficiency. The results of this study are as follows. First, it is necessary to select or combine the following measures: coordinating or combining the functions of export support institutions, operating a single export support institution, using a cooperative support system among the support institutions, and making use of private enterprises. Related measures to review include functional adjustment and integration among export support agencies, adjustment of support organizations by export stage, and coordination of roles between the Small and Medium Business Administration and local governments. Second, it is necessary to build a customized support system for enterprises. Third, in order to secure the manpower and expertise of the support organizations, it is necessary to review the use of retired personnel from trading companies or a youth intern system. Fourth, a balanced performance index is required for export support programs of a certain scale, and the share of external evaluation needs to be increased alongside quantitative and qualitative evaluation.

The Study of Correlation Between the Balance, Cognition and Activity of Daily Living in Stroke Patients (뇌졸중 환자의 균형, 인지, 일상생활 평가의 상관성 연구)

  • Kang, Bo-Ra;Jeong, Eun-Song;Kim, Jae-Hee;Ha, Yoo-Na
    • Journal of Korean Society of Neurocognitive Rehabilitation / v.10 no.2 / pp.45-52 / 2018
  • The purpose of the present study was to determine the correlations among the Berg Balance Scale (BBS), Montreal Cognitive Assessment-Korean (MoCA-K) and Modified Barthel Index (MBI) in stroke patients, and to analyze the influence among these factors in order to establish a basis for evaluating the functional performance capability of stroke patients. The study was conducted between December 2017 and March 2018, and the subjects were 34 stroke patients hospitalized and treated at Y rehabilitation hospital in Goyang city. The selection criteria were as follows. First, a person without an onset of 6 months or more. Second, a person who can communicate and scores over 20 points on the MMSE-K. Third, a person without unilateral neglect. Fourth, a person without lower motor neuron lesions or orthopedic disease of the bilateral lower extremities. Fifth, a person without audiovisual problems and without a history of drug use or surgery that influences motor function. Sixth, a person who agreed to participate in the study. The evaluation was carried out by an occupational therapist and a physical therapist, who measured BBS, MoCA-K, and MBI. For safety reasons, one assistant also took part when balance ability was measured. Significant correlations (p<.01) were found between BBS and MoCA-K (r=.459), BBS and MBI (r=.550), and MoCA-K and MBI (r=.565). This study is meaningful in that it provides a basis for the active use of BBS, MoCA-K and MBI as clinical evaluation tools and demonstrates their usefulness.

A Study on the Altmetrics of the Papers of Library and Information Science Researchers Published in International Journals (국제 학술지에 발표된 문헌정보학 연구자 논문의 알트메트릭스에 관한 연구)

  • Jane Cho
    • Journal of Korean Library and Information Science Society / v.53 no.4 / pp.143-162 / 2022
  • Altmetrics is an alternative impact evaluation index that measures social interest in the research performance of individuals or institutions and is used by universities, research institutions, and research funding institutions. This study empirically analyzed, using Altmetric Explorer, what kind of attention papers by domestic library and information science researchers published in international academic journals were receiving in the international community. As a result of the analysis, 230 papers were tracked. The average Altmetric Attention Score (AAS) was 6.63, but 2 papers received overwhelming attention (over 170 points) because they were mentioned in news reports and on Twitter. Second, AAS tended to be high when a domestic researcher participated as a co-author and the main author belonged to an overseas institution, and when the research was funded by foreign government agencies. In addition to papers in library and information science or information systems, papers classified in the fields of public health services and education showed high AAS, and it was confirmed that these papers were published in journals of various fields such as life science. Finally, there was a weak correlation of r = 0.25 between the AAS and the number of citations of the analyzed papers, but a strong correlation of r = 0.68 between the number of Mendeley readers and the number of citations.