• Title/Summary/Keyword: Group-Key


Resolving the 'Gray sheep' Problem Using Social Network Analysis (SNA) in Collaborative Filtering (CF) Recommender Systems (소셜 네트워크 분석 기법을 활용한 협업필터링의 특이취향 사용자(Gray Sheep) 문제 해결)

  • Kim, Minsung;Im, Il
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.137-148
    • /
    • 2014
  • Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving this need. Many past studies of recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of them. Despite its success, however, CF has several shortcomings, such as the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who have no evaluations or preference information, therefore, CF cannot come up with recommendations (cold-start problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (gray sheep problem). This study proposes a new algorithm that utilizes Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and are isolated from them. Therefore, gray sheep can be identified by calculating the degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. 
Then, different similarity measures and recommendation methods are applied to these two datasets. The algorithm in more detail is as follows. Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user). Step 2: Calculate the degree centrality of each node and separate out those nodes whose degree centrality is lower than a pre-set threshold. The threshold value is determined by simulations such that the accuracy of CF for the remaining dataset is maximized. Step 3: An ordinary CF algorithm is applied to the remaining dataset. Step 4: Since the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by the numbers of nodes and summed to be used as the final performance metric. To test the performance improvement from this new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users on 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm using the 'Best-N-neighbors' and 'Cosine' similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used. Past studies to improve CF performance typically used additional information other than users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it uses SNA to separate the dataset. It shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. This study has several theoretical and practical implications. It empirically shows that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens a door for future studies that apply SNA to CF to analyze dataset characteristics. In practice, this study provides guidelines for improving the performance of CF recommender systems with a simple modification.
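The four steps in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: linking two users when they share at least one rated item, the toy ratings, the threshold value, and all function names are assumptions made for illustration, and the ordinary CF step (Step 3) is omitted. The weighted F measure is normalized by the total number of users here, which is also an assumption.

```python
from collections import Counter
from itertools import combinations


def one_mode_projection(ratings, min_common=1):
    """Step 1: turn user-item data into a user-user network.

    Assumption: two users are linked when they share at least
    `min_common` rated items.
    """
    users = sorted(ratings)
    neighbors = {u: set() for u in users}
    for u, v in combinations(users, 2):
        if len(ratings[u] & ratings[v]) >= min_common:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def split_gray_sheep(neighbors, threshold):
    """Step 2: users whose degree is below the threshold are gray sheep."""
    gray = {u for u, nbrs in neighbors.items() if len(nbrs) < threshold}
    return gray, set(neighbors) - gray


def popular_items(ratings, user, n=2):
    """Step 4: recommend the most-rated items the user has not rated yet."""
    counts = Counter(item for items in ratings.values() for item in items)
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return [item for item in ranked if item not in ratings[user]][:n]


def weighted_f(f_main, n_main, f_gray, n_gray):
    """Combine the two F measures, weighted by the number of users in each set."""
    return (f_main * n_main + f_gray * n_gray) / (n_main + n_gray)


# Hypothetical toy data: each user maps to the set of items they rated.
ratings = {
    "ann": {"A", "B", "C"},
    "bob": {"A", "B"},
    "cat": {"B", "C"},
    "dan": {"Z"},  # shares no item with anyone
}
network = one_mode_projection(ratings)
gray, others = split_gray_sheep(network, threshold=1)
print(gray, popular_items(ratings, "dan"))
```

In the study the threshold is tuned by simulation so that CF accuracy on the remaining users is maximized; the fixed value used here only keeps the example small.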

  • A Study on the Realities and the Subject of Environmental Management for Small and Medium-Sized Companies in Gangwon Area (강원지역 중소기업의 환경경영 실태와 과제)

    • Jeon, Yeong-Seung;Park, Eun-Jeong
      • Korean Business Review
      • /
      • v.17
      • /
      • pp.53-81
      • /
      • 2004
    • The purpose of this study is to understand the current state and challenges of environmental management among small and medium-sized companies in the Gangwon area by surveying the status of ISO 14001 certification, and to seek ways to facilitate environmental management. The key results are as follows. First, while 1,215 companies in Korea had acquired ISO 14001 certification as of April 2003, only 26 small and medium-sized companies in the Gangwon area had obtained it, the lowest level among metropolitan municipalities. Second, companies that had not acquired the certification gave, as their reason for not pursuing it, that 'the costs of acquiring and maintaining the certification are larger than the practical benefit.' Third, for both companies that had not yet acquired ISO 14001 certification and companies that had acquired (or were trying to acquire) it, the biggest reason for certification was 'enhancement of corporate image,' and the main effect reported by certified companies after introducing the environmental management system was likewise 'improvement of corporate image.' Fourth, many companies that had acquired ISO 14001 certification pointed to 'the burden of document creation and costs' and 'lack of manpower' as problems when introducing the environmental management system. Based on these results, the challenges and plans for promoting environmental management among small and medium-sized companies in the Gangwon area are as follows. First, because most companies that have not obtained ISO 14001 certification have low awareness of ISO 14001, continuous and active publicity, education, and a training system are needed. 
Second, an educational program to nurture professional staff should be carried out to address the lack of manpower in environmental management, subsidies should be expanded, dedicated departments and consulting contact points should be opened, the relevant information should be organized into a database, and software should be developed. Third, so that the certification can be obtained at low cost and through simple procedures, the creation of a public certification system for small and medium-sized companies, a group certification system, an industrial-complex certification system, and similar schemes should be actively considered.


    Development of Korean Healthy Eating Index for adults using the Korea National Health and Nutrition Examination Survey data (국민건강영양조사 자료를 이용한 한국 성인의 식생활평가지수 개발)

    • Yook, Sung-Min;Park, Sohee;Moon, Hyun-Kyung;Kim, Kirang;Shim, Jae Eun;Hwang, Ji-Yun
      • Journal of Nutrition and Health
      • /
      • v.48 no.5
      • /
      • pp.419-428
      • /
      • 2015
    • Purpose: This study was conducted to develop the Korean Healthy Eating Index (KHEI) for assessing adherence to national dietary guidelines and the overall diet quality of healthy Korean adults, using data from the 5th Korea National Health and Nutrition Examination Survey (KNHANES). Methods: The candidate components of the KHEI were selected based on literature reviews, dietary guidelines for Korean adults, the 2010 Dietary Reference Intakes for Koreans (2010 KDRI), and the objectives of HP 2020. The associations between candidate components and the risks of obesity, abdominal obesity, and metabolic syndrome were assessed using the 5th KNHANES data. An expert review process was also performed. Results: Diets that meet the food group recommendations for each energy level receive maximum scores for the 9 adequacy components of the index. Scores for amounts between zero and the standard are prorated linearly. For three of the five moderation components, population probability densities were examined when setting the standards for minimum and maximum scores. The maximum total score for the 14 components is 100 points, and each component has a maximum score of either 5 points (fruit intake excluding juice, fruit intake including juice, vegetable intake excluding kimchi and pickles, vegetable intake including kimchi or pickles, ratio of white meat to red meat, whole grains intake, refined grains intake, and percentage of energy intake from carbohydrate) or 10 points (protein foods intake, milk and dairy food intake, having breakfast, sodium intake, percentage of energy intake from empty-calorie foods, and percentage of energy intake from fat). The KHEI is a measure of diet quality as specified by the key diet recommendations of the dietary guidelines and the 2010 KDRIs. Conclusion: The KHEI will be used as a tool for monitoring the diet quality of the Korean population and subpopulations, for evaluating nutrition interventions, and for research.
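The scoring arithmetic in this abstract can be checked with a short Python sketch. The component labels are shortened from the abstract; the function name `prorated_score` and the intake and standard values in the usage line are hypothetical, and only the linear prorating described for adequacy components is shown, not the separate standards used for moderation components.

```python
# Components with a 5-point maximum, as listed in the abstract.
FIVE_POINT = [
    "fruit excluding juice",
    "fruit including juice",
    "vegetables excluding kimchi and pickles",
    "vegetables including kimchi or pickles",
    "white-to-red meat ratio",
    "whole grains",
    "refined grains",
    "energy from carbohydrate",
]
# Components with a 10-point maximum, as listed in the abstract.
TEN_POINT = [
    "protein foods",
    "milk and dairy",
    "breakfast",
    "sodium",
    "energy from empty-calorie foods",
    "energy from fat",
]
MAX_SCORE = {**{name: 5 for name in FIVE_POINT}, **{name: 10 for name in TEN_POINT}}


def prorated_score(intake, standard, max_score):
    """Score rises linearly from 0 at zero intake to max_score at the standard."""
    if standard <= 0:
        raise ValueError("standard must be positive")
    fraction = min(max(intake / standard, 0.0), 1.0)
    return max_score * fraction


# 8 components x 5 points + 6 components x 10 points = 100 points in total.
print(len(MAX_SCORE), sum(MAX_SCORE.values()))
# Hypothetical example: half of the recommended amount earns half of the points.
print(prorated_score(1.0, 2.0, MAX_SCORE["whole grains"]))
```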

    Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

    • Kim, Jeongmin;Ryu, Kwang Ryel
      • Journal of Intelligence and Information Systems
      • /
      • v.21 no.4
      • /
      • pp.1-16
      • /
      • 2015
    • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. To reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, and driver training programs. Records of traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationships between traffic accidents and related factors such as vehicle design, road design, weather, and driver behavior. Insights derived from such analyses can be used in accident prevention. Traffic accident data mining is the activity of finding useful knowledge about these relationships that is not well known and that may interest the user. Many studies on mining accident data have been reported over the past two decades. Most of them focused on predicting the risk of accidents from accident-related factors, using supervised learning methods such as decision trees, logistic regression, k-nearest neighbor, and neural networks. However, the prediction models derived from these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups themselves are still not easy for humans to understand, so additional analysis is necessary. Rule-based learning methods are appropriate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent the relationships between the target feature and other features. 
Rules are fairly easy for humans to understand, so they can provide insight and comprehensible results. Association rule learning and subgroup discovery are representative rule-based learning methods for descriptive tasks. These two methods have been used in a wide range of areas, including transaction analysis, accident data analysis, detection of statistically significant patient risk groups, and discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns in a traffic accident dataset consisting of many features, including driver profile, accident location, accident type, vehicle information, and regulation violations. The association rule learning method, which is an unsupervised learning method, searches for frequent item sets in the data and translates them into rules. In contrast, the subgroup discovery method is a supervised learning method that discovers rules for user-specified concepts that satisfy a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, postprocessing steps make the rule set more compact and easier to understand by removing uninteresting or redundant rules. We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. The experiments reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features. 
Under a supervised learning setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires a lot of effort to tune its parameters.
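To make the contrast between the two rule types concrete, the following Python sketch computes support and confidence for an association rule and a quality score for a subgroup. The abstract does not name its quality measures; weighted relative accuracy (WRAcc), a common measure that combines generality and unusualness, is used here as an assumption. The accident records and feature values are hypothetical.

```python
def support(rows, items):
    """Fraction of records that contain every item in `items`."""
    return sum(items <= row for row in rows) / len(rows)


def confidence(rows, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent.

    Raises ZeroDivisionError when no record matches the antecedent.
    """
    return support(rows, antecedent | consequent) / support(rows, antecedent)


def wracc(rows, condition, target):
    """Weighted relative accuracy: P(cond) * (P(target | cond) - P(target)).

    Computed in the equivalent form P(cond and target) - P(cond) * P(target).
    """
    p_cond = support(rows, condition)
    p_target = support(rows, target)
    p_both = support(rows, condition | target)
    return p_both - p_cond * p_target


# Hypothetical accident records, each a set of feature values.
rows = [
    {"rain", "night", "severe"},
    {"rain", "night", "severe"},
    {"rain", "day", "minor"},
    {"clear", "day", "minor"},
]
rule_if = {"rain", "night"}
rule_then = {"severe"}
print(support(rows, rule_if | rule_then))    # share of records the rule covers
print(confidence(rows, rule_if, rule_then))  # how often the rule holds when it applies
print(wracc(rows, rule_if, rule_then))       # unusualness of the subgroup w.r.t. the target
```

In the association rule setting any item may appear on either side of a rule, whereas a subgroup discovery run fixes the target (here `severe`) in advance, which corresponds to the supervised mode described in the abstract.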

    The Study on the Priority of First Person Shooter game Elements using Delphi Methodology (FPS게임 구성요소의 중요도 분석방법에 관한 연구 1 -델파이기법을 이용한 독립요소의 계층설계와 검증을 중심으로-)

    • Bae, Hye-Jin;Kim, Suk-Tae
      • Archives of design research
      • /
      • v.20 no.3 s.71
      • /
      • pp.61-72
      • /
      • 2007
    • Since "Space War", the first game, was produced by MIT in the 1960s, the gaming industry has expanded rapidly and grown large over a short period of time. Newly launched games combine so many different elements into a single piece of content that games are often called 'the most comprehensive ultimate fruits' of design technologies. This also means a large increase in the number of things that must be considered when developing games, which complicates planning of the budget, the workforce, and the time to be committed. Therefore, an approach for analyzing the elements that make up a game, computing the importance of each, and assessing games to be developed in the future is key to successful game development. Such planning requires many decision-making activities, and decision-making involves several difficulties: the multi-factor problem; the uncertainty problem, which impedes the elements from being quantified; the complex multi-purpose problem, in which the aims of the outcome are confused among decision-makers; and the problem of determining the priority order across the multiple stages leading to a decision. In this study we propose AHP (Analytic Hierarchy Process) so that these problems can be addressed comprehensively and a logical, rational alternative can be proposed through the quantification of "uncertain" data. The analysis takes FPS (first-person shooter) games, which currently dominate the gaming industry, as its subject. The most important considerations in conducting an AHP analysis are to group the elements of the subject accurately and objectively, to arrange them hierarchically, and to analyze their importance through pair-wise comparison between the elements. 
The study consists of two parts: analyzing these elements and computing their relative importance, and choosing an alternative. Of these, this paper focuses in particular on the Delphi technique-based objective analysis of elements and the design of their hierarchy for FPS games.
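The pair-wise comparison step of AHP mentioned above can be sketched in Python. This is an illustration only: it uses the row geometric mean approximation of the priority vector rather than whatever procedure the authors applied, and the three game elements and their comparison values are hypothetical.

```python
import math


def ahp_priorities(matrix):
    """Approximate AHP priority weights with the row geometric mean method.

    matrix[i][j] states how much more important element i is than element j,
    and matrix[j][i] is expected to hold the reciprocal value.
    """
    n = len(matrix)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical comparison of three game elements (for example graphics, sound, story):
# the first is judged twice as important as the second and four times the third.
comparisons = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_priorities(comparisons)
print([round(w, 3) for w in weights])
```

For a perfectly consistent matrix like this one, the weights reproduce the stated ratios exactly (4:2:1); real expert judgments are rarely consistent, so a consistency check is normally added before the weights are used.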


    Overview of Research Trends in Estimation of Forest Carbon Stocks Based on Remote Sensing and GIS (원격탐사와 GIS 기반의 산림탄소저장량 추정에 관한 주요국 연구동향 개관)

    • Kim, Kyoung-Min;Lee, Jung-Bin;Kim, Eun-Sook;Park, Hyun-Ju;Roh, Young-Hee;Lee, Seung-Ho;Park, Key-Ho;Shin, Hyu-Seok
      • Journal of the Korean Association of Geographic Information Studies
      • /
      • v.14 no.3
      • /
      • pp.236-256
      • /
      • 2011
    • Changes in forest carbon stocks due to land use change are important data required by the UNFCCC (United Nations Framework Convention on Climate Change). Spatially explicit estimation of forest carbon stocks based on IPCC GPG (Intergovernmental Panel on Climate Change Good Practice Guidance) tier 3 gives high reliability. However, the current estimate, which is aggregated from national forest inventory (NFI) data, does not provide detailed forest carbon stocks by polygon or cell. To improve estimation, remote sensing and GIS have been used, especially in Europe and North America. We divided the research trends in major countries into four categories: remote sensing, GIS, geostatistics, and environmental modeling that considers spatial heterogeneity. The easiest approach to apply is combining NFI data with a GIS-based forest type map. Considering the especially complicated forest structure of Korea, geostatistics is useful for estimating local variation in forest carbon. In addition, fine-scale imagery is good for verifying forest carbon stocks and determining CDM (Clean Development Mechanism) sites. Related domestic research is still at an initial stage, and forest carbon stocks are mainly estimated using k-nearest neighbor (k-NN). To select a suitable method for forests in Korea, the applicability of diverse spatial data and algorithms must be considered, and the methods must be compared.
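As a rough illustration of the k-NN approach mentioned in the abstract, the Python sketch below estimates the carbon stock of a map cell as the mean of the k most similar field plots. Everything specific here is an assumption: the two-value feature vectors (for example image band values), the plot carbon values, the unweighted mean, and the function name `knn_estimate` are hypothetical and not taken from the reviewed studies.

```python
import math


def knn_estimate(plots, query, k=3):
    """Mean carbon stock of the k field plots closest to `query` in feature space.

    plots: list of (feature_vector, carbon_stock) pairs from field plots.
    query: feature vector of the cell to be estimated.
    """
    nearest = sorted(plots, key=lambda plot: math.dist(plot[0], query))[:k]
    return sum(carbon for _, carbon in nearest) / len(nearest)


# Hypothetical field plots: (feature vector, measured carbon stock).
plots = [
    ((0.0, 0.0), 10.0),
    ((1.0, 0.0), 20.0),
    ((0.0, 1.0), 30.0),
    ((5.0, 5.0), 100.0),
]
print(knn_estimate(plots, (0.2, 0.1), k=3))
```

Operational systems typically weight neighbors by distance and tune k against held-out plots; the plain mean is used here only to keep the idea visible.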

    Factors associated with tobacco and alcohol use (저소득층의 음주 및 흡연 관련 요인)

    • Choi, Eun-Jin;Kim, Chang-Woo
      • Korean Journal of Health Education and Promotion
      • /
      • v.25 no.5
      • /
      • pp.39-51
      • /
      • 2008
    • The objective of this study was to analyze the socioeconomic factors related to smoking and drinking behaviors using the Korea Welfare Panel data. The key variables were sex, age, frequency of visits to health and medical facilities, subjective health level, smoking level, drinking level, depression symptoms, and low income level. Since the health variables in the Welfare Panel data were limited, the analysis was exploratory. Among males older than 30, people in the low-income group were more likely to smoke cigarettes than those in the general-income population. In the chi-square analysis, the smoking rate differed significantly across age groups, gender, and income level. According to the descriptive analysis, persons with low income were more likely to engage in health risk behaviors and showed more medical service utilization. The utilization of local public health centers was 4.6% for the low-income level and 1% for the general level. A higher smoking rate was associated with younger age and lower income. The smoking rate in the 20 to 29 age category was 23.3% for the general level and 25% for the low-income level. On the other hand, the drinking rate was higher in general families: the rate of non-use of alcohol was 36.7% in general families and 58.4% in low-income families. For both smoking and high-risk drinking, demographic and sociological variables such as sex, age, education level, and income level were analyzed, and there were significant relationships. Health risk factors were serious for males, for those in their 20s and 30s, for those with a lower education level, and for those in low-income families. In general, females were more unhealthy. The rates of smoking and drinking were higher at the low-income level. 
Even in the 2005 health and nutrition survey results, persons in the low-income class had poorer health in terms of health level and degree of activity restriction. Since the effects of health promotion cannot be measured over a short period of time, it has not been easy to build an evidence base for its substantial effects. Factors related to health risks need to be studied continuously using data from diverse fields.

    Design Information Management System Core Development Using Industry Foundation Classes (IFC를 이용한 설계정보관리시스템 핵심부 구축)

    • Lee Keun-hyung;Chin Sang-yoon;Kim Jae-jun
      • Korean Journal of Construction Engineering and Management
      • /
      • v.1 no.2 s.2
      • /
      • pp.98-107
      • /
      • 2000
    • The increased use of computers in AEC (Architecture, Engineering and Construction) has expanded the amount of information produced by CAD (Computer Aided Design), PMIS (Project Management Information System), structural analysis programs, and scheduling programs, and has made that information more complex. The productivity of the AEC industry depends largely on good management and efficient reuse of this information. Accordingly, this trend has prompted much research and development on ITC (Information Technology in Construction) and CIC (Computer Integrated Construction). As part of this effort, many researchers have studied IFC (Industry Foundation Classes), developed by the IAI (International Alliance for Interoperability) for product-based information sharing. However, despite some valuable outputs, this research is still at a preliminary stage and deals mainly with conceptual ideas and trial implementations. Research that lays out the process of IFC application development, the core of a design information management system, and a plan for applying it still needs to be done. Thus, the purpose of this paper is to determine the technologies needed for a design information management system using IFC, and to present the key roles and process of IFC application development and a plan for applying it. The system integrates the architectural information and the structural information into the product model and groups the many product items by various levels and aspects. To build the process model, we defined two activities at the initial level, 'Product Modeling' and 'Application Development'. We then decomposed the Application Development activity into five activities: 'IFC Schema Compile', 'Class Compile', 'Make Project Database Schema', 'Development of Product Frameworker', and 'Make Project Database'. These activities are carried out with a C++ compiler, CAD, ObjectStore, ST-Developer, and ST-ObjectStore. 
Finally, we proposed an application process with six stages: '3D Modeling', 'Creation of Product Information', 'Creation and Update of Database', 'Reformation of Model's Structure with Multiple Hierarchies', 'Integration of Drawings and Specifications', and 'Creation of Quantity Information'. The IFCs, including other classes that will be updated and newly developed for construction, civil/structural engineering, and facility management, will be used by experts through internet distribution technologies including CORBA and DCOM.


    Mouse Embryonic Stem Cell Uptakes of Buforin 2 and pEP-1 Conjugated with EGFP (생쥐 배아 줄기세포의 Buforin 2 및 pEP-1에 결합된 EGFP의 세포 내 수송)

    • Jung, Su-Hyun;Park, Seong-Soon;Lim, Hyun-Jung;Cheon, Yong-Pil
      • Development and Reproduction
      • /
      • v.11 no.2
      • /
      • pp.111-119
      • /
      • 2007
    • Differentiation of cells can be induced through modulation of endogenous regulators using exogenous factors. Transfection systems for transporting a specific exogenous regulator into cells have been tried, but there are still many obstacles to overcome. In this study, we examined the transfection efficiency of cell-permeable peptides (CPPs) in mouse embryonic stem cells under various conditions. To identify the CPP-mediated translocation of a protein, we employed recombinant CPP-enhanced green fluorescent protein (EGFP). The viability of R1 cells differed between experimental groups depending on the kind of CPP and the concentration of CPP-EGFP. Translocation of CPP-EGFPs into the R1 cells was not detected until 30 min after CPP-EGFP treatment in any group. After 1 hr, translocation of pEP-1-EGFP-N was detected, but it could not be detected in the other group. Transfection of pEP-1-EGFP-N was independent of its concentration. The time course did not show saturation even after 24 hr for pEP-1-EGFP-N. These results showed that, for the CPPs examined, permeability depended on the kind of CPP and the location of the His-tag, and did not require biological energy. In summary, the efficiency of transfection of CPP-EGFP depends on the CPP sequence, but culture time is not a key factor in transfection of mouse embryonic stem cells. To improve the efficiency of protein translocation into embryonic stem cells in future studies, modified CPPs or mediators need to be developed. Such studies would be very useful for inducing the differentiation of embryonic stem cells.


    Effect of N-Methylacetamide Concentration on the Fertility and Hatchability of Cryopreserved Ogye Rooster Semen (N-Methylacetamide 동결 보호제의 농도가 오계 동결 정액의 수정 및 부화율에 미치는 영향)

    • Kim, Sung Woo;Choi, Jin Seok;Ko, Yeoung-Gyu;Do, Yoon-Jung;Byun, Mijeong;Park, Soo-Bong;Seong, Hwan-Hoo;Kim, Chong-Dae
      • Korean Journal of Poultry Science
      • /
      • v.41 no.1
      • /
      • pp.21-27
      • /
      • 2014
    • To preserve chicken genetic material as cryopreserved spermatozoa, various freezing agents such as glycerol, dimethyl sulfoxide, dimethylformamide, and dimethylacetamide have been used in rooster semen preparation. Recently, the use of N-methylacetamide (MA) for Ogye rooster semen preservation successfully resulted in hatched chicks. In this study, we investigated the effects of 7, 9, and 11% MA on the viability, fertility, and hatchability of frozen-thawed rooster semen using artificial insemination. For semen frozen with 7%, 9%, or 11% MA, the motile sperm rates were 35.16 ± 6.12%, 67.83 ± 15.3%, and 66.2 ± 16.3%; the fertility rates were 21.5%, 34.7%, and 25%; and the hatchability rates were 100%, 89.5%, and 87.5%, respectively. The results for the control group with frozen semen were a fertility rate of 96.0% and a hatchability rate of 92.2%. Based on these results, the suitable concentration range of MA as a freezing agent for rooster semen could be 7~9% of the medium. A concentration higher than 9% MA could decrease the fertility rate of thawed semen, but not the hatchability rate. The use of MA without affecting the fertility rate would therefore be a key point in freezing methods for rooster semen for poultry genetic resource preservation.

