• Title/Summary/Keyword: NetMiner4


User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.93-107 / 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed with access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence, and compare these with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network. After that, we compared the issue-clustering results from SAS with those of network analysis. The experimental dataset came from a website-ranking service and the largest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles of 5,000 panels over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
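
The network-merging phase described in this abstract can be illustrated with a minimal sketch. It assumes that the user-issue quasi-network is obtained by composing the two two-mode networks, so that a user's weight on an issue is the sum over articles of (visits to the article) x (the article's weight on the issue). The function name, the data layout, and the toy values are hypothetical; the paper itself performed this step in NetMiner and does not state the exact weighting.

```python
from collections import defaultdict


def merge_two_mode(user_article, article_issue):
    """Compose a user-article and an article-issue network into user-issue weights.

    Assumption: the merged weight is the sum over shared articles of
    visits * topic weight (a matrix product of the two incidence matrices).
    """
    user_issue = defaultdict(lambda: defaultdict(float))
    for user, articles in user_article.items():
        for article, visits in articles.items():
            # Articles without any extracted issue contribute nothing.
            for issue, weight in article_issue.get(article, {}).items():
                user_issue[user][issue] += visits * weight
    return {user: dict(issues) for user, issues in user_issue.items()}


# Hypothetical toy data: visit counts per user and topic weights per article.
user_article = {
    "u1": {"a1": 2, "a2": 1},
    "u2": {"a2": 3},
}
article_issue = {
    "a1": {"economy": 1.0},
    "a2": {"economy": 0.5, "sports": 0.5},
}

user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)
```

Each issue's column of weights across users can then serve as its profile when comparing issues for structural equivalence or clustering.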

A Study on the School Library Research Trends Using Topic Modeling (토픽모델링을 활용한 학교도서관 연구동향 분석)

  • Jung, Young-Joo;Kim, Hea-Jin
    • Journal of Korean Library and Information Science Society / v.51 no.3 / pp.103-121 / 2020
  • This study aimed to analyze research trends on school libraries from 1990 to July 2020. To this end, LDA topic modeling was applied to the abstracts of domestic articles related to school libraries. The corpus comprised 498 papers published in the four major domestic journals in Library and Information Science. The log-likelihood estimate criterion was used to determine the number of topics. As a result, 27 topics were discovered and then categorized into eight subject areas: general, institutional system, building/equipment, operation/management, data organization, service, education, and others. The most popular research topics were library utilization classes (T27) and information utilization (T2). More than 20 studies were found for each of the following topics: evaluation index development (T13), school librarian placement (T24), learning information media utilization (T3), community public library (T7), library cooperation (T9), library use (T17), library research (T11), reading education (T4), collection development (T5), and education effects/teaching methods (T18).

Features of Science Classes in Science Core Schools Identified through Semantic Network Analysis (언어네트워크분석을 통해 본 과학중점학교 과학수업의 특징)

  • Kim, Jinhee;Na, Jiyeon;Song, Jinwoong
    • Journal of The Korean Association For Science Education / v.38 no.4 / pp.565-574 / 2018
  • The purpose of this study is to investigate the features of science classes in Science Core Schools (SCSs) as perceived by students. A total of 654 students from 14 SCSs were surveyed with two open-ended questions on the features of science classes. The students' responses were analyzed with NetMiner 4.5 using centrality analysis (betweenness and degree) and community analysis. The results are as follows: (1) students perceived the science classes of SCSs as offering an environment of free questioning, active participation and communication, caring teachers, more science experiments and advanced content, and knowledge sharing; (2) science classes in SCSs were perceived to differ from those of ordinary high schools because SCSs provide more opportunities for science-related special courses (such as project work and advanced science subjects), extra-curricular activities, inquiry and research activities, school support, a hard-working classroom environment, longer study hours, and R&E and club activities. The students' perceptions of SCS science classes appear to be in line with the characteristics of 'good' science lessons reported in previous studies. The SCS project itself and the features of SCS science classes can help show how educational innovations may be introduced into actual schools.

A Comparative Analysis Study of IFLA School Library Guidelines Using Semantic Network Analysis (언어 네트워크 분석을 통한 IFLA의 학교도서관 가이드라인 비교·분석에 관한 연구)

  • Lee, Byeong-Kee
    • Journal of Korean Library and Information Science Society / v.51 no.2 / pp.1-21 / 2020
  • The purpose of this study is to explore the semantic characteristics of the IFLA school library guidelines through network analysis. The guidelines exist in two versions, the 2002 edition and the 2015 revision. This study analyzed both versions from the viewpoint of semantic networks and compared their characteristics. Keywords were extracted from the two texts, and semantic networks were constructed based on co-occurrence relations among the keywords. Centrality (degree centrality, closeness centrality, and betweenness centrality) was analyzed for each network. In addition, this study conducted topic modeling analysis using the LDA function of NetMiner 4.0. The results are as follows. First, when comparing centrality, the keywords 'Program', 'Teaching', 'Reading', 'Inquiry', 'Literacy', and 'Media' had higher centrality in the 2015 revision than in the 2002 edition. Second, 'Inquiry' in degree centrality and 'Achievement' in closeness centrality, which were not included in the top-ranked keyword list of the 2002 edition, newly appeared in the 2015 revision. Third, according to the topic modeling analysis, the importance of topics on programs and services, teaching and learning activities of librarian teachers, and media and information literacy increased in the 2015 revision compared with the 2002 edition.
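
As a rough illustration of the co-occurrence approach described in this abstract, the sketch below links two keywords whenever they appear in the same text unit and then computes normalized degree centrality (number of neighbors divided by n - 1). The text unit, the keyword lists, and the function names are hypothetical, and NetMiner's own normalization may differ.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(units):
    """Build an undirected keyword network from lists of keywords per text unit."""
    neighbors = defaultdict(set)
    for keywords in units:
        unique = sorted(set(keywords))
        for keyword in unique:
            neighbors[keyword]  # register the node even if it has no link
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def degree_centrality(neighbors):
    """Normalized degree centrality: degree / (number of nodes - 1)."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(linked) / (n - 1) for node, linked in neighbors.items()}


# Hypothetical keyword lists, one per sentence or paragraph.
units = [
    ["library", "reading", "literacy"],
    ["library", "inquiry"],
    ["reading", "inquiry", "library"],
]

network = cooccurrence_network(units)
centrality = degree_centrality(network)
for keyword, value in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {value:.3f}")
```

Running the same procedure on two texts separately and comparing the resulting rankings mirrors the kind of edition-to-edition comparison the abstract reports.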

Social Determinants of Health of Multicultural Adolescents in South Korea: An Integrated Literature Review (2018~2020) (국내 다문화 청소년의 사회적 건강결정요인: 통합적 문헌고찰(2018~2020))

  • Kim, Youlim;Lee, Hyeonkyeong;Lee, Hyeyeon;Lee, Mikyung;Kim, Sookyung;Kennedy, Diema Konlan
    • Research in Community and Public Health Nursing / v.32 no.4 / pp.430-444 / 2021
  • Purpose: This study is an integrative literature review analyzing the health problems and social determinants of health of multicultural adolescents in South Korea. Methods: An integrative review was conducted according to Whittemore & Knafl's guideline. An electronic search covering publications from 2018 to 2020 was conducted in the PubMed, EMBASE, Cochrane Library, CINAHL, RISS, and KISS databases. Of a total of 67 records identified, 13 met the full inclusion criteria. Text network analysis was also conducted to identify keyword network trends using the NetMiner program. Results: The health problems of multicultural adolescents were classified into mental health (depression, anxiety, suicide, and acculturative stress) and health risk behaviors (smoking, risky drinking, smartphone dependence, and sexual behavior). Among the social determinants affecting the health of multicultural adolescents, biological factors such as gender, age, and visible minority status, and psychological factors such as acculturative stress, self-esteem, family support, and ego-resiliency were identified. The sociocultural factors identified were family economic status, residential area, parental education level, and parents' country of birth. The text network analysis identified a total of 41 words. Conclusion: Based on these results, mental health and health risk behaviors should be considered in interventions for health promotion of multicultural adolescents. Our findings suggest that further research should be conducted to broaden the scope of health determinants to account for the effects of the physical environment and health care system.

Research Trends in Global Cruise Industry Using Keyword Network Analysis (키워드 네트워크 분석을 활용한 세계 크루즈산업 연구동향)

  • Jhang, Se-Eun;Lee, Su-Ho
    • Journal of Navigation and Port Research / v.38 no.6 / pp.607-614 / 2014
  • This article explores and discusses research trends in the global cruise industry using keyword network analysis. We visualize keyword networks for each of four periods (1982-1999, 2000-2004, 2005-2009, and 2010-2014) based on the degree centrality and betweenness centrality of the top 20 keyword nodes, two measures selected from four centrality measurements, and compare them with frequency order. The article shows that keyword frequencies collected from 240 articles published in international journals follow Zipf's law and that the node degree distribution also follows a power law. We trace research trends in the global cruise industry through diachronic changes in important keywords, visualizing several networks centered on the top two keywords, cruise and tourism, which appear in all four periods with high degree and betweenness centrality values. Interestingly, a new node, China, connecting the topmost keywords, appears in the most recent period of 2010-2014, when China emerged as one of the rapidly developing countries in the global cruise industry. Keyword network analysis as used in this article is therefore useful for understanding research trends in the global cruise industry, because it reveals increases and decreases in the number of network types across periods and the visual connections between important nodes in giant components.
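
One common way to check whether keyword frequencies follow Zipf's law is to regress log(frequency) on log(rank) and see whether the slope is close to -1. The sketch below does this with ordinary least squares; it is an illustration only, the counts are hypothetical, and the abstract does not say which test the authors used.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Frequencies must be positive; a slope near -1 is consistent with Zipf's law.
    """
    ranked = sorted(frequencies, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two frequencies")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical keyword counts that are exactly proportional to 1 / rank.
keyword_counts = [120, 60, 40, 30, 24]
slope = zipf_slope(keyword_counts)
print(f"log-log slope: {slope:.3f}")
```

The same function can be applied to a node degree sequence to get a first impression of a power-law degree distribution, although a log-log regression alone is a weak test.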

Analysis of Journal of Dental Hygiene Science Research Trends Using Keyword Network Analysis (키워드 네트워크 분석을 활용한 치위생과학회지 연구동향 분석)

  • Kang, Yong-Ju;Yoon, Sun-Joo;Moon, Kyung-Hui
    • Journal of dental hygiene science / v.18 no.6 / pp.380-388 / 2018
  • This research team extracted keywords from 953 papers published in the Journal of Dental Hygiene Science from 2001 to 2018 for keyword and centrality analyses using the Keyword Network Analysis method. Data were analyzed using Excel 2016 and NetMiner Version 4.4.1. By conducting a deeper analysis between keywords by overall keyword and time frame, we arrived at the following conclusions. For the 17 years considered for this study, the most frequently used words in a dental science paper were "Health," "Oral," "Hygiene," and "Hygienist." The words that form the center by connecting major words in the Journal of Dental Hygiene through the upper-degree centrality words were "Health," "Dental," "Oral," "Hygiene," and "Hygienist." The upper betweenness centrality words were "Dental," "Health," "Oral," "Hygiene," and "Student." Analysis results of the degree centrality words per period revealed "Health" (0.227), "Dental" (0.136), and "Hygiene" (0.136) for period 1; "Health" (0.242), "Dental" (0.177), and "Hygiene" (0.113) for period 2; "Health" (0.200), "Dental" (0.176), and "Oral" (0.082) for period 3; and "Dental" (0.235), "Health" (0.206), and "Oral" (0.147) for period 4. Analysis results of the betweenness centrality words per period revealed "Oral" (0.281) and "Health" (0.199) for period 1; "Dental" (0.205) and "Health" (0.169) for period 2, with the weight then dispersing to "Hygiene" (0.112), "Hygienist" (0.054), and "Oral" (0.053); "Health" (0.258) and "Dental" (0.246) for period 3; and "Oral" (0.364), "Health" (0.353), and "Dental" (0.333) for period 4. Based on the above results, we hope that further studies will be conducted in the future with diverse study subjects.
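
Betweenness centrality, reported per period in this abstract, measures how often a keyword lies on shortest paths between other keywords. The sketch below implements Brandes' algorithm for an unweighted, undirected network and normalizes by (n - 1)(n - 2). The keyword labels are borrowed from the abstract, but the links between them are hypothetical, and the normalization is an assumption that may not match NetMiner's output.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph {node: set(neighbors)}."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                            # nodes in order of non-decreasing distance
        preds = {node: [] for node in graph}  # predecessors on shortest paths
        sigma = {node: 0 for node in graph}   # number of shortest paths from source
        dist = {node: -1 for node in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:                          # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in graph}
        while order:                          # accumulate dependencies, farthest first
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(graph)
    norm = (n - 1) * (n - 2)  # every unordered pair was counted twice above
    return {node: (value / norm if norm > 0 else 0.0) for node, value in score.items()}


# Hypothetical star-shaped keyword network with "Health" at the center.
keyword_graph = {
    "Health": {"Oral", "Dental", "Hygiene"},
    "Oral": {"Health"},
    "Dental": {"Health"},
    "Hygiene": {"Health"},
}

betweenness = betweenness_centrality(keyword_graph)
print(betweenness)
```

In this toy network every shortest path between two outer keywords passes through the center, so the center receives the maximum normalized score and the outer keywords receive zero.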

Predicting the Performance of Recommender Systems through Social Network Analysis and Artificial Neural Network (사회연결망분석과 인공신경망을 이용한 추천시스템 성능 예측)

  • Cho, Yoon-Ho;Kim, In-Hwan
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.159-172 / 2010
  • The recommender system is one possible solution for helping customers find the items they would like to purchase. To date, a variety of recommendation techniques have been developed. One of the most successful is Collaborative Filtering (CF), which has been used in a number of applications such as recommending Web pages, movies, music, articles, and products. CF identifies customers whose tastes are similar to those of a given customer, and recommends items those customers have liked in the past. Numerous CF algorithms have been developed to increase the performance of recommender systems. Broadly, there are memory-based CF algorithms, model-based CF algorithms, and hybrid CF algorithms, which combine CF with content-based techniques or other recommender systems. While many researchers have focused their efforts on improving CF performance, the theoretical justification of CF algorithms is lacking. That is, little is known about how CF works. Furthermore, the relative performance of CF algorithms is known to be domain and data dependent. Implementing and launching a CF recommender system is time-consuming and expensive, and a system unsuited to the given domain provides customers with poor-quality recommendations that easily annoy them. Therefore, predicting the performance of CF algorithms in advance is of practical importance. In this study, we propose an efficient approach to predict the performance of CF. Social Network Analysis (SNA) and Artificial Neural Network (ANN) are applied to develop our prediction model. CF can be modeled as a social network in which customers are nodes and purchase relationships between customers are links. SNA facilitates an exploration of the topological properties of the network structure that are implicit in data for CF recommendations. 
An ANN model is developed through an analysis of network topology, such as network density, inclusiveness, clustering coefficient, network centralization, and Krackhardt's efficiency. While network density, expressed as a proportion of the maximum possible number of links, captures the density of the whole network, the clustering coefficient captures the degree to which the overall network contains localized pockets of dense connectivity. Inclusiveness refers to the number of nodes which are included within the various connected parts of the social network. Centralization reflects the extent to which connections are concentrated in a small number of nodes rather than distributed equally among all nodes. Krackhardt's efficiency characterizes how dense the social network is beyond that barely needed to keep the social group even indirectly connected to one another. We use these social network measures as input variables of the ANN model. As an output variable, we use the recommendation accuracy measured by the F1-measure. To evaluate the effectiveness of the ANN model, sales transaction data from H department store, one of the well-known department stores in Korea, was used. A total of 396 experimental samples were gathered, and we used 40%, 40%, and 20% of them for training, test, and validation, respectively. 5-fold cross validation was also conducted to enhance the reliability of our experiments. The process of measuring the input variables consists of the following three steps: analysis of customer similarities, construction of a social network, and analysis of social network patterns. We used NetMiner 3 and UCINET 6.0 for SNA, and Clementine 11.1 for ANN modeling. In the experiments, the ANN model achieved 92.61% estimated accuracy and an RMSE of 0.0049. Thus, our prediction model can help decide whether CF is useful for a given application with certain data characteristics.
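
Several of the measures named in this abstract are simple to compute by hand. The sketch below is a minimal illustration under stated assumptions: density is links divided by the maximum possible number of links, inclusiveness is taken here as the share of non-isolated nodes (the abstract describes it as a count), the clustering coefficient is the average of local coefficients, and F1 is the harmonic mean of precision and recall. The customer network and the item lists are hypothetical and are not the paper's data.

```python
def density(graph):
    """Links present divided by the maximum possible number of links."""
    n = len(graph)
    if n < 2:
        return 0.0
    links = sum(len(neighbors) for neighbors in graph.values()) / 2
    return links / (n * (n - 1) / 2)


def inclusiveness(graph):
    """Share of nodes with at least one link (assumed proportional form)."""
    if not graph:
        return 0.0
    return sum(1 for neighbors in graph.values() if neighbors) / len(graph)


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbors count as 0."""
    if not graph:
        return 0.0
    total = 0.0
    for neighbors in graph.values():
        linked = list(neighbors)
        k = len(linked)
        if k < 2:
            continue
        closed = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if linked[j] in graph[linked[i]]
        )
        total += closed / (k * (k - 1) / 2)
    return total / len(graph)


def f1_measure(recommended, purchased):
    """Harmonic mean of precision and recall for one recommendation list."""
    hits = len(set(recommended) & set(purchased))
    if hits == 0:
        return 0.0
    precision = hits / len(set(recommended))
    recall = hits / len(set(purchased))
    return 2 * precision * recall / (precision + recall)


# Hypothetical customer network: a triangle, one pendant node, one isolate.
customers = {
    "c1": {"c2", "c3"},
    "c2": {"c1", "c3"},
    "c3": {"c1", "c2", "c4"},
    "c4": {"c3"},
    "c5": set(),
}

features = {
    "density": density(customers),
    "inclusiveness": inclusiveness(customers),
    "clustering": average_clustering(customers),
}
accuracy = f1_measure(["a", "b", "c", "d"], ["b", "d", "e"])
print(features, accuracy)
```

In the setting the abstract describes, values like those in `features` would be the inputs of the prediction model and a value like `accuracy` would be its target.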