• Title/Summary/Keyword: Topic Keywords

Analysis of Massive Scholarly Keywords using Inverted-Index based Bottom-up Clustering (역인덱스 기반 상향식 군집화 기법을 이용한 대규모 학술 핵심어 분석)

  • Oh, Heung-Seon;Jung, Yuchul
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.11 / pp.758-764 / 2018
  • Digital documents such as patents, scholarly papers, and research reports carry author keywords that summarize their topics. Documents that share keywords are likely to describe the same topic. Document clustering groups documents by topic using unsupervised learning. Although it is used in many kinds of data analysis, its computational complexity makes it difficult to apply to large document collections. In such cases, keywords offer an efficient way to cluster and connect massive numbers of documents. However, existing bottom-up hierarchical clustering demands substantial computation and time when the number of keywords is large. This paper proposes an inverted-index-based bottom-up clustering method for keywords and analyzes the clustering results for massive keyword sets extracted from scholarly papers and research reports.
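
The idea of using an inverted index to limit clustering work can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names, the Jaccard similarity, the 0.5 threshold, and the single-link merge via union-find are all assumptions chosen for brevity. The point it shows is that only keyword pairs that co-occur in at least one document are ever compared, so the full keyword-by-keyword comparison is avoided.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(doc_keywords):
    """Map each keyword to the set of document ids that list it."""
    index = defaultdict(set)
    for doc_id, keywords in doc_keywords.items():
        for kw in keywords:
            index[kw].add(doc_id)
    return index


def candidate_pairs(doc_keywords):
    """Collect only keyword pairs that co-occur in at least one document."""
    pairs = set()
    for keywords in doc_keywords.values():
        pairs.update(combinations(sorted(set(keywords)), 2))
    return pairs


def cluster_keywords(doc_keywords, threshold=0.5):
    """Single-link bottom-up merge of keywords whose document sets overlap."""
    index = build_inverted_index(doc_keywords)
    parent = {kw: kw for kw in index}

    def find(kw):
        # Union-find lookup with path halving.
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]
            kw = parent[kw]
        return kw

    for a, b in candidate_pairs(doc_keywords):
        jaccard = len(index[a] & index[b]) / len(index[a] | index[b])
        if jaccard >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for kw in index:
        clusters[find(kw)].add(kw)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Toy data, made up for illustration only.
docs = {
    "d1": ["lda", "topic model"],
    "d2": ["lda", "topic model", "clustering"],
    "d3": ["clustering", "k-means"],
    "d4": ["k-means", "clustering"],
}
keyword_clusters = cluster_keywords(docs)
```

Cutting at a single fixed threshold yields one flat level of the hierarchy; repeating the merge with progressively lower thresholds would give coarser levels.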

Analyzing the Trend of False·Exaggerated Advertisement Keywords Using Text-mining Methodology (1990-2019) (텍스트마이닝 기법을 활용한 허위·과장광고 관련 기사의 트렌드 분석(1990-2019))

  • Kim, Do-Hee;Kim, Min-Jeong
    • The Journal of the Korea Contents Association / v.21 no.4 / pp.38-49 / 2021
  • This study analyzed trends in the term 'false and exaggerated advertisement' across 5,141 newspaper articles published from 1990 to 2019 using text mining. First, we identified the most frequent keywords related to false and exaggerated advertisements through frequency analysis of all the articles and examined the context among the extracted keywords. Next, to examine how false and exaggerated advertisements have changed, we repeated the frequency analysis after dividing the articles into 10-year periods, and we identified the trend of keywords that became issues by comparing the number of academic papers on the top keywords of each year. Finally, we identified trends in false and exaggerated advertisements from the detailed keywords within each topic using topic modeling. The results confirmed that topics that became issues at a specific time appeared among the frequent keywords, and that keyword trends changed from period to period in connection with social and environmental factors. This study is meaningful in that it helps consumers spend wisely by building background knowledge about unfair advertising. Furthermore, the extracted core keywords are expected to clarify the true purpose of advertising and to convey its implications to companies and employees who commit misconduct.
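
The per-decade frequency step described above might look like the following sketch. The function name, the (year, keywords) input shape, and the sample articles are assumptions made up for illustration; the study's actual preprocessing and data are not reproduced here.

```python
from collections import Counter, defaultdict


def top_keywords_by_decade(articles, n=3):
    """Group keyword counts by decade and return the n most frequent per decade.

    articles: iterable of (year, [keywords]) pairs.
    """
    counts = defaultdict(Counter)
    for year, keywords in articles:
        decade = year - year % 10  # 1993 -> 1990, 2012 -> 2010
        counts[decade].update(keywords)
    return {decade: counter.most_common(n) for decade, counter in sorted(counts.items())}


# Toy articles, invented for illustration only.
sample_articles = [
    (1993, ["food", "diet"]),
    (1998, ["food"]),
    (2005, ["cosmetics", "diet"]),
    (2012, ["diet", "online"]),
    (2019, ["online"]),
]
decade_trends = top_keywords_by_decade(sample_articles)
```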

A study on research trends for gestational diabetes mellitus and breastfeeding: Focusing on text network analysis and topic modeling (임신성 당뇨와 모유수유에 대한 연구 동향 분석: 텍스트네트워크 분석과 토픽모델링 중심)

  • Lee, Junglim;Kim, Youngji;Kwak, Eunju;Park, Seungmi
    • The Journal of Korean Academic Society of Nursing Education / v.27 no.2 / pp.175-185 / 2021
  • Purpose: The aim of this study was to identify core keywords and topic groups in the 'Gestational diabetes mellitus (GDM) and Breastfeeding' field of research to better understand research trends over the past 20 years. Methods: This was a text-mining and topic modeling study composed of four steps: 1) collecting abstracts, 2) extracting and cleaning semantic morphemes, 3) building a co-occurrence matrix, and 4) analyzing network features and clustering topic groups. Results: A total of 635 papers published between 2001 and 2020 were found in databases (Web of Science, CINAHL, RISS, DBPIA, KISS). Of these, 366 articles met the selection conditions, and the 3,639 words extracted from them were analyzed by text network analysis and topic modeling. The most important keywords were 'exposure', 'fetus', 'hypoglycemia', 'prevention' and 'program'. Topic modeling identified six topic groups: 'cardiovascular disease', 'obesity', 'complication prevention strategy', 'support of breastfeeding', 'educational program' and 'management of GDM'. The main topics were 'cardiovascular disease' and 'obesity'. Conclusion: This study showed that over the past 20 years many studies have been conducted on complications such as cardiovascular diseases and obesity related to gestational diabetes and breastfeeding. To prevent complications of gestational diabetes and promote breastfeeding, various nursing interventions, including gestational diabetes management and educational programs for GDM pregnancies, should be developed in nursing fields.
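
Step 3 of the method, building a co-occurrence matrix, can be illustrated with a short sketch. It assumes each abstract has already been reduced to a list of cleaned terms (steps 1 and 2); the function names and the sample terms are hypothetical and not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(token_lists):
    """Count how many abstracts contain each unordered pair of terms."""
    pair_counts = Counter()
    for tokens in token_lists:
        pair_counts.update(combinations(sorted(set(tokens)), 2))
    return pair_counts


def to_matrix(pair_counts):
    """Expand pair counts into a symmetric matrix over a sorted vocabulary."""
    vocab = sorted({term for pair in pair_counts for term in pair})
    position = {term: i for i, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), count in pair_counts.items():
        matrix[position[a]][position[b]] = count
        matrix[position[b]][position[a]] = count
    return vocab, matrix


# Toy term lists, invented for illustration only.
abstracts = [
    ["gdm", "obesity", "prevention"],
    ["gdm", "obesity"],
    ["gdm", "program"],
]
vocab, matrix = to_matrix(cooccurrence_counts(abstracts))
```

The resulting matrix is the usual input to the network measures and topic grouping in step 4.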

Automatic Topic Identification Based on the Ontology for Web Documents (온톨로지 기반의 웹 문서 자동 주제 식별)

  • Choi In-Dae;Nam In-Gil;Bu Ki-Dong
    • Journal of Korea Society of Industrial Information Systems / v.9 no.3 / pp.38-45 / 2004
  • The goal of this research is to develop a method for identifying the topic of a given text by examining the relationships among keywords defined in an ontology hierarchy. Keywords extracted from the important sentences of the text are mapped onto their corresponding concepts in the hierarchy. Once all the words are mapped, the corresponding concepts are generalized into a single concept, which is most likely to be the topic of the text. Our approach combines ontologies with word frequency to improve both robustness and accuracy; it is a hybrid, knowledge-statistical approach. Experimental results show that the proposed method outperforms the existing knowledge-base-only method.
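
The generalization step, where mapped concepts collapse into a single concept, can be sketched as finding the most specific shared ancestor in a tree-shaped hierarchy. This covers only the hierarchy part and leaves out the word-frequency component of the hybrid approach. The dictionaries, names, and sample concepts below are assumptions for illustration, not the paper's ontology.

```python
def ancestors(concept, parent_of):
    """Return the path from a concept up to the root, concept first."""
    path = [concept]
    while concept in parent_of:
        concept = parent_of[concept]
        path.append(concept)
    return path


def generalize(keywords, concept_of, parent_of):
    """Map keywords to concepts and return their most specific shared ancestor."""
    concepts = [concept_of[kw] for kw in keywords if kw in concept_of]
    if not concepts:
        return None
    shared = set(ancestors(concepts[0], parent_of))
    for concept in concepts[1:]:
        shared &= set(ancestors(concept, parent_of))
    # Walk upward from the first concept; the first shared node is the deepest.
    for node in ancestors(concepts[0], parent_of):
        if node in shared:
            return node
    return None


# Hypothetical mini-ontology (assumed to be a tree, with no cycles).
parent_of = {"soccer": "sport", "tennis": "sport", "sport": "leisure", "cinema": "leisure"}
concept_of = {"goalkeeper": "soccer", "racket": "tennis", "film": "cinema"}

narrow_topic = generalize(["goalkeeper", "racket"], concept_of, parent_of)
broad_topic = generalize(["goalkeeper", "film"], concept_of, parent_of)
```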

A Study on Research Trend for Nurses' Workplace Bullying in Korea: Focusing on Semantic Network Analysis and Topic Modeling (간호사의 직장 내 괴롭힘에 대한 국내 연구 동향 분석: 의미연결망분석과 토픽모델링 중심)

  • Choi, Jeong Sil;Kim, Youngji
    • Korean Journal of Occupational Health Nursing / v.28 no.4 / pp.221-229 / 2019
  • Purpose: The aim of this study was to identify core keywords and topic groups in workplace bullying research over the past 10 years to better understand research trends. Methods: The study was conducted in four steps: 1) collecting abstracts, 2) extracting and cleaning semantic morphemes, 3) building a co-occurrence matrix, and 4) analyzing network features and clustering topic groups. Results: A total of 437 articles published between 2010 and 2019 were retrieved from 5 databases (RISS, NDSL, Google Scholar, DBPIA, and Kyobo Scholar). Forty-one abstracts from these articles were extracted, and network analysis was conducted using a semantic network module. The most important core keywords were 'turnover', 'intention', 'factor', 'program' and 'nursing'. Four topic groups were identified from Korean databases. The major topics were 'turnover' and 'organization culture'. Conclusion: A review of previous research found that turnover intention has been emphasized. Further research focused on various interventions is needed to relieve workplace bullying in the nursing field.

Research trends over 10 years (2010-2021) in infant and toddler rearing behavior by family caregivers in South Korea: text network and topic modeling

  • In-Hye Song;Kyung-Ah Kang
    • Child Health Nursing Research / v.29 no.3 / pp.182-194 / 2023
  • Purpose: This study analyzed research trends in infant and toddler rearing behavior among family caregivers over a 10-year period (2010-2021). Methods: Text network analysis and topic modeling were employed on data collected from relevant papers, following the extraction and refinement of semantic morphemes. A semantic-centered network was constructed by extracting words from 2,613 English-language abstracts. Data analysis was performed using NetMiner 4.5.0. Results: Frequency analysis, degree centrality, and eigenvector centrality all revealed the terms "scale," "program," and "education" among the top 10 keywords associated with infant and toddler rearing behaviors among family caregivers. The keywords extracted from the analysis were divided into two clusters through cohesion analysis. Additionally, they were classified into two topic groups using topic modeling: "program and evaluation" (64.37%) and "caregivers' role and competency in child development" (35.63%). Conclusion: The roles and competencies of family caregivers are essential for the development of infants and toddlers. Intervention programs and evaluations are necessary to improve rearing behaviors. Future research should determine the role of nurses in supporting family caregivers. Additionally, it should facilitate the development of nursing strategies and intervention programs to promote positive rearing practices.

Analysis of Shipping and Logistics News Articles using Topic Modeling (토픽모델링을 활용한 해운물류 뉴스 분석)

  • Hee-Young Yoon;Il-Youp Kwak
    • Korea Trade Review / v.46 no.4 / pp.61-76 / 2021
  • This study focuses on three logistics-related news outlets (Logistics Newspaper, Korea Shipping Gadget, and Korea Shipping Newspaper) in order to present changes in logistics issues, centering on COVID-19, which has recently had the greatest impact worldwide. For data collection, two years of news articles from 2019 and 2020 (title, article, content, date, article classification, article URL) were collected by web crawling (using Python's BeautifulSoup and requests modules) from the homepages of three representative logistics-related media companies. For data analysis, fundamental statistical analysis, Latent Dirichlet Allocation (LDA) for topic modeling, and Scattertext were used. The analysis results were as follows. First, among the three logistics-related news outlets, the Korea Shipping Newspaper was the most active. Second, topic modeling with LDA identified eight logistics-related topics, and the keywords and significant issues of each topic were presented. Third, the keywords were visualized with Scattertext. This is the first study to present changes in the logistics field by focusing on articles from representative logistics-related media in 2019 and 2020. In particular, 2019 and 2020 can be divided into the periods before and after the outbreak of COVID-19, which has had a great impact not only on the logistics field but also on our lives as a whole. Future work requires a multi-faceted approach, such as comparative studies of logistics issues between countries or implications drawn from long-term time-series articles.

Simulation Nursing Education Research Topics Trends Using Text Network Analysis (텍스트네트워크분석을 적용하여 탐색한 국내 시뮬레이션간호교육 연구주제 동향)

  • Park, Chan Sook
    • Journal of East-West Nursing Research / v.26 no.2 / pp.118-129 / 2020
  • Purpose: The purpose of this study was to analyze the topic trends of domestic simulation nursing education research using text network analysis (TNA). Methods: This study was conducted in four steps. TNA was performed using the NetMiner (version 4.4.1) program. First, 245 articles published from 2008 to 2018 were collected from 4 databases (RISS, KCI, KISS, DBpia). Second, keyword forms were unified and representative words were selected. Third, co-occurrence matrices of keywords with a frequency of 2 or higher were generated. Finally, social network measures (indices of degree centrality and betweenness centrality) were obtained. The topic trend over time was visualized and presented as a sociogram. Results: 178 author keywords were extracted. Keywords with high degree centrality were "Nursing student", "Clinical competency", "Knowledge", "Critical thinking", "Communication", and "Problem-solving ability." Keywords with high betweenness centrality were "CPR", "Knowledge", "Attitude", "Self-efficacy", "Performance ability", and "Nurse." Over time, the topic trends in simulation nursing education have diversified. For example, topics such as "Neonatal nursing", "Obstetric nursing", "Pediatric nursing", "Blood transfusion", "Community visit nursing", and "Core basic nursing skill" appeared. The core topics that emerged only recently (2017-2018) were "High-fidelity", "Heart arrest", "Clinical judgment", "Reflection", and "Core basic nursing skill." Conclusion: Although simulation nursing education research has been increasing, studies on integrated simulation learning designs based on various nursing settings need to continue. Additionally, simulation nursing education requires research not only on learner-centered educational outcomes but also on factors that influence educational outcomes from the instructors' perspective.
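
The third and fourth steps (keeping keywords with a frequency of 2 or higher, then computing network measures) can be sketched for degree centrality alone. The study used NetMiner; the plain-Python function below is a stand-in written for illustration, with hypothetical names and made-up keyword lists. Betweenness centrality is omitted to keep the example short.

```python
from collections import Counter
from itertools import combinations


def degree_centrality(keyword_lists, min_freq=2):
    """Normalized degree centrality on a keyword co-occurrence network.

    Keywords that appear in fewer than min_freq articles are dropped first.
    """
    freq = Counter(kw for kws in keyword_lists for kw in set(kws))
    kept = {kw for kw, count in freq.items() if count >= min_freq}
    neighbors = {kw: set() for kw in kept}
    for kws in keyword_lists:
        for a, b in combinations(sorted(set(kws) & kept), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(kept)
    if n < 2:
        return {kw: 0.0 for kw in kept}
    # Divide by the maximum possible number of neighbors.
    return {kw: len(adj) / (n - 1) for kw, adj in neighbors.items()}


# Toy author-keyword lists, invented for illustration only.
article_keywords = [
    ["nursing student", "knowledge", "cpr"],
    ["nursing student", "knowledge"],
    ["nursing student", "cpr", "debriefing"],
    ["knowledge", "attitude"],
    ["nursing student", "attitude"],
]
centrality = degree_centrality(article_keywords)
```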

Reorganizing Social Issues from R&D Perspective Using Social Network Analysis

  • Shun Wong, William Xiu;Kim, Namgyu
    • Journal of Information Technology Applications and Management / v.22 no.3 / pp.83-103 / 2015
  • The rapid development of internet technologies and social media over the last few years has generated a huge amount of unstructured text data, which contains a great deal of valuable information and issues. As a result, text mining (extracting meaningful information from unstructured text data) has gained attention from researchers in various fields. Topic analysis is a text mining application used to determine the main issues in a large volume of text documents. However, it is difficult to identify related issues or meaningful insights because the number of issues derived through topic analysis is too large. Furthermore, traditional issue-clustering methods rely only on the co-occurrence frequency of issue keywords across documents, so they cannot recognize an association between issues with a low co-occurrence frequency, even if those issues are strongly related from other perspectives. This research therefore proposes a methodology to reorganize social issues from a research and development (R&D) perspective using social network analysis. Using an R&D perspective lexicon, issues that consistently share the same R&D keywords can be identified through social network analysis. In this study, the R&D keywords associated with a particular issue imply the key technology elements needed to solve that issue. Issue clustering can then be performed based on the analysis results. Furthermore, the relationships between issues that share the same R&D keywords can be reorganized more systematically by grouping them into clusters according to the R&D perspective lexicon. We expect that our methodology will contribute to establishing efficient R&D investment policies at the national level by enhancing the reusability of R&D knowledge, based on issue clustering using the R&D perspective lexicon. In addition, companies could use the results to align their R&D with their business strategy plans, helping them develop innovative products and new technologies that sustain innovative business models.
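
The core idea, linking issues through shared R&D keywords instead of raw keyword co-occurrence, can be sketched as below. The lexicon, the issue names, and the link weight (the count of shared lexicon terms) are assumptions made for illustration; the paper's actual lexicon and network measures are not reproduced.

```python
from itertools import combinations


def issue_links(issue_keywords, rd_lexicon):
    """Weight each issue pair by the number of R&D lexicon terms they share."""
    rd_terms = {issue: set(kws) & rd_lexicon for issue, kws in issue_keywords.items()}
    links = {}
    for a, b in combinations(sorted(rd_terms), 2):
        shared = rd_terms[a] & rd_terms[b]
        if shared:
            links[(a, b)] = len(shared)
    return links


# Hypothetical lexicon and issues, invented for illustration only.
rd_lexicon = {"sensor", "battery", "vaccine"}
issues = {
    "air pollution": ["sensor", "policy", "battery"],
    "electric vehicles": ["battery", "subsidy", "sensor"],
    "epidemic": ["vaccine", "policy"],
}
links = issue_links(issues, rd_lexicon)
```

In this toy case "air pollution" and "epidemic" share the general word "policy" but no lexicon term, so they are not linked, while the two issues that depend on the same technology elements are.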

Topic Modeling and Keyword Network Analysis of News Articles Related to Nurses before and after "the Thanks to You Challenge" during the COVID-19 Pandemic (COVID-19 '덕분에 챌린지' 전후 간호사 관련 뉴스 기사의 토픽 모델링 및 키워드 네트워크 분석)

  • Yun, Eun Kyoung;Kim, Jung Ok;Byun, Hye Min;Lee, Guk Geun
    • Journal of Korean Academy of Nursing / v.51 no.4 / pp.442-453 / 2021
  • Purpose: This study was conducted to assess public awareness and policy challenges faced by practicing nurses. Methods: After collecting nurse-related news articles published before and after 'the Thanks to You Challenge' campaign (between December 31, 2019, and July 15, 2020), keywords were extracted via preprocessing. A three-step method (keyword analysis, latent Dirichlet allocation topic modeling, and keyword network analysis) was used to examine the text and the structure of the selected news articles. Results: The top 30 keywords, with similar occurrences, were collected before and after the campaign. The five dominant topics before the campaign were pandemic, infection of medical staff, local transmission, medical resources, and return of overseas Koreans. After the campaign, the topics 'infection of medical staff' and 'return of overseas Koreans' disappeared, but 'the Thanks to You Challenge' emerged as a dominant topic. A keyword network analysis revealed that the word 'nurse' was linked with keywords such as 'thanks' and 'campaign' through the word 'sacrifice'. These words formed interrelated domains of 'the Thanks to You Challenge' topic. Conclusion: The findings of this study can provide useful information for understanding various issues and social perspectives on COVID-19 nursing. The major themes of news reports lagged behind the real problems faced by nurses in the COVID-19 crisis. While the press tends to focus on heroism and society as a whole, issues and policies mutually beneficial to the public and nursing need to be further explored and enhanced by nurses.