• Title/Summary/Keyword: Topic Keywords

A Text Mining Study on Endangered Wildlife Complaints - Discovery of Key Issues through LDA Topic Modeling and Network Analysis - (멸종위기 야생생물 민원 텍스트 마이닝 연구 - LDA 토픽 모델링과 네트워크 분석을 통한 주요 이슈 발굴 -)

  • Kim, Na-Yeong;Nam, Hee-Jung;Park, Yong-Su
    • Journal of the Korean Society of Environmental Restoration Technology / v.26 no.6 / pp.205-220 / 2023
  • This study aimed to analyze the public's needs and interests regarding endangered wildlife using complaint big data. We collected 1,203 complaints and their corresponding text data on endangered wildlife, pre-processed them, and constructed a document-term matrix for 1,739 text items. We then performed LDA (Latent Dirichlet Allocation) topic modeling and network analysis. The results revealed that complaints about endangered wildlife peaked in June-August, and that interest shifted from insects to a wider range of endangered wildlife found in residential areas, such as mammals, birds, and amphibians. In addition, the complaints could be categorized into 8 topics and 5 clusters: discovery report, habitat protection and response request, information inquiry, investigation and action request, and consultation request. The co-occurrence network analysis for each topic showed that keywords reflecting the call center reporting procedure, such as photo, send, and take, had high centrality across topics, while other keywords such as dung beetle, know, absence, and think also played an important role in the network. Through this analysis, we identified the main keywords and their relationships within each topic and derived the main issues for each topic. This study confirmed that public interest in and complaints about endangered wildlife are increasing and diversifying, and highlighted the need for a professional response. We also suggest developing and extending participatory conservation plans that align with the public's preferences and demands. This study demonstrated the feasibility of using complaint big data on endangered wildlife and its implications for policy decision-making and public promotion regarding endangered wildlife.
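The co-occurrence network step described in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the three toy documents are invented, the helper names `cooccurrence_edges` and `degree_centrality` are hypothetical, and degree centrality is just one of several centrality measures the study may have used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical pre-processed complaints (invented tokens, not the study's data).
docs = [
    ["photo", "send", "dung beetle"],
    ["photo", "take", "send"],
    ["photo", "know", "absence"],
]

def cooccurrence_edges(docs):
    """Count how many documents each unordered keyword pair shares."""
    weights = defaultdict(int)
    for doc in docs:
        # sorted(set(...)) gives each pair a single canonical orientation.
        for a, b in combinations(sorted(set(doc)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def degree_centrality(edges):
    """Fraction of the other nodes that each keyword is directly linked to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

edges = cooccurrence_edges(docs)
centrality = degree_centrality(edges)
```

In this toy corpus, "photo" co-occurs with every other keyword and so receives the highest degree centrality, which mirrors how procedural reporting words can dominate a complaint network.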

Analysis of Research Trends in Tax Compliance using Topic Modeling (토픽모델링을 활용한 조세순응 연구 동향 분석)

  • Kang, Min-Jo;Baek, Pyoung-Gu
    • The Journal of the Korea Contents Association / v.22 no.1 / pp.99-115 / 2022
  • In this study, domestic academic journal papers on tax compliance, tax consciousness, and faithful tax payment (hereinafter referred to as "tax compliance"), a representative research topic in the field of tax science, were comprehensively analyzed from an interdisciplinary perspective. To achieve the research purpose, a topic modeling technique was applied as part of text mining. Following a workflow of data collection, keyword preprocessing, and topic model analysis, potential research topics were derived from the tax compliance-related keywords registered by the researchers in a total of 347 papers. The results of this study can be summarized as follows. First, in the keyword analysis, keywords such as tax investigation, tax avoidance, and honest tax reporting system were among the top 5 keywords by simple term frequency and were also among the top 5 by TF-IDF value, which considers the relative importance of keywords. In contrast, the keyword tax evasion ranked among the top keywords by TF-IDF value but was not prominent by simple term frequency. Second, eight potential research topics were derived through topic modeling. The topics are (1) tax fairness and suppression of tax offenses, (2) the ideology of the tax law and the validity of tax policies, (3) the principle of substance over form and guarantee of tax receivables, (4) tax compliance costs and tax administration services, (5) the tax return self-assessment system and tax experts, (6) tax climate and strategic tax behavior, (7) multifaceted tax behavior and differential compliance intentions, and (8) tax information system and tax resource management. By examining the various perspectives on tax compliance from an interdisciplinary standpoint, the research provides an overview of past research trends on tax compliance and suggests directions for future research.
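The difference between ranking by simple term frequency and by TF-IDF, noted in the abstract above, can be illustrated with a small sketch. The keyword lists are invented toy data, and the weighting used here (term count times log(N / document frequency), summed over documents) is only one common TF-IDF variant, assumed for illustration rather than taken from the study.

```python
import math
from collections import Counter

# Hypothetical keyword lists, one per document (invented, not the study's data).
documents = [
    ["tax investigation", "tax avoidance", "tax compliance"],
    ["tax investigation", "tax compliance"],
    ["tax investigation", "tax avoidance"],
    ["tax evasion", "tax evasion", "tax compliance"],
]

def term_frequency(docs):
    """Total number of occurrences of each keyword across all documents."""
    return Counter(kw for doc in docs for kw in doc)

def tf_idf(docs):
    """Sum over documents of tf * log(N / df) for each keyword."""
    n_docs = len(docs)
    # Document frequency: in how many documents each keyword appears.
    df = Counter(kw for doc in docs for kw in set(doc))
    scores = Counter()
    for doc in docs:
        for kw, tf in Counter(doc).items():
            scores[kw] += tf * math.log(n_docs / df[kw])
    return scores

tf_scores = term_frequency(documents)
tfidf_scores = tf_idf(documents)
top_by_tf = tf_scores.most_common(1)[0][0]
top_by_tfidf = tfidf_scores.most_common(1)[0][0]
```

Here "tax evasion" is concentrated in a single document, so it is not the most frequent keyword overall but receives the highest TF-IDF score, while keywords spread across most documents are down-weighted.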

Exploration of Research Trends in The Journal of Distribution Science Using Keyword Analysis

  • YANG, Woo-Ryeong
    • The Journal of Industrial Distribution & Business / v.10 no.8 / pp.17-24 / 2019
  • Purpose - The purpose of this study is to identify research directions in distribution and convergence fields for the many domestic and foreign researchers carrying out related academic research, by confirming research trends in the Journal of Distribution Science (JDS). Research Design, Data, and Methodology - To do this, I used keywords from a total of 904 papers published in the JDS, excluding 19 of the 923 papers that did not include keywords. The analysis used word clouds, topic modeling, and weighted frequency analysis in the R program. Results - In the word cloud analysis, customer satisfaction was the most frequently used keyword. The topic modeling results were divided into ten topics: distribution channels, communication, supply chain, brand, business, customer, comparative study, performance, KODISA journal, and trade. The weighted frequency analysis applied by year group confirmed that only the service quality keyword increased. Conclusion - The results of this study confirm that the JDS has developed from past studies limited to the field of distribution into various convergence and integration research. However, the JDS's identity is based on distribution. Therefore, it is also necessary to keep establishing that identity through special editions on fields related to distribution.

Analysis of Media Articles on COVID-19 and Nurses Using Text Mining and Topic Modeling (텍스트 마이닝과 토픽모델링 분석을 활용한 코로나19와 간호사에 대한 언론기사 분석)

  • An, Jiyeon;Yi, Yunjeong;Lee, Bokim
    • Research in Community and Public Health Nursing / v.32 no.4 / pp.467-476 / 2021
  • Purpose: The purpose of this study is to understand the social perceptions of nurses in the context of the COVID-19 outbreak through analysis of media articles. Methods: Among the media articles reported from January 1st to September 30th, 2020, those containing the keywords '[corona or Wuhan pneumonia or covid] and [nurse or nursing]' were extracted. After the selection process, text mining and topic modeling were performed on 454 media articles using Textom version 4.5. Results: The top 30 keywords by frequency included 'Nurse', 'Corona', 'Isolation', 'Support', 'Shortage', 'Protective Clothing', and so on. Keywords that ranked high in Term Frequency-Inverse Document Frequency (TF-IDF) values were 'Daegu', 'President', 'Gwangju', 'manpower', and so on. As a result of the topic analysis, 10 topics were derived, such as 'Local infection', 'Dispatch of personnel', 'Message for thanks', and 'Delivery of one's heart'. Conclusion: Nurses are both contributors to and victims of COVID-19 prevention. The government and the nursing community should make efforts to improve poor working conditions and manpower shortages.

Comparison of Topic Modeling Methods for Analyzing Research Trends of Archives Management in Korea: focused on LDA and HDP (국내 기록관리학 연구동향 분석을 위한 토픽모델링 기법 비교 - LDA와 HDP를 중심으로 -)

  • Park, JunHyeong;Oh, Hyo-Jung
    • Journal of Korean Library and Information Science Society / v.48 no.4 / pp.235-258 / 2017
  • The purpose of this study is to analyze research trends of archives management in Korea by comparing LDA (Latent Dirichlet Allocation) topic modeling, the most widely known method in text mining, with HDP (Hierarchical Dirichlet Process) topic modeling, an extension of LDA. First, we collected 1,027 articles related to archives management published from 1997 to 2016 in two journals on archives management and four journals on library and information science in Korea, and performed several preprocessing steps. We then conducted LDA and HDP topic modeling. For a more in-depth comparative analysis, we used LDAvis as a topic modeling visualization tool. The results showed that LDA topic modeling was influenced by keywords that appeared frequently across all topics, whereas HDP topic modeling yielded specific keywords that made the characteristics of each topic easy to identify.

Curriculum Relevance Analysis of Physics Book Report Text Using Topic Modeling (토픽모델링을 활용한 물리학 독서감상문 텍스트의 교육과정 연계성 분석)

  • Lim, Jeong-Hoon
    • Journal of Korean Library and Information Science Society / v.53 no.2 / pp.333-353 / 2022
  • This study analyzed curriculum relevance by applying topic modeling to book reports written as content area reading activities in a 'physics' class. To carry out the research, 332 physics book reports were collected to analyze the relevance among keywords, and topics were extracted using STM. The analysis showed that the main keywords of the physics book reports were 'thought', 'content', 'explain', 'theory', 'person', and 'understanding'. To examine the influence and connections of the derived keywords, the study presented degree centrality, betweenness centrality, and eigenvector centrality. As a result of the topic modeling analysis, eleven topics related to the physics curriculum were extracted, and curriculum linkage could be drawn in three subjects (Physics I, Physics II, Science History) and six areas (force and motion, modern physics, wave, heat and energy, Western science history, and What is science). The results can be used as evidence for a more systematic implementation of content area reading activities that reflect subject characteristics in the future.

Patent Technology Trends of Oral Health: Application of Text Mining

  • Hee-Kyeong Bak;Yong-Hwan Kim;Han-Na Kim
    • Journal of dental hygiene science / v.24 no.1 / pp.9-21 / 2024
  • Background: The purpose of this study was to use text network analysis and topic modeling to identify interconnected relationships among keywords present in patent information related to oral health, and subsequently to extract and visualize latent topics. By examining the main keywords and specific subjects, this study sought to comprehend the technological trends in oral health-related innovations. Furthermore, it aimed to serve as foundational material, suggesting directions for technological advancement in dentistry and dental hygiene. Methods: The data used in this study consisted of information registered over a 20-year period until July 31st, 2023, obtained from the patent information retrieval service KIPRIS. A total of 6,865 patent titles related to keywords such as "dentistry," "teeth," and "oral health" were collected through searches. The research tools included a custom-designed program coded specifically for the research objectives in Python 3.10. This program was used for keyword frequency analysis, semantic network analysis, and implementation of Latent Dirichlet Allocation for topic modeling. Results: In the analysis of connection centrality among the top 50 most frequent words, "method," "tooth," and "manufacturing" displayed the highest centrality, while "active ingredient" had the lowest. In the topic modeling results, the "implant" topic constituted the largest share at 22.0%, while the topics "devices and materials for oral health" and "toothbrushes and oral care" had the lowest proportions at 5.5% each. Conclusion: Technologies concerning methods and implants are continually being researched in patents related to oral health, while there is comparatively less technological development in devices and materials for oral health. This study is expected to be a valuable resource for uncovering potential themes from a large volume of patent titles and suggesting research directions.

Factors affecting the number of citations in papers published in the Journal of Korean Society of Dental Hygiene (한국치위생학회지 게재논문의 피인용수에 영향을 미친 요인)

  • Jeon, Se-Jeong
    • Journal of Korean society of Dental Hygiene / v.21 no.5 / pp.639-644 / 2021
  • Objectives: The purpose of this study was to analyze, based on previous studies, the factors that affected the number of citations of articles published in the Journal of Korean Society of Dental Hygiene. Methods: Information on papers, including the number of citations, was collected using a web crawling technique. The effects of the number of author keywords, the number of Medical Subject Headings (MeSH) keywords, the MeSH match rate, the abstract word count, and the keyword-abstract ratio on the number of citations were analyzed by multiple regression analysis. Results: The use of MeSH keywords did not have a significant effect on the number of citations. Among the other factors, only the keyword-abstract ratio was statistically significant. Conclusions: Select a topic of constant interest in the field, write the title in detail using colons or asterisks if necessary, and do not repeat the words used in the title in the keywords. Select specific keywords closely related to the topic. In particular, choose words or phrases that are frequently used in the abstract. If the MeSH keyword selection contradicts the previous strategies, boldly give up the MeSH keyword.

Comparison of Topics Related to Nurse on the Internet Portals and Social Media Before and During the COVID-19 era Using Topic Modeling (토픽 모델링을 활용한 COVID-19 발생 전후 간호사 관련 토픽 비교: 인터넷 포털과 소셜미디어를 중심으로)

  • Yoon, Young Mi;Kim, Seong Kwang;Kim, Hye Kyeong;Kim, Eun Joo;Jeong, Yuneui
    • Journal of muscle and joint health / v.27 no.3 / pp.255-267 / 2020
  • Purpose: The purpose of this study is to compare topics through keywords related to nurses in internet portals and social media before and during the coronavirus disease (COVID-19) era. Methods: For six months before and during the outbreak of COVID-19 in Korea, "nurse" was searched on the internet. For data collection, we implemented web crawlers in programming languages such as Python and collected keywords. The collected keywords were classified into three domains by topic modeling. Results: The keyword 'nurse' increased by 15% during the COVID-19 era. Before COVID-19, keywords that ranked high in Term Frequency-Inverse Document Frequency (TF-IDF) values included "nurse" and "C-section". During COVID-19, however, they included not only "nurse" but also pandemic-related keywords such as "emergency" and "gown". Conclusion: Various topics were being uploaded to internet media. Nursing professionals should take an interest in the text that appears in internet media and try to continuously identify and improve problems.

The Study on the Meaning Change of 'Startup' and 'Entrepreneurship' using the Bigdata-based Corpus Network Analysis (빅데이터 기반 어휘연결망분석을 활용한 '창업'과 '기업가정신'의 의미변화연구)

  • Kim, Yeonjong;Park, Sanghyeok
    • Journal of Korea Society of Digital Industry and Information Management / v.16 no.4 / pp.75-93 / 2020
  • The purpose of this study is to extract keywords for 'startup' and 'entrepreneurship' from Naver news articles in Korea since 1990 and Google news articles in foreign countries, and to understand the changes in the meaning of startup and entrepreneurship in each era. In summary, first, in terms of keyword frequency, the venture germination period showed government-led entrepreneurship exemplified by company chairmen, along with various technology investments, investments in establishing companies, and training for item development; in the venture re-leap period, youth-oriented entrepreneurship and innovation through the development of various educational programs were emphasized. Second, in the vocabulary network analysis, the network connection and centrality of keywords in the leap period tended to be stronger than in the germination period, but the re-leap period tended to return to the level of the germination period. Third, in the topic analysis, Naver keyword topics were mostly business-related content concerning support, policy, and education, whereas topics from Google News consisted of major keywords that are more specifically applicable to practical work.