• Title/Summary/Keyword: Topic Data

Extraction of Latent Topic-based Communities in Blogspace (블로그 월드에서 주제 중심의 잠재적 커뮤니티 추출 방안)

  • Shin, Jung-Hwan;Yoon, Seok-Ho;Kim, Sang-Wook;Park, Sun-Ju
    • Journal of KIISE:Databases / v.37 no.1 / pp.56-69 / 2010
  • In blogspace, there are posts that deal with a common topic and bloggers who are interested in these posts. In this paper, we define a blog community as a group of these bloggers and posts. With a blog community, we can establish various business policies for target marketing, sharing high-quality data, and mobilizing activity in the blogspace. Unlike internet cafes, blog communities have no explicit membership, so it is not easy to identify the members of a community. In this paper, we propose an effective approach for extracting a blog community that is related to a given topic. First, we choose seed posts that are highly related to a given topic, and use the seed posts to select bloggers that are related to the topic. Then, we use the selected bloggers to select posts that are related to the topic. By repeating this, we find all the posts and bloggers that are members of the community related to a given topic in blogspace. We verify the superiority of the proposed approach by analyzing extracted blog communities.
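
The alternating expansion described in this abstract (seed posts, then bloggers, then posts, repeated) can be sketched as follows. This is only an illustration: the abstract does not say how relatedness is scored, so the sketch assumes a simple link-count threshold (`min_links`), and the function name and the two lookup tables are hypothetical.

```python
from collections import Counter


def extract_community(seed_posts, post_to_bloggers, blogger_to_posts,
                      min_links=1, max_rounds=10):
    """Alternately expand posts -> bloggers -> posts until nothing new is added.

    ASSUMPTION: a blogger counts as "related" once linked to at least
    `min_links` community posts; the paper's actual criterion is not given.
    """
    posts, bloggers = set(seed_posts), set()
    for _ in range(max_rounds):
        # Count how many current community posts each blogger is linked to.
        link_counts = Counter(
            b for p in posts for b in set(post_to_bloggers.get(p, ()))
        )
        new_bloggers = {b for b, n in link_counts.items() if n >= min_links} - bloggers
        bloggers |= new_bloggers
        # Pull in every post linked to a selected blogger.
        new_posts = {p for b in bloggers for p in blogger_to_posts.get(b, ())} - posts
        posts |= new_posts
        if not new_bloggers and not new_posts:
            break  # fixed point reached
    return posts, bloggers


# Toy data (hypothetical): which bloggers interacted with which posts.
post_to_bloggers = {"p1": ["alice", "bob"], "p2": ["bob"], "p3": ["carol"]}
blogger_to_posts = {"alice": ["p1"], "bob": ["p1", "p2"], "carol": ["p3"]}

posts, bloggers = extract_community(["p1"], post_to_bloggers, blogger_to_posts)
```

Starting from the single seed post, the loop stops once a round adds neither posts nor bloggers, so unrelated posts and bloggers are never reached.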

Research Trend Analysis on Smart healthcare by using Topic Modeling and Ego Network Analysis (토픽모델링과 에고 네트워크 분석을 활용한 스마트 헬스케어 연구동향 분석)

  • Yoon, Jee-Eun;Suh, Chang-Jin
    • Journal of Digital Contents Society / v.19 no.5 / pp.981-993 / 2018
  • Smart healthcare is the convergence of ICT and healthcare services, and interdisciplinary research on it has been actively conducted in various fields. The objective of this study is to investigate trends in smart healthcare research using topic modeling and ego network analysis. Text analysis, frequency analysis, topic modeling, word cloud, and ego network analysis were conducted on the abstracts of 2,690 articles in Scopus from 2001 to April 2018. Topic modeling resulted in eight topics: "AI in healthcare", "Smart hospital", "Healthcare platform", "Blockchain in healthcare", "Smart health data", "Mobile healthcare", "Wellness care", and "Cognitive healthcare". To examine the topic modeling results more deeply, we conducted word cloud and ego network analyses for the eight topics. This study aims to identify trends in smart healthcare research and suggest implications for establishing future research directions.

Research trends over 10 years (2010-2021) in infant and toddler rearing behavior by family caregivers in South Korea: text network and topic modeling

  • In-Hye Song;Kyung-Ah Kang
    • Child Health Nursing Research / v.29 no.3 / pp.182-194 / 2023
  • Purpose: This study analyzed research trends in infant and toddler rearing behavior among family caregivers over a 10-year period (2010-2021). Methods: Text network analysis and topic modeling were employed on data collected from relevant papers, following the extraction and refinement of semantic morphemes. A semantic-centered network was constructed by extracting words from 2,613 English-language abstracts. Data analysis was performed using NetMiner 4.5.0. Results: Frequency analysis, degree centrality, and eigenvector centrality all revealed the terms "scale," "program," and "education" among the top 10 keywords associated with infant and toddler rearing behaviors among family caregivers. The keywords extracted from the analysis were divided into two clusters through cohesion analysis. Additionally, they were classified into two topic groups using topic modeling: "program and evaluation" (64.37%) and "caregivers' role and competency in child development" (35.63%). Conclusion: The roles and competencies of family caregivers are essential for the development of infants and toddlers. Intervention programs and evaluations are necessary to improve rearing behaviors. Future research should determine the role of nurses in supporting family caregivers. Additionally, it should facilitate the development of nursing strategies and intervention programs to promote positive rearing practices.

Research Trend on Diabetes Mobile Applications: Text Network Analysis and Topic Modeling (당뇨병 모바일 앱 관련 연구동향: 텍스트 네트워크 분석 및 토픽 모델링)

  • Park, Seungmi;Kwak, Eunju;Kim, Youngji
    • Journal of Korean Biological Nursing Science / v.23 no.3 / pp.170-179 / 2021
  • Purpose: The aim of this study was to identify core keywords and topic groups in the 'Diabetes mellitus and mobile applications' field of research to better understand research trends over the past 20 years. Methods: This was a text-mining and topic modeling study consisting of four steps: 'collecting abstracts', 'extracting and cleaning semantic morphemes', 'building a co-occurrence matrix', and 'analyzing network features and clustering topic groups'. Results: A total of 789 papers published between 2002 and 2021 were found in databases (Springer). Among them, 435 words were extracted from 118 articles selected according to the conditions: 'analyzed by text network analysis and topic modeling'. The core keywords were 'self-management', 'intervention', 'health', 'support', 'technique' and 'system'. Through the topic modeling analysis, four themes were derived: 'intervention', 'blood glucose level control', 'self-management' and 'mobile health'. The main topic of this study was 'self-management'. Conclusion: While more recent work has investigated mobile applications, the most prominent feature was related to self-management in diabetes care and prevention. Nursing interventions utilizing mobile applications are expected to serve not only as effective and powerful glycemic control and self-management tools but also as a means of patient-driven lifestyle modification.
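
The 'building a co-occurrence matrix' and 'analyzing network features' steps named in this abstract might look roughly like the sketch below. It assumes the abstracts are already tokenized into keyword lists and counts two keywords as co-occurring when they appear in the same abstract; the study's actual window and weighting are not stated, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count, for each unordered keyword pair, the number of documents containing both."""
    counts = Counter()
    for tokens in docs:
        # sorted(set(...)) gives each pair one canonical orientation per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def degree(counts):
    """Unweighted degree: number of distinct keywords each keyword co-occurs with."""
    deg = Counter()
    for a, b in counts:
        deg[a] += 1
        deg[b] += 1
    return deg


# Toy tokenized abstracts (hypothetical), using keywords mentioned in the abstract.
docs = [
    ["self-management", "intervention", "health"],
    ["self-management", "health"],
    ["intervention", "support"],
]
counts = cooccurrence(docs)
deg = degree(counts)
```

The sparse pair counter stands in for the matrix; keywords with a high degree are candidates for the "core keywords" that a network analysis would report.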

A Study on Automatic Generation Method of DDS Communication Class to Improve the Efficiency of Development of DDS-based Application Software (DDS 기반 응용 SW 개발의 효율성 향상을 위한 DDS 통신 클래스 자동생성 방법 연구)

  • Kim, Keun-hee;Kim, Ho-nyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.93-96 / 2017
  • DDS (Data Distribution Service) communication middleware is spreading to various parts of the private sector as well as the defense sector because it is highly effective in complex system environments in which multiple data producers and data consumers are connected by a network. However, application development using DDS middleware tends to be inefficient and to involve a lot of repetitive code, because most users perform a 1:1 mapping for each message they want to exchange. Accordingly, the user has to perform more unnecessary repetitive tasks as the number of topics increases. Therefore, a development support tool is required that identifies the series of processes needed to use DDS middleware and automatically generates the classes that are repeated for each topic. In this paper, we propose a method for DDS communication that automatically generates a common class for efficient use of DDS middleware.
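
To make the idea of generating the repeated per-topic class concrete, here is a minimal template-based sketch. The abstract gives no details of the proposed tool, so everything here is an assumption: the generated `...Channel` class, the `transport.send(topic, payload)` interface, and `RecordingTransport` are hypothetical stand-ins, and no real DDS API is used.

```python
from string import Template

# Template for the boilerplate that would otherwise be hand-written per topic.
CLASS_TEMPLATE = Template('''\
class ${name}Channel:
    """Generated wrapper for topic '${name}' (fields: ${fields})."""
    TOPIC = "${name}"
    FIELDS = (${field_tuple})

    def __init__(self, transport):
        self.transport = transport  # hypothetical object with send(topic, payload)

    def publish(self, **values):
        missing = set(self.FIELDS) - set(values)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        self.transport.send(self.TOPIC, {k: values[k] for k in self.FIELDS})
''')


def generate_class(name, fields):
    """Return Python source for one topic's wrapper class."""
    return CLASS_TEMPLATE.substitute(
        name=name,
        fields=", ".join(fields),
        field_tuple="".join(f"{f!r}, " for f in fields),
    )


class RecordingTransport:
    """Hypothetical stand-in for the middleware: records what would be sent."""

    def __init__(self):
        self.sent = []

    def send(self, topic, payload):
        self.sent.append((topic, payload))


# Generate and load the class for one topic, then publish a sample.
namespace = {}
exec(generate_class("Position", ["x", "y"]), namespace)
transport = RecordingTransport()
namespace["PositionChannel"](transport).publish(x=1.0, y=2.0)
```

Adding a topic then means adding one `(name, fields)` entry rather than writing another class by hand, which is the kind of repetition the abstract describes.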

Trend Analysis of Pet Plants Before and After COVID-19 Outbreak Using Topic Modeling: Focusing on Big Data of News Articles from 2018 to 2021

  • Park, Yumin;Shin, Yong-Wook
    • Journal of People, Plants, and Environment / v.24 no.6 / pp.563-572 / 2021
  • Background and objective: The ongoing COVID-19 pandemic restricted daily life, forcing people to spend time indoors. With the growing interest in mental health issues and residential environments, 'pet plants' have been receiving attention during the unprecedented social distancing measures. This study aims to analyze the change in trends of pet plants before and during the COVID-19 pandemic and provide basic data for studies related to pet plants and directions of future development. Methods: A total of 2,016 news articles using the keyword 'pet plants' were collected on Naver News from January 1, 2018 to August 15, 2019 (609 articles) and January 1, 2020 to August 15, 2021 (1,407 articles). The texts were tokenized into words using the KoNLPy package, ultimately yielding 63,597 words. The analyses included frequency of keywords and topic modeling based on Latent Dirichlet Allocation (LDA) to identify the inherent meanings of related words and each topic. Results: Topic modeling generated three topics in each period (before and during the COVID-19 pandemic), and the results showed that pet plants in daily life have become the object of 'emotional support' and 'healing' during social distancing. In particular, pet plants, which had been distributed as a solution to prevent solitary deaths and depression among seniors living alone, have now been extended to help resolve the social isolation of the general public suffering from COVID-19. The new term 'plant butler' became a new trend, and there was a shift toward people sharing their hobbies and information about pet plants and communicating with others online. Conclusion: Based on these findings, the trend data of pet plants before and after the outbreak of COVID-19 can provide the basis for activating research on pet plants and setting the direction for development of related industries considering the continuous popularity and trend of indoor gardening and green hobbies.

Feature selection for text data via topic modeling (토픽 모형을 이용한 텍스트 데이터의 단어 선택)

  • Woosol, Jang;Ye Eun, Kim;Won, Son
    • The Korean Journal of Applied Statistics / v.35 no.6 / pp.739-754 / 2022
  • Usually, text data consists of many variables, and some of them are closely correlated. Such multi-collinearity often results in inefficient or inaccurate statistical analysis. For supervised learning, one can select features by examining the relationship between target variables and explanatory variables. On the other hand, for unsupervised learning, since target variables are absent, one cannot use such a feature selection procedure as in supervised learning. In this study, we propose a word selection procedure that employs topic models to find latent topics. We substitute topics for the target variables and select terms which show high relevance for each topic. Applying the procedure to real data, we found that the proposed word selection procedure can give clear topic interpretation by removing high-frequency words prevalent in various topics. In addition, we observed that, by applying the selected variables to the classifiers such as naïve Bayes classifiers and support vector machines, the proposed feature selection procedure gives results comparable to those obtained by using class label information.
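
As a rough illustration of selecting terms by topic relevance, the sketch below takes an already-estimated topic-word distribution and keeps, for each topic, the words whose probability in that topic is highest relative to their average across topics. This lift-style score is an assumption made for the example; the abstract does not define the paper's relevance measure, and the function name and toy probabilities are hypothetical.

```python
def select_terms(topic_word, top_k=2):
    """Pick the top_k words per topic by lift = p(word | topic) / mean_t p(word | t).

    topic_word: {topic: {word: p(word | topic)}}, assumed to come from a
    fitted topic model. ASSUMPTION: lift stands in for the paper's relevance.
    """
    topics = list(topic_word)
    vocab = {w for dist in topic_word.values() for w in dist}
    # Average probability of each word across all topics.
    avg = {
        w: sum(topic_word[t].get(w, 0.0) for t in topics) / len(topics)
        for w in vocab
    }
    selected = {}
    for t in topics:
        lift = {w: p / avg[w] for w, p in topic_word[t].items() if p > 0}
        # Highest lift first; ties broken alphabetically for determinism.
        selected[t] = sorted(lift, key=lambda w: (-lift[w], w))[:top_k]
    return selected


# Toy topic-word distributions (hypothetical); "said" is frequent in both topics.
topic_word = {
    "sports": {"game": 0.4, "team": 0.3, "said": 0.3},
    "finance": {"market": 0.4, "stock": 0.3, "said": 0.3},
}
selected = select_terms(topic_word)
```

Because a word that is common in every topic has a lift near 1, it ranks below topic-specific words, which mirrors the abstract's point about removing high-frequency words that are prevalent in various topics.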

A Reply Graph-based Social Mining Method with Topic Modeling (토픽 모델링을 이용한 댓글 그래프 기반 소셜 마이닝 기법)

  • Lee, Sang Yeon;Lee, Keon Myung
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.640-645 / 2014
  • Many people use social network services to communicate, to share information, and to build social relationships with others on the Internet. Twitter is a representative service of this kind, where millions of tweets are posted a day and a huge amount of data has been accumulating. Social mining, which extracts meaningful information from such massive data, has been intensively studied. Typically, Twitter users can easily deliver and retweet content through following-follower relationships. Topic modeling of tweet data is a good tool for issue tracking in social media. To overcome the restrictions of short content in tweets, we introduce the notion of a reply graph, a graph structure whose nodes correspond to users and whose edges correspond to the existence of reply and retweet messages between the users. The LDA topic model, which is a typical method of topic modeling, is ineffective for short textual data. This paper introduces a topic modeling method that uses the reply graph to reduce the number of short documents and to improve the quality of mining results. The proposed model uses the LDA model as the topic modeling framework for tweet issue tracking. Some experimental results of the proposed method are presented for a collection of Twitter data covering 7 days.
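
One way a reply graph could reduce the number of short documents is to merge the tweets of users connected by reply or retweet edges into a single pseudo-document before running a topic model. The sketch below assumes that pooling rule (connected components of the user graph); the abstract does not say how the paper actually merges tweets, and the function name and toy data are hypothetical.

```python
def pool_by_reply_graph(tweets, edges):
    """Merge tweets of users in the same connected component of the reply graph.

    tweets: list of (user, text); edges: iterable of (user_a, user_b) pairs
    meaning a reply or retweet exists between the two users.
    ASSUMPTION: component-level pooling; the paper's rule may differ.
    """
    parent = {}

    def find(u):
        # Union-find lookup with path halving.
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in edges:
        parent[find(a)] = find(b)  # union the two users' components

    pooled = {}
    for user, text in tweets:
        pooled.setdefault(find(user), []).append(text)
    return [" ".join(texts) for texts in pooled.values()]


# Toy data (hypothetical): u1 and u2 exchanged a reply, u3 is isolated.
tweets = [
    ("u1", "lda topic"),
    ("u2", "short text"),
    ("u3", "weather today"),
    ("u1", "model"),
]
docs = pool_by_reply_graph(tweets, edges=[("u1", "u2")])
```

The four tweets collapse into two longer documents, which could then be passed to an LDA implementation in place of the individual tweets.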

Unstructured Data Processing Using Keyword-Based Topic-Oriented Analysis (키워드 기반 주제중심 분석을 이용한 비정형데이터 처리)

  • Ko, Myung-Sook
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.521-526 / 2017
  • The data formats of big data are diverse and vast, and the data is generated very quickly, so new management and analysis methods are required instead of traditional data processing methods. Text mining techniques can be used to extract useful information from unstructured text written in human language in online documents on social networks. Identifying trends in the political, economic, and cultural messages left on social media is a factor in understanding what topics people are interested in. In this study, text mining was performed on online news related to a given keyword using a topic-oriented analysis technique. We use Latent Dirichlet Allocation (LDA) to extract information from web documents and analyze which subjects are of interest for a given keyword and which topics are related to which core values.

Research Trends Analysis of Big Data: Focused on the Topic Modeling (빅데이터 연구동향 분석: 토픽 모델링을 중심으로)

  • Park, Jongsoon;Kim, Changsik
    • Journal of Korea Society of Digital Industry and Information Management / v.15 no.1 / pp.1-7 / 2019
  • The objective of this study is to examine the trends in big data. Research abstracts were extracted from 4,019 articles, published between 1995 and 2018, on Web of Science and were analyzed using topic modeling and time series analysis. The 20 single-term topics that appeared most frequently were as follows: model, technology, algorithm, problem, performance, network, framework, analytics, management, process, value, user, knowledge, dataset, resource, service, cloud, storage, business, and health. The 20 multi-term topics were as follows: sense technology architecture (T10), decision system (T18), classification algorithm (T03), data analytics (T17), system performance (T09), data science (T06), distribution method (T20), service dataset (T19), network communication (T05), customer & business (T16), cloud computing (T02), health care (T14), smart city (T11), patient & disease (T04), privacy & security (T08), research design (T01), social media (T12), student & education (T13), energy consumption (T07), supply chain management (T15). The time series data indicated that the 40 single-term topics and multi-term topics were hot topics. This study provides suggestions for future research.