• Title/Summary/Keyword: Keyword extraction

Search Result 189, Processing Time 0.025 seconds

Counseling Outcomes Research Trend Analysis Using Topic Modeling - Focus on 「Korean Journal of Counseling」 (토픽 모델링을 활용한 상담 성과 연구동향 분석 - 「상담학연구」 학술지를 중심으로)

  • Park, Kwi Hwa;Lee, Eun Young;Yune, So Jung
    • Journal of Digital Convergence
    • /
    • v.19 no.11
    • /
    • pp.517-523
    • /
    • 2021
  • The outcome of the consultation is important to both the counselor and the researcher. Analyzing the trends of research on the results of counseling that have been carried out so far will help to comprehensively structure the results of consultations. The purpose of this research is to analyze research trends in Korea, focusing on research related to the outcomes of counseling published in 「Korean Journal of Counseling」 from 2011 to 2021, which is one of the well-known academic journals in the field of counseling in Korea. This is to explore the direction of future research by navigating the knowledge structure of research. There were 197 studies used for analysis, and the final 339 keyword were extracted during the node extraction process and used for analysis. As a result of extracting potential topics using the LDA algorithm, "Measurement and evaluation of counseling outcomes", "emotions and mediate factors affecting interpersonal relationships", and "career stress and coping strategies" are the main topics. Identifying major topics through trend analysis of counseling performance research contributed to structuring counseling performance. In-depth research on these topics needs to continue thereafter.

Korean Part-Of-Speech Tagging by using Head-Tail Tokenization (Head-Tail 토큰화 기법을 이용한 한국어 품사 태깅)

  • Suh, Hyun-Jae;Kim, Jung-Min;Kang, Seung-Shik
    • Smart Media Journal
    • /
    • v.11 no.5
    • /
    • pp.17-25
    • /
    • 2022
  • Korean part-of-speech taggers decompose a compound morpheme into unit morphemes and attach part-of-speech tags. So, here is a disadvantage that part-of-speech for morphemes are over-classified in detail and complex word types are generated depending on the purpose of the taggers. When using the part-of-speech tagger for keyword extraction in deep learning based language processing, it is not required to decompose compound particles and verb-endings. In this study, the part-of-speech tagging problem is simplified by using a Head-Tail tokenization technique that divides only two types of tokens, a lexical morpheme part and a grammatical morpheme part that the problem of excessively decomposed morpheme was solved. Part-of-speech tagging was attempted with a statistical technique and a deep learning model on the Head-Tail tokenized corpus, and the accuracy of each model was evaluated. Part-of-speech tagging was implemented by TnT tagger, a statistical-based part-of-speech tagger, and Bi-LSTM tagger, a deep learning-based part-of-speech tagger. TnT tagger and Bi-LSTM tagger were trained on the Head-Tail tokenized corpus to measure the part-of-speech tagging accuracy. As a result, it showed that the Bi-LSTM tagger performs part-of-speech tagging with a high accuracy of 99.52% compared to 97.00% for the TnT tagger.

A Study on Follow-up Survey Methodology to Verify the Effectiveness of (<인생나눔교실> 사업의 효과 검증을 위한 추적 조사 방법론 연구 - 2017~2018년도 영상추적조사를 중심으로 -)

  • Lee, Dong Eun
    • Korean Association of Arts Management
    • /
    • no.53
    • /
    • pp.207-247
    • /
    • 2020
  • is a project for the senior generation with humanistic knowledge to become a mentor and communicate with them to present the wisdom and direction of life to the new generations of mentees based on various life experiences. has been expanding since 2015, starting with the pilot operation in 2014. In general, projects such as these are assessed to establish effectiveness indicators to verify effectiveness and to establish project management and development strategies. However, most of the evaluations have been conducted quantitatively and qualitatively based on the short-term duration of the project. Therefore, in the case of continuous projects such as , especially in the field of culture and arts where long-term effectiveness verification is required, the short-term evaluation is difficult to predict and judge the actual meaningful effects. In this regard, tried to examine the qualitative change of key participants in this project through the 2017 and 2018 image tracking survey. For this purpose, we adopted qualitative research methodology through interview video shooting, field shooting, and value coding as a research method suitable for the research subject. To analyze the results, first, the interview images were transcribed, keywords were extracted, value encoding works were matched with human psychological values, and the theoretical method was used to identify changes and to derive the meaning. In fact, despite the fact that the study conducted in this study was a follow-up survey, it remained a limitation that it analyzed the changed pattern in a rather short time of 2 years. However, this study systemized the specific methodology that researchers should conduct for follow-up and provided the flow of research at the present time when there is hardly a model for follow-up in the field of culture and arts education business in Korea as well as abroad. Significance can be derived from this point. In addition, it can be said that it has great significance in preparing the detailed system and case of comparative analysis methodology through value coding.

A study on the effect of tax evasion controversy on corporate values in internet news portals through big data analysis (빅데이터 분석을 통한 인터넷 뉴스 포털에서의 탈세 논란이 기업 가치에 미치는 영향 연구)

  • Lee, Sang-Min;Park, Myung-Ho;Kim, Byung-Jun;Park, Dae-Keun
    • Journal of Internet Computing and Services
    • /
    • v.22 no.6
    • /
    • pp.51-57
    • /
    • 2021
  • If a company's actions to save or avoid taxes are judged to be tax evasion rather than legal tax action by the tax authorities, the company will not only pay tax but also non-tax costs such as damage to corporate image and stock price decline due to a series of tax evasion-related news articles. Therefore, this study measures the frequency of occurrence of tax evasion controversial keywords in internet news portal as a factor to measure the severity of the case, and analyzes the effect of the frequency of occurrence on corporate value. In the Korean stock market, we crawl related articles from internet news portal by using keywords that are controversial for tax evasion targeting top companies based on market capitalization, and generate a time series of the frequency of occurrence of keywords about tax evasion by company and analyze the effect of frequency of appearance on book value versus market capitalization. Through panel regression and impulse response analysis, it is analyzed that the frequency of appearance has a negative effect on the market capitalization and the effect gradually decreases until 12 months. This study examines whether the tax evasion issue affects the corporate value of Korean companies and suggests that it is necessary to take these influences into account when entrepreneurs set up tax-planning schemes.

Movie Recommended System base on Analysis for the User Review utilizing Ontology Visualization (온톨로지 시각화를 활용한 사용자 리뷰 분석 기반 영화 추천 시스템)

  • Mun, Seong Min;Kim, Gi Nam;Choi, Gyeong cheol;Lee, Kyung Won
    • Design Convergence Study
    • /
    • v.15 no.2
    • /
    • pp.347-368
    • /
    • 2016
  • Recently, researches for the word of mouth(WOM) imply that consumers use WOM informations of products in their purchase process. This study suggests methods using opinion mining and visualization to understand consumers' opinion of each goods and each markets. For this study we conduct research that includes developing domain ontology based on reviews confined to "movie" category because people who want to have watching movie refer other's movie reviews recently, and it is analyzed by opinion mining and visualization. It has differences comparing other researches as conducting attribution classification of evaluation factors and comprising verbal dictionary about evaluation factors when we conduct ontology process for analyzing. We want to prove through the result if research method will be valid. Results derived from this study can be largely divided into three. First, This research explains methods of developing domain ontology using keyword extraction and topic modeling. Second, We visualize reviews of each movie to understand overall audiences' opinion about specific movies. Third, We find clusters that consist of products which evaluated similar assessments in accordance with the evaluation results for the product. Case study of this research largely shows three clusters containing 130 movies that are used according to audiences'opinion.

Understanding of Generative Artificial Intelligence Based on Textual Data and Discussion for Its Application in Science Education (텍스트 기반 생성형 인공지능의 이해와 과학교육에서의 활용에 대한 논의)

  • Hunkoog Jho
    • Journal of The Korean Association For Science Education
    • /
    • v.43 no.3
    • /
    • pp.307-319
    • /
    • 2023
  • This study aims to explain the key concepts and principles of text-based generative artificial intelligence (AI) that has been receiving increasing interest and utilization, focusing on its application in science education. It also highlights the potential and limitations of utilizing generative AI in science education, providing insights for its implementation and research aspects. Recent advancements in generative AI, predominantly based on transformer models consisting of encoders and decoders, have shown remarkable progress through optimization of reinforcement learning and reward models using human feedback, as well as understanding context. Particularly, it can perform various functions such as writing, summarizing, keyword extraction, evaluation, and feedback based on the ability to understand various user questions and intents. It also offers practical utility in diagnosing learners and structuring educational content based on provided examples by educators. However, it is necessary to examine the concerns regarding the limitations of generative AI, including the potential for conveying inaccurate facts or knowledge, bias resulting from overconfidence, and uncertainties regarding its impact on user attitudes or emotions. Moreover, the responses provided by generative AI are probabilistic based on response data from many individuals, which raises concerns about limiting insightful and innovative thinking that may offer different perspectives or ideas. In light of these considerations, this study provides practical suggestions for the positive utilization of AI in science education.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.2
    • /
    • pp.109-122
    • /
    • 2014
  • People are nowadays creating a tremendous amount of data on Social Network Service (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and now we live in the Age of Big Data. SNS Data is defined as a condition of Big Data where the amount of data (volume), data input and output speeds (velocity), and the variety of data types (variety) are satisfied. If someone intends to discover the trend of an issue in SNS Big Data, this information can be used as a new important source for the creation of new values because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and established to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) Provide the topic keyword set that corresponds to daily ranking; (2) Visualize the daily time series graph of a topic for the duration of a month; (3) Provide the importance of a topic through a treemap based on the score system and frequency; (4) Visualize the daily time-series graph of keywords by searching the keyword; The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including the removal of stop words, and noun extraction for processing various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to process rapidly a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational database. We built TITS based on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. In addition, MongoDB is an open source platform, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational database, there are no schema or tables with MongoDB, and its most important goal is that of data accessibility and data processing performance. In the Age of Big Data, the visualization of Big Data is more attractive to the Big Data community because it helps analysts to examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for the purpose of creating Data Driven Documents that bind document object model (DOM) and any data; the interaction between data is easy and useful for managing real-time data stream with smooth animation. In addition, TITS uses a bootstrap made of pre-configured plug-in style sheets and JavaScript libraries to build a web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system, and make the system available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.

An Exploratory Study on the Status of and Demand for Higher Education Programs in Fashion in Myanmar (미얀마의 패션 고등교육 현황과 수요에 대한 탐색적 연구)

  • Kang, Min-Kyung;Jin, Byoungho Ellie;Cho, Ahra;Lee, Hyojeong;Lee, Jaeil;Lee, Yoon-Jung
    • Journal of Korean Home Economics Education Association
    • /
    • v.34 no.3
    • /
    • pp.1-23
    • /
    • 2022
  • This study examined the perceptions of Myanmar university students and professors regarding the status and necessity of higher education programs in fashion. Data were collected from professors in textile engineering at Yangon Technological University and Myanmar university students. Closed- and open-ended questions were asked either through interviews or by email. The responses were analyzed using keyword extraction and categorization, and descriptive statistics(closed questions). Generally, the professors perceived higher education, as well as the cultural industries including art and fashion, as important for Myanmar's social and economic development. According to the students interests in pursuing a degree in textile were limited, despite the high interest in fashion. Low wages in the apparel industry and lack of fashion degrees that meet the demand of students were cited as reasons. The demand was high for educational programs in fashion product development, fashion design, pattern-making, fashion marketing, branding, management, costume history, and cultural studies. Students expected to find their future career in textiles and clothing factories. Many students wanted to be hired by global fashion brands for higher salaries and training for advanced knowledge and technical skills. They perceived advanced fashion education programs will have various positive effects on Myanmar's national economy.

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.155-174
    • /
    • 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) have been published. The rapid increase in the number of papers related to COVID-19 is putting time and technical constraints on healthcare professionals and policy makers to quickly find important research. Therefore, in this study, we propose a method of extracting useful information from text data of extensive literature using LDA and Word2vec algorithm. Papers related to keywords to be searched were extracted from papers related to COVID-19, and detailed topics were identified. The data used the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic, updated weekly. The research methods are divided into two main categories. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of publications related to COVID-19 by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. LDA and Word2vec algorithm were used to derive research topics related to COVID-19, and after analyzing related words, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, and a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment' were extracted. did For each collected paper, detailed topics were analyzed using LDA and Word2vec algorithms, and a clustering method through PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy point from the results of this study is that the topics that were not derived from the topics derived for all papers being researched in relation to COVID-19 (

    ) were the topic modeling results for each research topic (
    ) was found to be derived from For example, as a result of topic modeling for papers related to 'vaccine', a new topic titled Topic 05 'neutralizing antibodies' was extracted. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and is said to play an important role in the production of therapeutic agents and vaccine development. In addition, as a result of extracting topics from papers related to 'treatment', a new topic called Topic 05 'cytokine' was discovered. A cytokine storm is when the immune cells of our body do not defend against attacks, but attack normal cells. Hidden topics that could not be found for the entire thesis were classified according to keywords, and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using the Skip-gram method that predicts the similar words as the central word among the Word2vec models. The combination of the LDA model and the Word2vec model tried to show better performance by identifying the relationship between the document and the LDA subject and the relationship between the Word2vec document. In addition, as a clustering method through PCA dimension reduction, a method for intuitively classifying documents by using the t-SNE technique to classify documents with similar themes and forming groups into a structured organization of documents was presented. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of academic papers related to COVID-19, it will reduce the precious time and effort of healthcare professionals and policy makers, and rapidly gain new insights. We hope to help you get It is also expected to be used as basic data for researchers to explore new research directions.


  • (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.