• Title/Summary/Keyword: topic model

Search Result 870

Examining Suicide Tendency Social Media Texts by Deep Learning and Topic Modeling Techniques (딥러닝 및 토픽모델링 기법을 활용한 소셜 미디어의 자살 경향 문헌 판별 및 분석)

  • Ko, Young Soo;Lee, Ju Hee;Song, Min
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.32 no.3 / pp.247-264 / 2021
  • This study builds a deep learning-based classification model that identifies suicidal tendency, using a suicide corpus constructed for the study. To analyze suicide factors, the study also divides the suicide-tendency corpus into detailed topics using topic modeling, an analysis technique that automatically extracts topics. For this purpose, 2,011 suicide-related documents collected from the social media service Naver Knowledge iN were directly annotated as suicide-tendency or non-suicide-tendency documents, based on the suicide prevention education manual issued by the Central Suicide Prevention Center. The annotated corpus was then used to evaluate the performance of deep learning classification models (LSTM, BERT, ELECTRA). In addition, LDA, one of the topic modeling techniques, was used to identify suicide factors by grouping the documents by theme, and co-word analysis and visualization were conducted to analyze the factors in depth.

Comparison of policy perceptions between national R&D projects and standing committees using topic modeling analysis : focusing on the ICT field (토픽모델링 분석을 활용한 국가연구개발사업과제와 국회 상임위원회 사이의 정책 인식 비교 : ICT 분야를 중심으로)

  • Song, Byoungki;Kim, Sangung
    • Journal of Industrial Convergence / v.20 no.7 / pp.1-11 / 2022
  • In this paper, numerical values are derived using topic modeling, one of the data-based evaluation methodologies discussed by various research institutes. We focus on the ICT field to see whether policy perception differs between national R&D projects and standing committees. First, we create a model for classifying ICT documents by training a HAN model on R&D project data. We then perform LDA topic modeling on the ICT documents classified by the model and compare the distributions of the topics derived from the R&D project data and from the proceedings of standing committees. In total, 26 topics were derived. The R&D project data contained specialized topics, whereas the standing committees discussed relatively social and popular issues. Because the difference in perception can be confirmed numerically, the approach can serve as a basic study on indicators for future policy or project evaluation.

Understanding Black-Scholes Option Pricing Model

  • Lee, Eun-Kyung;Lee, Yoon-Dong
    • Communications for Statistical Applications and Methods / v.14 no.2 / pp.459-479 / 2007
  • Theories related to financial markets have received considerable attention from the statistics community. However, few courses on the topic are offered in statistics departments, because financial theories are entangled with many complicated mathematical and physical theories as well as ambiguously stated financial terminology. Based on our experience with the topic, we try to explain the rather complicated terminology and theories in easy-to-understand words. This paper briefly covers the basic terminology of derivatives, the Black-Scholes pricing idea, and related basic mathematical terminology.
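
The Black-Scholes pricing idea mentioned in this abstract can be illustrated with the standard closed-form price of a European option on a non-dividend-paying asset. The following is a minimal sketch, not code from the paper; the function names `norm_cdf` and `black_scholes` and the sample parameters are assumptions chosen for illustration.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot: float, strike: float, rate: float,
                  vol: float, maturity: float) -> tuple[float, float]:
    """Return (call, put) prices of a European option, no dividends.

    spot: current asset price; strike: exercise price;
    rate: continuously compounded risk-free rate;
    vol: annualized volatility; maturity: time to expiry in years.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * maturity)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


# Hypothetical at-the-money example: one year to expiry, 5% rate, 20% volatility.
call_price, put_price = black_scholes(100.0, 100.0, 0.05, 0.2, 1.0)
```

The two prices satisfy put-call parity, `call - put = spot - strike * exp(-rate * maturity)`, which is a convenient sanity check for any implementation.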

Development of Scaffolding Strategies Model by Information Search Process (ISP) (정보탐색과정(ISP)에 의한 스캐폴딩 전략 모형 개발)

  • Jeong-Hoon Lim
    • Journal of Korean Library and Information Science Society / v.54 no.1 / pp.143-165 / 2023
  • This study proposes a scaffolding strategy applicable to the information search process, using Kuhlthau's ISP model, which presents a design and implementation strategy for the mediation role in the learning process. To this end, the relevant literature was reviewed to categorize scaffolding strategies, and impressions were collected through student surveys after 150 middle school students in the Daejeon area took a project class to which the scaffolding strategy based on the ISP model was applied. The collected data were preprocessed into a form suitable for analysis so that word frequencies could be extracted, and topic analysis was performed using STM (Structural Topic Modeling). After determining the optimal number of topics and extracting topics for each stage of the ISP model, the extracted topics were classified into three types: cognitive domain-macro perspective, cognitive domain-micro perspective, and emotional domain perspective. In this process, we focused on cognitive verbs and emotional verbs among the words extracted through text mining, and presented a scaffolding strategy model for each topic by reviewing representative document cases. The results suggest that providing an appropriate scaffolding strategy at each stage of the ISP model can be expected to have a positive effect on learners' self-directed task solving.

Resolving Grammatical Marking Ambiguities of Korean: An Eye-tracking Study (안구운동 추적을 통한 한국어 중의성 해소과정 연구)

  • Kim Youngjin
    • Korean Journal of Cognitive Science / v.15 no.4 / pp.49-59 / 2004
  • An eye-tracking experiment was conducted to examine how grammatical marking ambiguities of Korean are resolved, and to evaluate predictions from the garden-path model and the constraint-based models on the processing of Korean morphological information. The complex NP clause structure, which can be parsed according to the minimal attachment principle, was compared to embedded relative clause structures that have the nominative marker (-ka), the delimiter (-man, which roughly corresponds to the English word 'only'), or the topic marker (-nun) on the first NP. The results clearly showed that Korean marking ambiguities are resolved by the minimal attachment principle, and that the topic marker affects reparsing procedures. The pattern of eye fixation times was more compatible with the garden-path model and was not consistent with the predictions of the constraint-based accounts. Suggestions for further studies were made.


A Development of LDA Topic Association Systems Based on Spark-Hadoop Framework

  • Park, Kiejin;Peng, Limei
    • Journal of Information Processing Systems / v.14 no.1 / pp.140-149 / 2018
  • Social data such as users' comments are unstructured in nature, and current technologies for analyzing such data are constrained by the available storage space and processing time when fast storage and processing are required. Moreover, it is difficult to analyze user features at high speed from a huge amount of dynamically generated social data. To solve this problem, we design and implement a topic association analysis system based on the latent Dirichlet allocation (LDA) model. LDA does not require a training process and thus can easily analyze social users' hourly interests in different topics. The proposed system is built on the Spark framework, which runs on top of a Hadoop cluster. It offers high-speed processing because access to the hard disk is minimized and all intermediate data are processed in main memory. In the performance evaluation, the system required about 5 hours to analyze the topics of about 1 TB of test social data (SNS comments). Moreover, by analyzing the associations among topics, we can track the hourly change in social users' interests in different topics.

A Study on the Analysis of R&D Trends and the Development of Logic Models for Autonomous Vehicles (자율주행자동차 R&D 동향분석과 논리모형 개발에 대한 연구)

  • Kim, Gil-Lae
    • Journal of Digital Convergence / v.19 no.5 / pp.31-39 / 2021
  • This study collected 1,870 English news articles on the research and development of autonomous vehicles in order to identify the issues emerging in that process at home and abroad, and conducted topic modeling after data preprocessing. Topic modeling yielded 20 topics, which we named and interpreted. A logic model for autonomous vehicle R&D projects was then presented by mapping the derived topics to the R&D process stages of input, activity, output, and outcome. The analysis results will be used as basic data to accurately determine the progress of domestic and foreign autonomous vehicle R&D projects and to prepare for rapidly changing technology development.

The development of knowledge service needs assessment model for small and medium-sized businesses (중소기업을 위한 지식서비스 수요 조사 모형 개발)

  • Maeng, Yun-ho;Yoo, Sun-Hi;Seo, Jinny
    • Knowledge Management Research / v.16 no.4 / pp.169-190 / 2015
  • Small and medium-sized enterprises have changed from simple subcontractors into more independent business entities, so the use of specialized knowledge has become much more necessary for survival in the market. However, it is difficult for small and medium-sized enterprises to invest sufficiently in knowledge services because their resources are limited relative to large enterprises, and demand for government-supported knowledge service programs is growing. For this reason, it is important in knowledge management to measure accurately the demand of small and medium-sized enterprises for knowledge services, so that those services can be used effectively. In this study, we comprehensively analyzed previous studies on knowledge services for small and medium-sized enterprises. As a result, we developed a knowledge service needs assessment model based on five critical success factors for continual growth and 12 types of knowledge service. The model was modified and supplemented through expert meetings using the Delphi method and through topic modeling analysis of secondary data. This study attempts to measure appropriately the knowledge services that small and medium-sized enterprises need, and so produced an evaluation model of knowledge service demand that comprehensively covers core knowledge services for many kinds of business entities. The developed model is expected to be a useful tool for understanding and evaluating the knowledge service demands of enterprises.

Cross-Domain Text Sentiment Classification Method Based on the CNN-BiLSTM-TE Model

  • Zeng, Yuyang;Zhang, Ruirui;Yang, Liang;Song, Sujuan
    • Journal of Information Processing Systems / v.17 no.4 / pp.818-833 / 2021
  • To address the problems of low precision, insufficient feature extraction, and poor use of context in existing text sentiment analysis methods, a hybrid CNN-BiLSTM-TE (convolutional neural network, bidirectional long short-term memory, and topic extraction) model is proposed. First, Chinese text data was converted into vectors through transfer learning with Word2Vec. Second, local features were extracted by the CNN model. Then, contextual information was extracted by the BiLSTM neural network, and the emotional tendency was obtained using softmax. Finally, topics were extracted using term frequency-inverse document frequency and K-means. Compared with the CNN, BiLSTM, and gated recurrent unit (GRU) models, the CNN-BiLSTM-TE model's F1-score was higher by 0.0147, 0.006, and 0.0052, respectively. Compared with the CNN-LSTM, LSTM-CNN, and BiLSTM-CNN models, the F1-score was higher by 0.0071, 0.0038, and 0.0049, respectively. Experimental results showed that the CNN-BiLSTM-TE model can effectively improve various indicators in application. Lastly, scalability was verified on a takeaway dataset, which is valuable for practical applications.
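
The topic extraction step in this abstract relies on term frequency-inverse document frequency (TF-IDF). The following is a minimal sketch of TF-IDF term ranking only, not the authors' pipeline: the K-means clustering step is omitted, whitespace tokenization is an assumption (the paper works with Chinese text, which would need a word segmenter), and the function name `tfidf_top_terms` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_top_terms(docs: list[str], k: int = 2) -> list[list[str]]:
    """Return the k highest-scoring TF-IDF terms for each document.

    TF is the term count divided by document length; IDF is
    log(number of documents / number of documents containing the term).
    Ties are broken alphabetically so the output is deterministic.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    top_terms = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        top_terms.append(ranked[:k])
    return top_terms


# Hypothetical takeaway-style reviews, for illustration only.
reviews = [
    "the food was cold",
    "the delivery was late",
    "the food was tasty",
]
keywords = tfidf_top_terms(reviews, k=2)
```

Terms that occur in every document, such as "the" and "was" here, receive an IDF of zero and drop out of the ranking, which is why TF-IDF is a common first filter before clustering.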

A Study on the Document Topic Extraction System Based on Big Data (빅데이터 기반 문서 토픽 추출 시스템 연구)

  • Hwang, Seung-Yeon;An, Yoon-Bin;Shin, Dong-Jin;Oh, Jae-Kon;Moon, Jin Yong;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.5 / pp.207-214 / 2020
  • With the growing use of smartphones and other electronic devices and the spread of the Internet and SNS, we live in a flood of information. The amount of information has grown exponentially, making it difficult to read through all of it, so more and more people want to see only the key terms of a document, and research on extracting the topics at the core of information is becoming more important. Extracting topics and comparing them with those of the past to infer current trends is also an important issue. Topic modeling techniques can be used to extract topics from a large volume of documents, and the extracted topics can be used in various fields such as trend prediction and data analysis. In this paper, we extract the topics of papers published over three years (2016, 2017, and 2018) in the field of computing using the LDA algorithm, one of the probabilistic topic modeling techniques, in order to analyze rapidly changing trends and keep pace with the times. We then analyze the trends and flows of research.
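
Several abstracts in this listing extract topics with LDA. As an illustration of how such extraction can work, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling. It is not the implementation used in any of the listed papers: the inference method, the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on pre-tokenized documents.

    Returns (vocab, doc_topic, topic_word), where doc_topic[d][k] counts
    tokens of document d assigned to topic k and topic_word[k][v] counts
    assignments of vocabulary word v to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_vocab for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(n_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[word]] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                wid = word_id[word]
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    return vocab, doc_topic, topic_word


# Hypothetical tokenized paper titles, for illustration only.
toy_docs = [
    ["cloud", "storage", "cluster", "cloud"],
    ["neural", "network", "training", "neural"],
    ["cluster", "storage", "hadoop"],
    ["network", "training", "deep"],
]
vocab, doc_topic, topic_word = lda_gibbs(toy_docs, n_topics=2)
```

The highest counts in each row of `topic_word` indicate the words that characterize a topic, and comparing `doc_topic` across documents grouped by year is one simple way to follow how topics shift over time. Production work would more likely use an established library than a hand-written sampler.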