• Title/Summary/Keyword: Topic Data

A Study on the Categorizes of School Bullying through Topic Modelling Method (토픽모델링 기반의 학교폭력 사례 유형 연구)

  • Shin, Seungki
    • 한국정보교육학회:학술대회논문집
    • /
    • 2021.08a
    • /
    • pp.181-185
    • /
    • 2021
  • As part of an effort to derive measures to prevent school violence, a concern that schools continue to emphasize, this study examines recent school violence issues from a data science perspective. Posts related to school violence were crawled from online SNS data, and topic modeling was used to examine the characteristics of each type. Arranging the keywords of each derived topic by type divided the content into three main categories: prevention of school violence, punishment of perpetrators, and measures to be taken. The first category covers school violence prevention activities, in particular the role of specialized prevention organizations. The second covers measures and procedures for handling school violence. The third covers recent school violence issues. Future research should apply data-based prediction to solving pressing social problems.

A Study on Ontology and Topic Modeling-based Multi-dimensional Knowledge Map Services (온톨로지와 토픽모델링 기반 다차원 연계 지식맵 서비스 연구)

  • Jeong, Hanjo
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.79-92
    • /
    • 2015
  • Knowledge maps are widely used to represent knowledge in many domains. This paper presents a method for integrating national R&D data and helping users navigate the integrated data through a knowledge map service. The knowledge map service is built using a lightweight ontology and a topic modeling method. The national R&D data is integrated with the research project at its center, i.e., other R&D data such as research papers, patents, and reports are connected to the research project as its outputs. The lightweight ontology represents the simple relationships among the integrated data, such as project-output, document-author, and document-topic relationships. The knowledge map lets us infer further relationships such as co-author and co-topic relationships. To extract the relationships among the integrated data, a Relational Data-to-Triples transformer is implemented. A topic modeling approach is also introduced to extract the document-topic relationships. A triple store is used to manage and process the ontology data while preserving the network characteristics of the knowledge map service. Knowledge maps can be divided into two types: one is used in knowledge management to store, manage, and process an organization's data as knowledge; the other is used to analyze and represent knowledge extracted from science & technology documents. This research focuses on the latter. It introduces a knowledge map service for integrating the national R&D data obtained from the National Digital Science Library (NDSL) and the National Science & Technology Information Service (NTIS), the two major repositories and services for national R&D data in Korea. A lightweight ontology is used to design and build the knowledge map.
Using a lightweight ontology lets us represent and process knowledge as a simple network, which suits the navigation and visualization characteristics of a knowledge map. The lightweight ontology represents the entities and their relationships in the knowledge maps, and an ontology repository is created to store and process the ontology. In the ontology, researchers are implicitly connected through the national R&D data by author and performer relationships. A knowledge map displaying the researchers' network is created from the co-authoring relationships of national R&D documents and the co-participation relationships of national R&D projects. To sum up, a knowledge map service system based on topic modeling and ontology is introduced for processing knowledge about national R&D data such as research projects, papers, patents, project reports, and Global Trends Briefing (GTB) data. The system has three goals: 1) to integrate the national R&D data obtained from NDSL and NTIS, 2) to provide semantic and topic-based information search on the integrated data, and 3) to provide knowledge map services based on semantic analysis and knowledge processing. S&T information such as research papers, research reports, patents, and GTB is updated daily from NDSL, and R&D project information, including participants and outputs, is updated from NTIS. The S&T information and the national R&D information are obtained and merged into the integrated database. The knowledge base is constructed by transforming the relational data into triples that reference the R&D ontology. In addition, a topic modeling method is employed to extract the relationships between the S&T documents and the topic keywords representing them.
The topic modeling approach lets us extract the relationships and topic keywords based on semantics rather than on simple keyword matching. Lastly, we show an experiment on constructing the integrated knowledge base using the lightweight ontology and topic modeling, and we introduce the knowledge map services built on that knowledge base.
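
The relational-data-to-triples step and the inference of co-author relationships described in this abstract can be illustrated with a minimal sketch. It is not the paper's transformer: the table layout, the predicate names (`outputOf`, `hasAuthor`), and the sample rows are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

def rows_to_triples(table: str, key: str, rows: List[Dict[str, str]],
                    predicates: Dict[str, str]) -> List[Triple]:
    """Turn relational rows into (subject, predicate, object) triples.

    `predicates` maps a column name to the predicate used for that column.
    """
    triples = []
    for row in rows:
        subject = f"{table}:{row[key]}"
        for column, predicate in predicates.items():
            value = row.get(column)
            if value:
                triples.append((subject, predicate, value))
    return list(dict.fromkeys(triples))  # drop duplicates, keep order

def co_authors(triples: List[Triple]) -> Set[Tuple[str, str]]:
    """Infer co-author pairs: two authors attached to the same document."""
    authors_by_doc: Dict[str, List[str]] = {}
    for subject, predicate, obj in triples:
        if predicate == "hasAuthor":
            authors_by_doc.setdefault(subject, []).append(obj)
    pairs = set()
    for authors in authors_by_doc.values():
        for i, first in enumerate(authors):
            for second in authors[i + 1:]:
                pairs.add(tuple(sorted((first, second))))
    return pairs

# Hypothetical rows: one row per (paper, author) pair, linked to a project.
papers = [
    {"id": "P1", "project": "R100", "author": "Kim"},
    {"id": "P1", "project": "R100", "author": "Lee"},
    {"id": "P2", "project": "R100", "author": "Lee"},
]
triples = rows_to_triples("paper", "id", papers,
                          {"project": "outputOf", "author": "hasAuthor"})
pairs = co_authors(triples)
```

Here the co-author relationship is never stored in the source rows; it falls out of the triples, which is the kind of inference the abstract attributes to the knowledge map.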

Accelerated Learning of Latent Topic Models by Incremental EM Algorithm (점진적 EM 알고리즘에 의한 잠재토픽모델의 학습 속도 향상)

  • Chang, Jeong-Ho;Lee, Jong-Woo;Eom, Jae-Hong
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.12
    • /
    • pp.1045-1055
    • /
    • 2007
  • Latent topic models are statistical models that automatically capture, in a probabilistic way, salient patterns or correlations among the features underlying a data collection. They are gaining popularity as an effective tool for automatic semantic feature extraction from text corpora, for multimedia data analysis including image data, and for bioinformatics. An important issue in applying latent topic models to massive data sets is efficient learning of the model. This paper proposes an accelerated learning technique for the PLSA model, one of the popular latent topic models, that uses an incremental EM algorithm instead of the conventional EM algorithm. The incremental EM algorithm is characterized by a series of partial E-steps, each performed on a subset of the data collection, unlike the conventional EM algorithm, in which one batch E-step is performed over the whole data set. Replacing the single batch E-step and M-step with a series of partial E-steps and M-steps lets the inference result for one data subset be reflected directly in the inference for the next, which can speed up learning over the entire data set. The algorithm also has the advantages that it is guaranteed to converge to a local maximum and that it can be implemented with only slight modification of an existing implementation based on conventional EM. We present the basic application of the incremental EM algorithm to learning PLSA and empirically evaluate the acceleration with several possible data partitioning methods for practical use. Experimental results on a real-world news data set show that the proposed approach achieves a meaningful improvement in convergence rate when learning a latent topic model.
Additionally, we present an interesting result which supports a possible synergistic effect of the combination of incremental EM algorithm with parallel computing.
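
The idea of partial E-steps can be sketched as below. This is an illustrative pure-Python version and not the authors' implementation: the round-robin partitioning, the tiny smoothing constant, the initial batch E-step, and the toy count matrix are all assumptions made for the example.

```python
import random

def normalized(values, eps=1e-12):
    """Scale values to sum to 1; eps is a tiny smoothing term that avoids 0/0."""
    total = sum(values) + eps * len(values)
    return [(v + eps) / total for v in values]

def incremental_em_plsa(counts, n_topics, n_blocks=2, n_sweeps=50, seed=0):
    """Fit PLSA on a document-word count matrix using block-wise E-steps.

    Returns (p_z_d, p_w_z): per-document topic mixes and per-topic word
    distributions.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    # expected[d][z][w] = n(d, w) * P(z | d, w): document d's share of the
    # sufficient statistics; stats[z][w] is the running sum over documents.
    expected = [[[0.0] * n_words for _ in range(n_topics)] for _ in range(n_docs)]
    stats = [[0.0] * n_words for _ in range(n_topics)]

    def e_step(d):
        for w in range(n_words):
            if counts[d][w] == 0:
                continue
            post = normalized([p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)])
            for z in range(n_topics):
                new = counts[d][w] * post[z]
                stats[z][w] += new - expected[d][z][w]  # swap old share for new
                expected[d][z][w] = new

    for d in range(n_docs):  # one initial batch E-step fills the statistics
        e_step(d)
    # Assumed partitioning: round-robin assignment of documents to blocks.
    blocks = [list(range(b, n_docs, n_blocks)) for b in range(n_blocks)]
    for _ in range(n_sweeps):
        for block in blocks:
            for d in block:  # partial E-step on this block only
                e_step(d)
                p_z_d[d] = normalized([sum(expected[d][z]) for z in range(n_topics)])
            # M-step: refresh topic-word distributions from the running totals,
            # so the next block already sees this block's update.
            p_w_z = [normalized(row) for row in stats]
    return p_z_d, p_w_z

# Toy corpus: documents 0-1 use words 0-2, documents 2-3 use words 3-5.
counts = [[4, 3, 2, 0, 0, 0], [3, 4, 1, 0, 0, 0],
          [0, 0, 0, 2, 5, 3], [0, 0, 0, 3, 4, 2]]
p_z_d, p_w_z = incremental_em_plsa(counts, n_topics=2)
```

The difference from batch EM is confined to the inner loop: the M-step runs after every block rather than once per full pass, which is why the abstract describes it as a slight modification of a conventional EM implementation.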

Investigations on Techniques and Applications of Text Analytics (텍스트 분석 기술 및 활용 동향)

  • Kim, Namgyu;Lee, Donghoon;Choi, Hochang;Wong, William Xiu Shun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.42 no.2
    • /
    • pp.471-492
    • /
    • 2017
  • Demand for and interest in big data analytics are increasing rapidly. The concept of big data covers not only existing structured data but also various kinds of unstructured data such as text, images, videos, and logs. Among the various types of unstructured data, text has gained particular attention because it is the most representative means of describing and delivering information. Text analysis is generally performed in the following order: document collection, parsing and filtering, structuring, frequency analysis, and similarity analysis. The results of the analysis can be presented through word clouds, word networks, topic modeling, document classification, and semantic analysis. Notably, there is growing demand to identify trending topics in the rapidly increasing volume of text generated through social media. Research on and applications of topic modeling have therefore been actively carried out in various fields, since topic modeling can extract the core topics from a huge number of unstructured text documents and group the documents by topic. In this paper, we review the major techniques and research trends of text analysis. We also introduce application cases that solve problems in various fields by using topic modeling.
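
The ordering of steps named in this abstract (collection, parsing and filtering, structuring, frequency analysis, similarity analysis) can be illustrated with a small standard-library sketch. The stop-word list and the three sample documents are made up for the example, and cosine similarity over raw term counts is only one of several possible similarity measures.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is"}

def tokenize(text):
    """Parsing and filtering: lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def term_frequencies(docs):
    """Structuring and frequency analysis: one term-count vector per document."""
    return [Counter(tokenize(doc)) for doc in docs]

def cosine(a, b):
    """Similarity analysis: cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Document collection (hypothetical sample texts).
docs = [
    "Topic modeling extracts topics from text",
    "Text analysis uses topic modeling",
    "Robots exchange data over a network",
]
tf = term_frequencies(docs)
similar = cosine(tf[0], tf[1])    # the two documents share three terms
unrelated = cosine(tf[0], tf[2])  # no shared terms
```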

A study on the adaptive query conversion using TMDR-based global query (TMDR 기반의 글로벌 쿼리를 이용한 적응적 쿼리 변환에 관한 연구)

  • Hwang, Chi-Gon;Shin, Hyo-Young;Jung, Kye-Dong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.966-969
    • /
    • 2012
  • This study proposes a query conversion method based on the Topic Maps MetaData Registry (TMDR) to resolve heterogeneity among data distributed across networks and to integrate that data efficiently. To integrate distributed data, TMDR provides a global schema and resolves heterogeneity among local data through query conversion. Based on an analysis of the relationship between the Meta Schema Ontology (MSO) of the eXtended Meta Data Registry (XMDR) and Topic Maps, the method allows integrated access through the Meta Location (ML), which manages access information for local data. Processing consists of producing a global query with TMDR and then sending that global query to the systems distributed across the network, which enables integrated access. To this end, we propose a method for converting a global query into a query adapted to the local query.

Investigation of Research Trends in the D(Data)·N(Network)·A(A.I) Field Using the Dynamic Topic Model (다이나믹 토픽 모델을 활용한 D(Data)·N(Network)·A(A.I) 중심의 연구동향 분석)

  • Wo, Chang Woo;Lee, Jong Yun
    • Journal of the Korea Convergence Society
    • /
    • v.11 no.9
    • /
    • pp.21-29
    • /
    • 2020
  • Research on topic modeling, a methodology for deriving keywords from literature, has become active with the explosion of data brought by the transition to a digital society. The objective of this research is to investigate research trends in the D.N.A. (Data, Network, Artificial Intelligence) field using the DTM (Dynamic Topic Model). The DTM was applied to 1,519 research projects with SW·A.I technology classifications among ICT (Information and Communication Technology) field projects over six years (2015~2020). As a result, the D.N.A. technology keywords (Big data, Cloud, Artificial Intelligence) and the extended keywords (Unstructured, Edge Computing, Learning, Recognition) appeared every year, from which it can be inferred that these technologies are being researched broadly across other projects. The results of this paper are expected to be useful for future policy and R&D planning and for corporate technology and marketing strategy.

A Study on Intonation of the Topic in English Information Structure (영어 정보구조에서의 화제에 대한 억양 연구)

  • Lee, Yong-Jae;Kim, Hwa-Young
    • Speech Sciences
    • /
    • v.13 no.2
    • /
    • pp.87-105
    • /
    • 2006
  • Many researchers have studied the relationship between information structure and intonation. The arguments made so far about this relationship can be summarized as follows: the intonation of topic and focus in English information structure is represented as i) a pitch accent, ii) a tune (a pitch accent + an edge tone), or iii) a boundary tone. The purpose of this paper is to study various informational patterns of the topic in English information structure, using real TV discussion data. In this paper, topics are classified as contrastive or non-contrastive, based on contrastiveness. The results show that the intonation of the topic in English information structure is implemented as a pitch accent, not as a tune or a boundary tone. Of the non-contrastive topics, anaphoric determinative NP topics (Lnc, Lncd) are mainly represented as an H* pitch accent, while the pronoun topic (Lp) does not have a pitch accent. Of the contrastive topics, the semantically focused topic (Lci) is mainly represented as an H* pitch accent, while the contrastively focused topic (Lcc) is represented as both H* and L+H* pitch accents. This shows that a topic or focus carrying contrastive meaning is not always represented as an L+H* pitch accent, contrary to what previous research has argued.

Research on Service Enhancement Approach based on Super App Review Data using Topic Modeling (슈퍼앱 리뷰 토픽모델링을 통한 서비스 강화 방안 연구)

  • Jewon Yoo;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.27 no.2_2
    • /
    • pp.343-356
    • /
    • 2024
  • A super app is an application that provides a variety of services in a unified interface within a single platform. With the acceleration of digital transformation, super apps are becoming more prevalent. This study aims to suggest service enhancement measures by analyzing user review data from before and after the transition to a super app. To this end, user review data from a payment-based super app (Shinhan Play) were collected and studied via topic modeling. Moreover, a matrix for assessing the importance and usefulness of topics is introduced, which relies on the eigenvector centrality of the inter-topic network obtained through topic modeling and on the number of review recommendations. This allowed us to identify and categorize topics with high utility and impact. Prior to the transition, the factors contributing to user satisfaction included 'payment service,' 'additional service,' and 'improvement.' Following the transition, user satisfaction was associated with 'payment service' and 'integrated UX.' Conversely, dissatisfaction factors before the transition encompassed issues related to 'signup/installation,' 'payment error/response,' 'security authentication,' and 'security error.' Following the transition, user dissatisfaction arose from concerns regarding 'update/error response' and 'UX/UI.' The research results are expected to serve as a basis for establishing strategies to strengthen service competitiveness by making super app services more user-oriented.
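
A matrix built from eigenvector centrality and recommendation counts, as this abstract describes, might be sketched as follows. The topic names, network weights, recommendation counts, and the use of the mean as the quadrant threshold are all invented for illustration and are not taken from the study.

```python
def eigenvector_centrality(adj, iterations=100):
    """Power iteration on a symmetric adjacency matrix, scaled so the max is 1."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        # Multiply by (A + I): the shift keeps the same leading eigenvector
        # but avoids oscillation on bipartite networks.
        nxt = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        top = max(nxt)
        x = [v / top for v in nxt]
    return x

def quadrants(names, centrality, recommendations):
    """Place each topic in an (importance, usefulness) cell using mean thresholds."""
    mean_c = sum(centrality) / len(centrality)
    mean_r = sum(recommendations) / len(recommendations)
    return {
        name: ("high" if c >= mean_c else "low", "high" if r >= mean_r else "low")
        for name, c, r in zip(names, centrality, recommendations)
    }

# Hypothetical inter-topic network: T1 is linked to every other topic.
names = ["T1", "T2", "T3", "T4"]
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
recommendations = [40, 5, 30, 2]  # made-up recommendation counts per topic
centrality = eigenvector_centrality(adj)
matrix = quadrants(names, centrality, recommendations)
```

In this toy network T1 and T3 land in the high-importance, high-usefulness cell, T2 is central but rarely recommended, and T4 is low on both axes.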

Bridge for Exchange of Data and Service Invocation Between OPRoS and ROS (OPRoS-ROS간 데이터 교환 및 서비스 호출을 위한 브리지)

  • Lee, Ki Woon;Park, Hong Seong
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.22 no.2
    • /
    • pp.153-161
    • /
    • 2016
  • This paper proposes a bridge model for data exchange and service invocation between OPRoS and ROS platforms, shows the validity of the proposed model via applications, and compares the proposed model with the OPRoS platform and the ROS platform using performance measures such as data exchange time and service response time. The proposed model operates independently of OPRoS and ROS Platforms using its configuration file with mapping information among the OPRoS data/service port and the ROS topic/service. The configuration file makes easy connections between OPRoS data/service and ROS topic/service without changing the source code of the platform and components.
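
The role of the mapping configuration can be illustrated with a toy bridge that routes data by lookup alone. This sketch uses neither the OPRoS nor the ROS API; the port names, topic names, and the `Bridge` class are hypothetical stand-ins for the paper's configuration file and bridge model.

```python
from typing import Any, Callable, Dict, List

# Hypothetical mapping, standing in for the bridge's configuration file:
# OPRoS data port name -> ROS topic name.
PORT_TO_TOPIC = {"laser_out": "/scan", "pose_out": "/robot/pose"}

class Bridge:
    """Forwards data arriving on a mapped port to subscribers of the mapped topic."""

    def __init__(self, port_to_topic: Dict[str, str]) -> None:
        self.port_to_topic = dict(port_to_topic)
        self.subscribers: Dict[str, List[Callable[[Any], None]]] = {}

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self.subscribers.setdefault(topic, []).append(callback)

    def on_port_data(self, port: str, payload: Any) -> int:
        """Route one payload; returns how many subscribers received it."""
        topic = self.port_to_topic.get(port)
        if topic is None:
            return 0  # unmapped ports are ignored
        callbacks = self.subscribers.get(topic, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)

received: List[Any] = []
bridge = Bridge(PORT_TO_TOPIC)
bridge.subscribe("/scan", received.append)
delivered = bridge.on_port_data("laser_out", [1.0, 1.2])
ignored = bridge.on_port_data("unmapped_port", "x")
```

Because routing depends only on the mapping table, adding a connection means editing the configuration rather than the components, which matches the benefit the abstract claims.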

Identification of Convergence Trend in the Field of Business Model Based on Patents (특허 데이터 기반 비즈니스 모델 분야 융합 트렌드 파악)

  • Sunho Lee;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.27 no.3
    • /
    • pp.635-644
    • /
    • 2024
  • Although business model (BM) patents act as a creative bridge between technology and the marketplace, limited scholarly attention has been paid to the content analysis of BM patents. This study aims to contextualize converging BM patents by employing a topic modeling technique and clustering highly marketable topics, which are expressed through a topic-market impact matrix. We relied on BM patent data filed between 2010 and 2022 to derive empirical insights into the commercial potential of emerging business models. Nine topics were identified, including but not limited to "Data Analytics and Predictive Modeling" and "Mobile-Based Digital Services and Advertising." The 2x2 matrix makes it possible to position topics by topic growth rate and market impact, which is useful for prioritizing areas that require attention or are promising. This study differentiates itself by going beyond simple topic classification based on topic modeling and reorganizing the findings into a matrix format. The results of this study are expected to serve as a valuable reference for companies seeking to innovate their business models and enhance their competitive positioning.