• Title/Summary/Keyword: Topic Data

Search Result: 1,587

Topic Modeling-Based Domestic and Foreign Public Data Research Trends Comparative Analysis (토픽 모델링 기반의 국내외 공공데이터 연구 동향 비교 분석)

  • Park, Dae-Yeong;Kim, Deok-Hyeon;Kim, Keun-Wook
    • Journal of Digital Convergence / v.19 no.2 / pp.1-12 / 2021
  • With the recent Fourth Industrial Revolution, the growth and value of big data are continuously increasing, and the government is actively working to open and utilize public data. However, the current situation still falls short of citizens' demand for public data use. It is therefore necessary to identify research trends in the public data field and seek directions for development. To understand these research trends, this study applied topic modeling, a widely used text mining technique. We collected domestic and foreign research papers containing the keyword 'Public data' (1,437 domestic, 9,607 overseas), performed topic modeling based on the LDA algorithm, and compared domestic and foreign public data research trends. Policy implications were then presented based on the analysis. Looking at the time series by topic, research in the fields of 'personal information protection', 'public data management', and 'urban environment' has increased in Korea. Overseas, research in the fields of 'urban policy', 'cell biology', 'deep learning', and 'cloud·security' was confirmed to be active.
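The LDA-based topic modeling step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small collapsed Gibbs sampler written with the Python standard library only, and the toy documents, the function names `fit_lda` and `top_words`, and the hyperparameters `alpha`, `beta`, and `iterations` are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: document-topic, topic-word, and tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=3):
    """Return up to n of the most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

# Toy corpus (made up for this sketch, not data from the study).
docs = [
    "public data open government policy".split(),
    "government policy public data portal".split(),
    "deep learning neural network model".split(),
    "neural network deep learning training".split(),
]
doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

In practice a library implementation would normally be preferred over a hand-written sampler; the sketch only shows how token-level topic assignments turn into per-document and per-topic counts.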

Semi-Automatic Ontology Generation about XML Documents using Data Mining Method (데이터 마이닝 기법을 이용한 XML 문서의 온톨로지 반자동 생성)

  • Gu Mi-Sug;Hwang Jeong-Hee;Ryu Keun-Ho;Hong Jang-Eui
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.299-308 / 2006
  • As XML has become the standard for exchanging web documents and public documentation, XML data are increasing in many areas. To retrieve information from XML documents efficiently, the ontology-based semantic web is emerging. Existing ontologies have been constructed manually, which is time-consuming and costly. Therefore, in this paper we propose a semi-automatic ontology generation technique using a data mining technique, association rules. Using the data mining algorithm, the proposed method determines the types and number of conceptual relationships and the ontology domain level needed for automatic ontology generation. By applying association rules to XML documents, we find the frequent patterns of XML tags in the documents and derive the conceptual relationships used to construct the ontology. Using the conceptual ontology domain level extracted by data mining, we implemented an ontology-based semantic web with XML Topic Maps (XTM) and the topic map engine TM4J.
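The idea of mining association rules over XML tags can be sketched as follows. This is a simplified illustration, not the method of the paper: it only considers rules between pairs of tags, and the sample documents, the function names `tag_sets` and `association_rules`, and the thresholds `min_support` and `min_confidence` are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

def tag_sets(xml_documents):
    """Turn each XML string into the set of tag names it contains."""
    return [{el.tag for el in ET.fromstring(doc).iter()} for doc in xml_documents]

def association_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Mine rules lhs -> rhs between single tags from pair co-occurrence."""
    n = len(transactions)
    item_count = Counter(tag for t in transactions for tag in t)
    pair_count = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of documents containing both tags
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Sample documents (made up for this sketch).
xml_documents = [
    "<book><title/><author/><publisher/></book>",
    "<book><title/><author/></book>",
    "<article><title/><author/><journal/></article>",
    "<book><title/><publisher/></book>",
]
rules = association_rules(tag_sets(xml_documents))
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

A rule such as `publisher -> book` suggests that one tag tends to appear within the context of another, which is the kind of frequent tag pattern a conceptual relationship could be derived from.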

Digital Transformation: Using D.N.A.(Data, Network, AI) Keywords Generalized DMR Analysis (디지털 전환: D.N.A.(Data, Network, AI) 키워드를 활용한 토픽 모델링)

  • An, Sehwan;Ko, Kangwook;Kim, Youngmin
    • Knowledge Management Research / v.23 no.3 / pp.129-152 / 2022
  • As a key infrastructure for digital transformation, the spread of the data, network, and artificial intelligence (D.N.A.) fields and the emergence of promising industries are laying the groundwork for active digital innovation throughout the economy. In this study, we applied a text mining methodology and derived major topics using, as input variables, the abstract, publication year, and research field of studies indexed in SCIE, SSCI, and A&HCI in the WoS database. First, main keywords were identified through TF and TF-IDF analysis based on word appearance frequency, and then topic modeling was performed using g-DMR. Because this topic model can use various types of variables as meta information, it was possible to explore meaning beyond simply deriving topics. According to the analysis results, business intelligence, manufacturing production systems, service value creation, telemedicine, and digital education were identified as major research topics in digital transformation. To summarize the results of topic modeling: 1) research on business intelligence has been actively conducted in all areas after COVID-19; 2) issues such as intelligent manufacturing solutions and metaverses have emerged in the manufacturing field, and the topic of production systems is receiving attention once again; and 3) although the topics can be viewed separately in terms of technology and service, interpreting them separately is undesirable because many studies comprehensively deal with services that combine the relevant technologies.
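The TF and TF-IDF keyword analysis mentioned in this abstract can be sketched in a few lines. The abstract does not state which weighting variant was used, so the formulas here (term frequency as count divided by document length, inverse document frequency as log(N / df)) are assumptions, as are the toy documents and the function names.

```python
import math
from collections import Counter

def term_frequency(docs):
    """Total count of each term across the whole corpus."""
    return Counter(term for doc in docs for term in doc)

def tf_idf(docs):
    """Per-document TF-IDF with tf = count / len(doc) and idf = log(N / df)."""
    n = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return scores

# Toy corpus (made up for this sketch).
docs = [
    "digital transformation business intelligence".split(),
    "digital manufacturing production system".split(),
    "digital education telemedicine service".split(),
]
tf = term_frequency(docs)
scores = tf_idf(docs)
print(tf.most_common(3))
print(scores[0])
```

A term that appears in every document, such as "digital" in the toy corpus, has high raw frequency but a TF-IDF score of zero, which is why the two measures are usually read together.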

Research Trends on Doctor's Job Competencies in Korea Using Text Network Analysis (텍스트네트워크 분석을 활용한 국내 의사 직무역량 연구동향 분석)

  • Kim, Young Jon;Lee, Jea Woog;Yune, So Jung
    • Korean Medical Education Review / v.24 no.2 / pp.93-102 / 2022
  • We use the concept of the "doctor's role" as a guideline for developing medical education programs for medical students, residents, and doctors. Therefore, we should regularly reflect on the times and social needs to develop a clear sense of that role. The objective of the present study was to understand the knowledge structure related to doctor's job competencies in Korea. We analyzed research trends related to doctor's job competencies in Korea Citation Index journals using text network analysis through an integrative approach focusing on identifying social issues. We finally selected 1,354 research papers related to doctor's job competencies from 2011 to 2020, and we analyzed 2,627 words through data pre-processing with the NetMiner ver. 4.2 program (Cyram Inc., Seongnam, Korea). We conducted keyword centrality analysis, topic modeling, frequency analysis, and linear regression analysis using NetMiner ver. 4.2 (Cyram Inc.) and IBM SPSS ver. 23.0 (IBM Corp., Armonk, NY, USA). As a result of the study, words such as "family," "revision," and "rejection" appeared frequently. In topic modeling, we extracted five potential topics: "topic 1: Life and death in medical situations," "topic 2: Medical practice under the Medical Act," "topic 3: Medical malpractice and litigation," "topic 4: Medical professionalism," and "topic 5: Competency development education for medical students." Although there were no statistically significant changes in the research trends for each topic over time, it is nonetheless known that social changes could affect the demand for doctor's job competencies.
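Keyword centrality analysis of the kind named in this abstract can be illustrated with a minimal sketch. The study used NetMiner; the code below is only a stand-in that computes normalized degree centrality on a keyword co-occurrence network, and the sample keyword lists and the function name `degree_centrality` are assumptions.

```python
from collections import defaultdict
from itertools import combinations

def degree_centrality(keyword_lists):
    """Normalized degree centrality in a keyword co-occurrence network."""
    neighbors = defaultdict(set)
    for keywords in keyword_lists:
        # Keywords that appear in the same paper are linked to each other.
        for a, b in combinations(sorted(set(keywords)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    # Degree divided by the maximum possible number of neighbors.
    return {word: len(linked) / (n - 1) for word, linked in neighbors.items()}

# Keyword lists per paper (made up for this sketch).
papers = [
    ["competency", "education", "professionalism"],
    ["education", "curriculum"],
    ["professionalism", "ethics", "education"],
]
centrality = degree_centrality(papers)
for word, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.2f}")
```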

Research on Railway Safety Common Data Model and DDS Topic for Real-time Railway Safety Data Transmission

  • Park, Yunjung;Kim, Sang Ahm
    • Journal of the Korea Society of Computer and Information / v.21 no.5 / pp.57-64 / 2016
  • In this paper, we propose the design of a railway safety common data model that provides a common transformation method for data collected from railway facility fields and sent to the real-time railway safety monitoring and control system. The common data model is divided into five abstract sub-models according to the characteristics of the data: 'StateInfoMessage', 'ControlMessage', 'RequestMessage', 'ResponseMessage', and 'ExtendedXXXMessage'. This model structure supports the acquisition of diverse heterogeneous data and their common conversion to the DDS (Data Distribution Service) format, so that data can be shared with the sub-systems of the real-time railway safety monitoring and control system. This paper presents the design of the common data model and its DDS Topic expression for DDS communication, along with two data transformation case studies that verify the model design.

Jointly Image Topic and Emotion Detection using Multi-Modal Hierarchical Latent Dirichlet Allocation

  • Ding, Wanying;Zhu, Junhuan;Guo, Lifan;Hu, Xiaohua;Luo, Jiebo;Wang, Haohong
    • Journal of Multimedia Information System / v.1 no.1 / pp.55-67 / 2014
  • Image topic and emotion analysis is an important component of online image retrieval, which has become very popular in the rapidly growing social media community. However, because of the gaps between images and texts, very little work in the literature detects an image's topics and emotions in a unified framework, even though topics and emotions are two levels of semantics that often work together to describe an image comprehensively. In this work, we propose a unified model, the Joint Topic/Emotion Multi-Modal Hierarchical Latent Dirichlet Allocation (JTE-MMHLDA) model, which extends the previous LDA, mmLDA, and JST models to capture topic and emotion information simultaneously from heterogeneous data. Specifically, a two-level graphical model is built so that topics and emotions are shared across the whole document collection. Experimental results on a Flickr dataset indicate that the proposed model efficiently discovers images' topics and emotions; it significantly outperforms the text-only system by 4.4% and the vision-only system by 18.1% in topic detection, and the text-only system by 7.1% and the vision-only system by 39.7% in emotion detection.

Mobile Device and Virtual Storage-Based Approach to Automatically and Pervasively Acquire Knowledge in Dialogues (모바일 기기와 가상 스토리지 기술을 적용한 자동적 및 편재적 음성형 지식 획득)

  • Yoo, Kee-Dong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.1-17 / 2012
  • The smartphone, one of the essential mobile devices in wide use today, can be applied very effectively to capture knowledge on the spot when combined with the pervasive functionality of cloud computing. Knowledge capture can also be automated effectively if the topic of the knowledge is identified automatically. Therefore, this paper suggests an interdisciplinary approach to acquiring knowledge automatically on the spot by combining text mining-based topic identification with cloud computing-based smartphone technology. The smartphone is used not only as a recorder that captures the knowledge possessor's dialogue, which serves as the knowledge source, but also as a sensor that collects the knowledge possessor's context data, which characterize the specific situation surrounding him or her. The support vector machine, a well-known and high-performing text mining algorithm, is applied to extract the topic of knowledge. By relating the topic and context data, a business rule can be formulated, and by aggregating the rule, the topic, the context data, and the dictated dialogue, a set of knowledge is automatically acquired.

Performance Analysis of TNS System for Improving DDS Discovery (DDS 검색 방식 개선을 위한 TNS 시스템 성능 분석)

  • Yoon, Gunjae;Choi, Jeonghyun;Choi, Hoon
    • The Journal of Korean Institute of Next Generation Computing / v.14 no.6 / pp.75-86 / 2018
  • The DDS (Data Distribution Service) specification defines a discovery method for finding participants and endpoints in a DDS network. The standard discovery mechanism uses multicast and finds all the endpoints in the network. Because it relies on multicasting, discovery may fail in a network that spans different segments. Another problem is that memory is wasted on storing information about all the endpoints. The Topic Name Service (TNS) solves these problems by unicasting only the endpoints that are required for communication. However, an extra delay is inevitable in the components of TNS, i.e., a front-end server, topic name servers, and a terminal server. In this paper, we analyze the performance of TNS. We measure the delay times in the TNS servers and the time required to receive endpoint information. The time to finish discovery and the number of endpoints received are compared with those of the standard discovery method.

An Exploratory Analysis of Online Discussion of Library and Information Science Professionals in India using Text Mining

  • Garg, Mohit;Kanjilal, Uma
    • Journal of Information Science Theory and Practice / v.10 no.3 / pp.40-56 / 2022
  • This paper aims to implement a topic modeling technique for extracting the topics of online discussions among library professionals in India. Topic modeling is an established text mining technique popularly used for modeling text data from Twitter, Facebook, Yelp, and other social media platforms. The present study modeled the online discussions of Library and Information Science (LIS) professionals posted on Lis Links. The text data of these posts was extracted using a program written in R with the package "rvest." The data was pre-processed to remove blank posts, posts with text in non-English scripts, punctuation, URLs, emails, etc. Topic modeling with the Latent Dirichlet Allocation algorithm was applied to the pre-processed corpus to identify each topic associated with the posts. The frequency of word occurrences in the text corpus was also calculated. The results showed that the most frequent words included: library, information, university, librarian, book, professional, science, research, paper, question, answer, and management. This shows that the LIS professionals actively discussed exams, research, and library operations on the forum of Lis Links. The study categorized the online discussions on Lis Links into ten topics, i.e. "LIS Recruitment," "LIS Issues," "Other Discussion," "LIS Education," "LIS Research," "LIS Exams," "General Information related to Library," "LIS Admission," "Library and Professional Activities," and "Information Communication Technology (ICT)." It was found that the majority of the posts belonged to "LIS Exams," followed by "Other Discussion" and "General Information related to Library."
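The pre-processing steps listed in this abstract can be sketched as follows. The study itself used R with "rvest"; this Python version is only an illustration, and the regular expressions, the function name `clean_posts`, the sample posts, and the use of a non-ASCII check as a proxy for non-English text are all assumptions.

```python
import re
import string

URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_posts(posts):
    """Drop blank and non-ASCII posts; strip URLs, emails, and punctuation."""
    cleaned = []
    for post in posts:
        if not post.strip():
            continue  # blank post
        if not post.isascii():
            continue  # assumption: non-ASCII text stands in for non-English text
        text = URL_RE.sub(" ", post)
        text = EMAIL_RE.sub(" ", text)
        text = text.translate(PUNCT_TABLE)
        # Lowercase and collapse runs of whitespace.
        text = " ".join(text.lower().split())
        if text:
            cleaned.append(text)
    return cleaned

# Sample posts (made up for this sketch).
posts = [
    "Any update on the LIS exam? See https://example.org/notice",
    "   ",
    "पुस्तकालय विज्ञान",
    "Contact me at someone@example.com for the syllabus!",
]
cleaned = clean_posts(posts)
print(cleaned)
```

URLs and emails are removed before punctuation so that their fragments are not left behind as stray tokens.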

Patent Technology Trends of Oral Health: Application of Text Mining

  • Hee-Kyeong Bak;Yong-Hwan Kim;Han-Na Kim
    • Journal of dental hygiene science / v.24 no.1 / pp.9-21 / 2024
  • Background: The purpose of this study was to use text network analysis and topic modeling to identify the relationships among keywords in patent information related to oral health, and then to extract and visualize latent topics. By examining major keywords and specific subjects, this study sought to comprehend the technological trends in oral health-related innovations. It also aimed to serve as foundational material suggesting directions for technological advancement in dentistry and dental hygiene. Methods: The data utilized in this study consisted of information registered over a 20-year period until July 31, 2023, obtained from the patent information retrieval service KIPRIS. A total of 6,865 patent titles related to keywords such as "dentistry," "teeth," and "oral health" were collected through the searches. The research tools included a custom-designed program coded specifically for the research objectives in Python 3.10. This program was used for keyword frequency analysis, semantic network analysis, and implementation of Latent Dirichlet Allocation for topic modeling. Results: Upon analyzing the centrality of connections among the top 50 frequently occurring words, "method," "tooth," and "manufacturing" displayed the highest centrality, while "active ingredient" had the lowest. Regarding topic modeling outcomes, the "implant" topic constituted the largest share at 22.0%, while topics concerning "devices and materials for oral health" and "toothbrushes and oral care" exhibited the lowest proportions at 5.5% each. Conclusion: Technologies concerning methods and implants are continually being researched in patents related to oral health, while there is comparatively less technological development in devices and materials for oral health. This study is expected to be a valuable resource for uncovering potential themes from a large volume of patent titles and suggesting research directions.