• Title/Summary/Keyword: semantic mining

Systematic Review on Chatbot Techniques and Applications

  • Park, Dong-Min;Jeong, Seong-Soo;Seo, Yeong-Seok
    • Journal of Information Processing Systems, v.18 no.1, pp.26-47, 2022
  • Chatbots were an important research subject in the past. A chatbot is a computer program or an artificial intelligence program that participates in a conversation via auditory or textual methods. As research on chatbots progressed, some of the important issues surrounding them changed over time. Therefore, it is necessary to review the technology with a focus on recent advancements and core research technologies. In this paper, we introduce five different chatbot technologies: natural language processing, pattern matching, semantic web, data mining, and context-aware computing. We also introduce the latest technologies so that chatbot researchers can recognize the present situation and steer their work in the right direction.

Text Mining Analysis of the Online Counseling Contents of Nursery School Teachers (텍스트 마이닝을 활용한 어린이집교사 온라인 상담의 내용분석)

  • Jeon, Ji Won;Lim, Sun Ah;Jung, Yunhee
    • Korean Journal of Childcare and Education, v.16 no.6, pp.253-272, 2020
  • Objective: This study aimed to analyze the counseling contents of daycare center teachers by using text mining and semantic network analysis methods, in order to find the necessary support directions for daycare teachers and to improve the quality of child-care. Methods: Five hundred thirteen counseling cases recorded on the open bulletin board of an online counseling service (Naver Bands for Nursery Teacher Counseling) were collected, and frequency analysis, centrality solidarity analysis, and machine learning-based topic analysis were conducted using the NetMiner4.3 program. Results: First, 'teacher-to-child ratio' had the highest frequency. Second, 'colleagues' ranked high in all centrality analyses. Third, the machine learning-based topic analysis categorized the topics as 'childcare and education', 'working environment that supports professional development', and 'working condition', and among them, 'first-time teacher concerns' accounted for 44% of the total counseling content. Conclusion/Implications: This study implies that it is necessary to provide high-quality child-care and education to infants by lowering the 'teacher-to-child ratio', and that a systematic program is needed to help improve effective communication skills in interpersonal relationships, such as those with parents, fellow teachers, and principals. In addition, self-development and efforts to improve teachers' expertise should be prioritized in order to improve the quality of infant care and the quality of teachers.

MSFM: Multi-view Semantic Feature Fusion Model for Chinese Named Entity Recognition

  • Liu, Jingxin;Cheng, Jieren;Peng, Xin;Zhao, Zeli;Tang, Xiangyan;Sheng, Victor S.
    • KSII Transactions on Internet and Information Systems (TIIS), v.16 no.6, pp.1833-1848, 2022
  • Named entity recognition (NER) is an important basic task in the field of Natural Language Processing (NLP). Recently, deep learning approaches that extract word segmentation or character features have proved effective for Chinese Named Entity Recognition (CNER). However, since these approaches extract only some of the available features, they lack textual information mining from multiple perspectives and dimensions, so the model cannot fully capture semantic features. To tackle this problem, we propose a novel Multi-view Semantic Feature Fusion Model (MSFM). The proposed model mainly consists of two core components: the Multi-view Semantic Feature Fusion Embedding Module (MFEM) and the Multi-head Self-Attention Mechanism Module (MSAM). Specifically, the MFEM extracts character features, word boundary features, radical features, and pinyin features of Chinese characters. The acquired font shape, font sound, and font meaning features are fused to enhance the semantic information of Chinese characters at different granularities. Moreover, the MSAM is used to capture the dependencies between characters in a multi-dimensional subspace to better understand the semantic features of the context. Extensive experimental results on four benchmark datasets show that our method improves the overall performance of the CNER model.

Ontology and Text Mining-based Advanced Historical People Finding Service (온톨로지와 텍스트 마이닝 기반 지능형 역사인물 검색 서비스)

  • Jeong, Do-Heon;Hwang, Myunggwon;Cho, Minhee;Jung, Hanmin;Yoon, Soyoung;Kim, Kyungsun;Kim, Pyung
    • Journal of Internet Computing and Services, v.13 no.5, pp.33-43, 2012
  • The semantic web is used to construct advanced information services by exploiting semantic relationships between entities. Text mining can be applied to generate semantic relationships from unstructured data resources. This study proposes an ontology schema guideline, ontology instance generation, disambiguation of identical names by text mining, and an advanced historical people finding service based on reasoning. Various relationships between historical events, organizations, and people, which are created by domain experts, are linked to the literature of the National Institute of Korean History (NIKH). This improves the effectiveness of user access and enables an advanced people finding service based on relationships. In order to distinguish between people with the same name, we compare the structure, edges, and nodes of their personal social networks. To provide additional information, external resources including a thesaurus and the web are also linked to all related internal resources.
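
The same-name comparison described in this abstract could be sketched as follows. The abstract does not state which similarity measure the authors use, so this sketch swaps in Jaccard overlap of the nodes and edges of two personal networks as an assumption. The function names, the 0.5 threshold, and the placeholder records (`P1`, `P2`, ...) are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_person(ego_a, ego_b, threshold=0.5):
    """Guess whether two same-name records refer to one person.

    Each record is a dict with 'nodes' (names of connected people) and
    'edges' (frozenset pairs of connected people). The average of node
    overlap and edge overlap is compared with an assumed threshold.
    """
    node_sim = jaccard(ego_a["nodes"], ego_b["nodes"])
    edge_sim = jaccard(ego_a["edges"], ego_b["edges"])
    return (node_sim + edge_sim) / 2 >= threshold


# Hypothetical records sharing one name (placeholders, not NIKH data).
record_1 = {
    "nodes": {"P1", "P2", "P3"},
    "edges": {frozenset(("P1", "P2")), frozenset(("P2", "P3"))},
}
record_2 = {
    "nodes": {"P1", "P2", "P4"},
    "edges": {frozenset(("P1", "P2"))},
}
record_3 = {
    "nodes": {"P5", "P6"},
    "edges": {frozenset(("P5", "P6"))},
}

print(same_person(record_1, record_2))  # overlapping networks
print(same_person(record_1, record_3))  # disjoint networks
```

Averaging the two overlaps is only one possible design; a real service would likely weight relationship types defined in the ontology.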

A Study of Consumer Perception on Fashion Show Using Big Data Analysis (빅데이터를 활용한 패션쇼에 대한 소비자 인식 연구)

  • Kim, Da Jeong;Lee, Seunghee
    • Journal of Fashion Business, v.23 no.3, pp.85-100, 2019
  • This study examines changes in consumer perceptions of fashion shows, which are critical elements in the apparel industry and a means to represent a brand's image and originality. For this purpose, big data in clothing marketing, text mining, and semantic network analysis techniques were applied. This study aims to verify the effectiveness and significance of fashion shows in an effort to give directions for their future utilization. The study was conducted in two major stages. First, data were collected with the keyword "fashion shows" from websites including Naver and Daum between 2015 and 2018. The data collection period was divided into first-half and second-half periods. Next, Textom 3.0 was used for data refinement, text mining, and word cloud generation. Ucinet 6.0 and NetDraw were used for semantic network analysis, degree centrality, CONCOR analysis, and visualization. The level of interest in "models" was found to be the highest among the perception factors related to fashion shows in both periods. In the first-half period, consumer interest focused on detailed visual stimuli such as models and clothing, while in the second-half period, perceptions changed as the value of designers and brands was increasingly recognized over time. The findings of this study can be used as a tool to evaluate fashion shows, apparel industry sectors, and marketing methods. Additionally, they can be used as a theoretical framework for big data analysis and as a basis for strategies and research in industrial development.

Social perception of the Arduino lecture as seen in big data (빅데이터 분석을 통한 아두이노 강의에 대한 사회적 인식)

  • Lee, Eunsang
    • Journal of The Korean Association of Information Education, v.25 no.6, pp.935-945, 2021
  • The purpose of this study is to analyze the social perception of Arduino lectures using big data analysis. For this purpose, data from January 2012 to May 2021 were collected through the Textom website by searching for the keyword 'arduino + lecture' in blogs, cafes, and news channels on the NAVER website. The collected data were refined using the Textom website, and text mining analysis and semantic network analysis were performed using the Textom website, Ucinet 6, and Netdraw. Text mining analyses such as frequency analysis, TF-IDF analysis, and degree centrality confirmed that 'education' and 'coding' were the top keywords. CONCOR analysis, performed as part of the semantic network analysis, identified four clusters: 'Arduino-related education', 'Physical computing-related lecture', 'Arduino special lecture', and 'GUI programming'. Through this study, it was possible to confirm various meaningful social perceptions held by the general public about Arduino lectures on the Internet. The results of this study can serve as data with meaningful implications for instructors preparing Arduino lectures, researchers studying the subject, and policy makers who establish software or coding education and related policies.
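
Degree centrality, one of the measures named in this abstract, can be illustrated with a minimal sketch. It assumes an unweighted, undirected keyword network and the common normalization by n - 1; the study itself used Ucinet 6, whose exact settings are not given in the abstract. The edge list below is hypothetical and only borrows keyword names for readability.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected, unweighted graph.

    Each node's score is (number of distinct neighbours) / (n - 1),
    where n is the number of nodes that appear in the edge list.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical co-occurrence edges (not the study's data).
edges = [
    ("arduino", "education"),
    ("arduino", "coding"),
    ("arduino", "lecture"),
    ("education", "coding"),
]
centrality = degree_centrality(edges)
for keyword, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {score:.2f}")
```

In this toy network 'arduino' touches every other keyword, so it receives the maximum score of 1.0.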

Big Data Analysis on the Perception of Home Training According to the Implementation of COVID-19 Social Distancing

  • Hyun-Chang Keum;Kyung-Won Byun
    • International Journal of Internet, Broadcasting and Communication, v.15 no.3, pp.211-218, 2023
  • Due to the implementation of COVID-19 distancing, interest in and the number of users of 'home training' are rapidly increasing. Therefore, the purpose of this study is to identify the perception of 'home training' through big data analysis of social media channels and to provide basic data to the related business sector. Big data were collected from various news and social content provided on Naver and Google sites. Data for three years from March 22, 2020 were collected, based on the time when COVID-19 distancing was implemented in Korea. The collected data included 4,000 Naver blogs, 2,673 news, 4,000 cafes, 3,989 knowledge IN, and 953 Google channel news. TF and TF-IDF were analyzed through text mining, and semantic network analysis was then conducted on 70 keywords. Big data analysis programs such as Textom and Ucinet were used for social big data analysis, and NetDraw was used for visualization. In the text mining analysis, 'home training' had the highest TF, appearing 4,045 times, followed by 'exercise', 'Homt', 'house', 'apparatus', 'recommendation', and 'diet'. Regarding TF-IDF, the main keywords are 'exercise', 'apparatus', 'home', 'house', 'diet', 'recommendation', and 'mat'. Based on these results, 70 keywords with high frequency were extracted, and then semantic indicators and centrality analysis were conducted. Finally, CONCOR analysis grouped the keywords into a 'purchase cluster', 'equipment cluster', 'diet cluster', and 'execute method cluster'. Based on these four clusters, basic data for the 'home training' business sector were presented, drawing on consumers' main perceptions of 'home training' and the semantic network analysis.
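
The difference between the TF and TF-IDF rankings reported in this abstract can be illustrated with a small sketch. It assumes raw counts for TF and the textbook weighting count x log(N / df) summed over documents; the exact formula used by Textom is not given in the abstract and may differ. The tokenized posts are hypothetical, not the study's data.

```python
import math
from collections import Counter


def term_frequencies(docs):
    """Raw term frequency: total occurrences of each term over all documents."""
    tf = Counter()
    for tokens in docs:
        tf.update(tokens)
    return tf


def tf_idf(docs):
    """Corpus-level TF-IDF: sum over documents of count * log(N / df)."""
    n_docs = len(docs)
    df = Counter()  # number of documents containing each term
    for tokens in docs:
        df.update(set(tokens))
    scores = Counter()
    for tokens in docs:
        for term, count in Counter(tokens).items():
            scores[term] += count * math.log(n_docs / df[term])
    return scores


# Hypothetical, already-tokenized posts.
docs = [
    ["home", "training", "exercise", "mat"],
    ["home", "training", "diet"],
    ["home", "training", "exercise", "apparatus", "exercise"],
]
tf = term_frequencies(docs)
scores = tf_idf(docs)
print(tf.most_common(3))
print(max(scores, key=scores.get))
```

Because 'home' and 'training' occur in every toy document, their IDF is log(1) = 0, which is one plausible reason a search term can lead the TF ranking yet be missing from the TF-IDF ranking.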

A Study on the User Experience at Unmanned Cafe Using Big Data Analysis: Focus on text mining and semantic network analysis (빅데이터를 활용한 무인카페 소비자 인식에 관한 연구: 텍스트 마이닝과 의미연결망 분석을 중심으로)

  • Seung-Yeop Lee;Byeong-Hyeon Park;Jang-Hyeon Nam
    • Asia-Pacific Journal of Business, v.14 no.3, pp.241-250, 2023
  • Purpose - The purpose of this study was to investigate the perception of 'unmanned cafes' on the network through big data analysis, and to identify the latest trends in rapidly changing consumer perception. The results are intended to serve as basic data for the revitalization of unmanned cafes and for differentiated marketing strategies. Design/methodology/approach - This study collected documents containing unmanned cafe keywords over about three years, and the collected data were analyzed with text mining techniques such as keyword frequency analysis, centrality analysis, and keyword network analysis. Findings - First, the top 10 words with a high frequency of appearance were identified in the order of unmanned cafes, unmanned cafes, start-up, operation, coffee, time, coffee machine, franchise, and robot cafes. Second, visualization of the semantic network confirmed that the keyword "unmanned cafe" was at the center of the keyword cluster. Research implications or Originality - Using big data to collect and analyze keywords with high web visibility, we sought to identify new issues and trends in the perception of unmanned cafes. That perception consists largely of keywords related to start-ups, indicating that mentions of unmanned cafes on the network mainly deal with start-up topics.

A Distributed Domain Document Object Management using Semantic Reference Relationship (SRR을 이용한 분산 도메인 문서 객체 관리)

  • Lee, Chong-Deuk
    • Journal of Digital Convergence, v.10 no.5, pp.267-273, 2012
  • Semantic relationships hierarchically structure the huge number of document objects, which are usually unformatted. However, it is very difficult to structure relevant data from various distributed application domains. This paper proposes a new object management method that serves distributed domain objects by using the semantic reference relationship. The proposed mechanism uses a profile structure to extract the semantic similarity of application domain objects and a joint matrix to decide the semantic relationship of the extracted objects. A simulation was performed to evaluate the proposed method, and the results show that it achieves better retrieval performance than the existing text mining method and information extraction method.

Discovering Meaningful Trends in the Inaugural Addresses of North Korean Leader Via Text Mining (텍스트마이닝을 활용한 북한 지도자의 신년사 및 연설문 트렌드 연구)

  • Park, Chul-Soo
    • Journal of Information Technology Applications and Management, v.26 no.3, pp.43-59, 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of North Korean new year addresses, one of the most important and authoritative documents publicly announced by the North Korean government. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and methods of analysis of the text mining techniques, confirmed by analysis of the data. We propose a procedure to find meaningful tendencies based on a combination of text mining, cluster analysis, and co-occurrence networks. To demonstrate the applicability and effectiveness of the proposed procedure, we analyzed the inaugural addresses of Kim Jong Un of North Korea from 2017 to 2019. The main results of this study show that trends in the North Korean national policy agenda can be discovered based on clustering and visualization algorithms. We found that the uncovered semantic structures of North Korean new year addresses closely follow major changes in the North Korean government's positions toward its own people as well as outside audiences such as the USA and South Korea.
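
The co-occurrence network step mentioned in this abstract could look roughly like the sketch below. It assumes that two terms co-occur when they appear in the same text unit (for example, a sentence) and that each pair is counted once per unit; the paper's actual windowing and weighting are not described in the abstract. The token lists are hypothetical and are not taken from the addresses.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for every unordered term pair, the number of units containing both."""
    counts = Counter()
    for tokens in docs:
        # sorted(set(...)) gives each pair a single canonical (a, b) key per unit
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical tokenized text units.
docs = [
    ["economy", "people", "construction"],
    ["economy", "people"],
    ["peace", "people"],
]
edges = cooccurrence_counts(docs)
for (a, b), weight in edges.most_common():
    print(f"{a} - {b}: {weight}")
```

The resulting weighted pairs can be read as the edges of a co-occurrence network, which a clustering step could then partition into topic groups.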