• Title/Summary/Keyword: Text data

Using similarity based image caption to aid visual question answering (유사도 기반 이미지 캡션을 이용한 시각질의응답 연구)

  • Kang, Joonseo;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.191-204 / 2021
  • Visual Question Answering (VQA) and image captioning are tasks that require understanding both the visual features of images and the linguistic features of text. Co-attention, which connects image and text, may therefore be key to both tasks. In this paper, we propose a model that improves VQA performance by using image captions generated with a standard transformer model pretrained on the MSCOCO dataset. Because captions unrelated to the question can interfere with answering, only the captions most similar to the question were selected for use. In addition, because stopwords in a caption either contribute nothing to answering or interfere with it, the experiments were conducted after removing stopwords. Experiments on the VQA-v2 data compared the proposed model with the deep modular co-attention network (MCAN) model, which performs well by using co-attention between images and text. The proposed model outperformed the MCAN model.
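
The caption selection step in this abstract can be illustrated with a minimal Python sketch. The abstract does not state which similarity measure or stopword list the authors used, so the bag-of-words cosine similarity, the tiny `STOPWORDS` set, and the function names below are all assumptions made for illustration, not the paper's method.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the paper's list is not given).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "to", "what", "and", "with"}

def tokens(text):
    """Lowercase the text, keep alphabetic words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two token lists treated as count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def select_captions(question, captions, top_k=2):
    """Return the top_k captions most similar to the question."""
    q = tokens(question)
    ranked = sorted(captions, key=lambda c: cosine(q, tokens(c)), reverse=True)
    return ranked[:top_k]

# Made-up example inputs.
captions = [
    "A brown dog sleeping on a couch",
    "A plate of food on a table",
    "A dog playing in the park",
]
selected = select_captions("What color is the dog on the couch?", captions)
```

Here the unrelated caption about food shares no content words with the question and is filtered out, which is the effect the abstract describes.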

Major concerns regarding food services based on news media reports during the COVID-19 outbreak using the topic modeling approach

  • Yoon, Hyejin;Kim, Taejin;Kim, Chang-Sik;Kim, Namgyu
    • Nutrition Research and Practice / v.15 no.sup1 / pp.110-121 / 2021
  • BACKGROUND/OBJECTIVES: Coronavirus disease 2019 (COVID-19) cases were first reported in December 2019, in China, and an increasing number of cases have since been detected all over the world. The purpose of this study was to collect significant news media reports on food services during the COVID-19 crisis and identify public communication and significant concerns regarding COVID-19 for suggesting future directions for the food industry and services. SUBJECTS/METHODS: News articles pertaining to food services were extracted from the home pages of major news media websites such as BBC, CNN, and Fox News between March 2020 and February 2021. The retrieved data was sorted and analyzed using Python software. RESULTS: The results of text analytics were presented in the format of the topic label and category for individual topics. The food and health category presented the effects of the COVID-19 pandemic on food and health, such as an increase in delivery services. The policy category was indicative of a change in government policy. The lifestyle change category addressed topics such as an increase in social media usage. CONCLUSIONS: This study is the first to analyze major news media (i.e., BBC, CNN, and Fox News) data related to food services in the context of the COVID-19 pandemic. Text analytics research on the food services domain revealed different categories such as food and health, policy, and lifestyle change. Therefore, this study contributes to the body of knowledge on food services research, through the use of text analytics to elicit findings from media sources.

A Study on the Finding of Promising Export Items in Defense industry for Export Market Expansion-Focusing on Text Mining Analysis-

  • Yeo, Seoyoon;Jeong, Jong Hee;Kim, Seong Ho
    • Journal of the Korea Society of Computer and Information / v.27 no.10 / pp.235-243 / 2022
  • This paper aims to identify promising export items for expanding the defense export market. Germany, the UK, and France were selected as target export countries, and unstructured forecast data on each country's weapon system acquisition plans for the next ten years were obtained. Using TF-IDF in text mining analysis, keywords that appeared frequently in the data from the three countries were derived. As a result, keywords for each country's major acquisition projects were drawn. However, most of the derived keywords were related to mainstay weapon systems produced by each country's domestic defense companies. To discover promising export items from text mining, we propose grouping the derived keywords by similar weapon systems. In addition, we identify the weapon systems that all three countries plan to acquire. The results suggest that the current promising export item is a weapon system related to information systems. Prioritizing overseas demand using keywords makes it possible to set clear market entry goals. Domestic companies can establish specific entry strategies based on those needs, and relevant organizations can provide customized marketing support.
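
As a rough illustration of the TF-IDF step mentioned in this abstract, the sketch below scores terms per document with one common weighting (term frequency times log of inverse document frequency). The abstract does not say which TF-IDF variant or tooling was used, so the formula, the function names, and the toy token lists are assumptions.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs maps a document name to its token list; returns name -> {term: score}."""
    n = len(docs)
    df = Counter()
    for toks in docs.values():
        df.update(set(toks))  # document frequency: count each term once per document
    scores = {}
    for name, toks in docs.items():
        counts = Counter(toks)
        scores[name] = {
            t: (c / len(toks)) * math.log(n / df[t]) for t, c in counts.items()
        }
    return scores

def top_keywords(term_scores, k=3):
    """Return the k highest-scoring terms of one document."""
    return sorted(term_scores, key=term_scores.get, reverse=True)[:k]

# Toy token lists (made up, not taken from the paper's data).
docs = {
    "A": ["radar", "radar", "tank"],
    "B": ["tank", "drone"],
    "C": ["tank", "radar"],
}
scores = tf_idf(docs)
```

With this weighting, a term that appears in every document (here `tank`) scores zero, so terms specific to one document rise to the top. A different variant, such as smoothed IDF, would rank shared terms differently.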

Using Text Mining and Social Network Analysis to Identify Determinant Characteristics Affecting Consumers' Evaluation of Clothing Fit (텍스트 마이닝과 소셜 네트워크 분석 기법을 활용한 소비자의 의복 맞음새(Fit)평가에 영향을 미치는 특성)

  • Soo Hyun Hwang;Juyeon Park
    • Science of Emotion and Sensibility / v.26 no.1 / pp.101-114 / 2023
  • This research aimed to identify the determinant characteristics affecting consumers' clothing fit evaluation by employing text mining and social network analysis. To this end, we first extracted text data related to clothing fit from 2,000 consumer reviews collected from social network services and conducted semantic network analysis and CONCOR analysis. As a result, "pants" and "skirts" were the clothing items most commonly associated with consumers' clothing fit evaluation, and the length of clothing was the aspect most commonly examined. The "waist" and "hip" were the most critical body parts affecting consumers' perception of clothing fit. Further, the four keywords "wide," "large," "short," and "long" were the most frequently used in consumer reviews when evaluating clothing fit. This study is meaningful in that it specifically identified the structural relationships and semantic meanings of keywords relevant to consumers' evaluation of clothing fit, which could provide empirical reference information for improving clothing fit.

Analysis on the Trends of Research Themes of the Korean Dance Using Text Mining (텍스트 마이닝을 활용한 한국무용 연구주제 동향 분석)

  • Kim, Woo-Kyung;Yoo, Ji-Young
    • Journal of Korea Entertainment Industry Association / v.13 no.5 / pp.215-228 / 2019
  • The purpose of this study is to analyze trends in the research themes of Korean dance over the past 20 years using text mining. The study analyzed 3,047 words in 1,468 academic papers posted in the Research & Information Services Section (RISS). TEXTOM, a big data analysis solution, was used to refine and analyze the data, and keyword analysis and topic modeling were applied during the text mining process to obtain meaningful results. First, the theme of studies has shifted from the structure of basic Korean dance moves to the use and transmission of Korean dance. Second, the participants in studies of Korean dance have changed from middle-aged women to elderly women. Third, studies on dance records have declined. Fourth, Choi Seung-hee has consistently been a subject of interest. Fifth, the focus of studies has turned from Korean creative dance to Korean traditional dance. Sixth, there is no signature research theme leading academic trends, and the boundaries between research themes are unclear.

A Design and Implementation of The Deep Learning-Based Senior Care Service Application Using AI Speaker

  • Mun Seop Yun;Sang Hyuk Yoon;Ki Won Lee;Se Hoon Kim;Min Woo Lee;Ho-Young Kwak;Won Joo Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.4 / pp.23-30 / 2024
  • In this paper, we propose a deep learning-based personalized senior care service application. For user convenience, the proposed application uses Speech to Text technology to convert the user's speech into text and passes it as input to Autogen, an interactive multi-agent large language model developed by Microsoft. Autogen uses data from previous conversations between the senior and the ChatBot to understand the other user's intent and generate a response, and then uses a back-end agent to create a wish list, a shared calendar, and a greeting message in the other user's voice through a deep learning model for voice cloning. Additionally, the application can perform home IoT services with SKT's AI speaker (NUGU). The proposed application is expected to contribute to future AI-based senior care technology.

Implementation of 3-D Data Viewing System

  • Li, Jiangtao;Lee, Hyo-Jong
    • Proceedings of the IEEK Conference / 2008.06a / pp.749-750 / 2008
  • It is often necessary to display 3-D data on a 2-D screen in order to examine it and verify its validity. LIDAR data is a good example: it represents 3-D spatial information in text format. However, it is very difficult to examine such data on a 2-D screen. A 3-D data viewing system has been implemented and tested to solve this problem.

Visualization of XML Object (XML오브젝트의 가시화)

  • 김혜연;조동섭
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.272-274 / 2000
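  • We studied a method for visually presenting XML data using VRML. The current Web environment is evolving toward generating documents dynamically and presenting them graphically so that users can view them easily. In this environment, XML is widely used because it makes it easy to generate data in real time, but because it is text-based, it has the drawback that the data is hard to visualize for users. We therefore studied a method that combines VRML with XML to present data that changes in real time using a visualization tool such as VRML. In this paper, we use a Java Servlet to extract data from an XML document and generate VRML code, and deliver that code to the client side so that the data can be viewed visually.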
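
The XML-to-VRML conversion described in this abstract can be sketched briefly. The paper used a Java Servlet; the Python version below only illustrates the idea of extracting values from XML and emitting VRML text. The `<data>`/`<item value="...">` schema, the bar-chart layout, and the function name are hypothetical, since the abstract does not describe the actual document structure.

```python
import xml.etree.ElementTree as ET

def xml_to_vrml(xml_text):
    """Turn each <item value="..."> into a VRML box whose height is the value."""
    root = ET.fromstring(xml_text)
    lines = ["#VRML V2.0 utf8"]  # header line of a VRML97 file
    for i, item in enumerate(root.findall("item")):
        height = float(item.get("value"))
        # Place boxes 2 units apart on the x axis, resting on the ground plane.
        lines.append(
            f"Transform {{ translation {i * 2} {height / 2} 0 "
            f"children [ Shape {{ geometry Box {{ size 1 {height} 1 }} }} ] }}"
        )
    return "\n".join(lines)

# Hypothetical input document.
sample = '<data><item value="3"/><item value="1.5"/></data>'
vrml = xml_to_vrml(sample)
```

In a server setting, the generated text would be returned to the client, where a VRML viewer renders it, matching the flow in the abstract.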

Bio-Sensing Convergence Big Data Computing Architecture (바이오센싱 융합 빅데이터 컴퓨팅 아키텍처)

  • Ko, Myung-Sook;Lee, Tae-Gyu
    • KIPS Transactions on Software and Data Engineering / v.7 no.2 / pp.43-50 / 2018
  • Biometric information computing is greatly influencing both computing systems and big data systems built on bio-information systems that combine bio-signal sensors and bio-information processing. Unlike conventional data formats such as text, images, and videos, biometric information takes a complex form: text-based values give meaning to a bio-signal, important event moments are stored in an image format, and a video format is constructed for data prediction and analysis through time series analysis. Such complex data may be requested separately as text, image, or video depending on the characteristics of the data required by individual biometric information application services, or in several formats simultaneously depending on the situation. Because previous bio-information processing systems depend on conventional computing components, computing structures, and data processing methods, they have many inefficiencies in terms of data processing performance, transmission capability, storage efficiency, and system safety. In this study, we propose an improved bio-sensing convergence big data computing architecture to build a platform that effectively supports biometric information processing. The proposed architecture effectively supports data storage and transmission efficiency, computing performance, and system stability. It can also lay the foundation for system implementation and for optimizing biometric information services for future biometric information computing.

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho;Lee, Taewon;Li, Jingjing
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.187-204 / 2016
  • Document classification based on emotional polarity has become a welcome emerging task owing to the explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search for reviews through a search engine such as Google or on social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as data mining technology is widely accepted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques such as decision trees, neural networks, and support vector machines (SVMs). It is used to determine the attitude, position, and sensibility of people who write articles and present their opinions about various topics published on the Web. Regardless of their polarity, emotional reviews are very helpful material for analyzing customers' opinions. Sentiment analysis helps with quickly understanding what customers really want through automated text mining techniques, which it applies to text on the Web to extract subjective information. In this study, we developed a model that selects hot topics from user posts on China's online stock forum by using the k-means algorithm and a self-organizing map (SOM). In addition, we developed a detection model to predict hot topics by using machine learning techniques such as logit, decision trees, and SVM. We employed sentiment analysis to develop our model for the selection and detection of hot topics from China's online stock forum. The sentiment analysis calculates a sentiment value for a document based on contrast and classification according to a polarity sentiment dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movement by analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to apply sentiment analysis. One hundred forty-four topics were selected among the 21 categories at online forums about stocks. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics by preprocessing the text from March 2013 to February 2015. The interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results with this data. We developed a decision tree model to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel, with parameters chosen by a grid search, to detect the hot topics. The detection of hot topics by using sentiment analysis provides the latest trends and hot topics in the stock forum for investors so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful for rapidly determining customers' signals or attitudes towards government policy and firms' products and services.
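
The dictionary-based scoring mentioned in this abstract can be illustrated with a small Python sketch. The abstract does not give the scoring formula or the dictionary contents, so the normalized difference of positive and negative word counts, the two word sets, and the function name below are assumptions for illustration only.

```python
# Hypothetical polarity dictionary entries (not from the paper).
POSITIVE = {"gain", "rise", "profit", "strong"}
NEGATIVE = {"loss", "fall", "risk", "weak"}

def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return a score in [-1, 1]: (pos - neg) / (pos + neg), or 0.0 if no hits."""
    words = text.lower().split()
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0

# Made-up forum posts.
posts = ["strong profit despite risk", "loss and fall", "no opinion words here"]
post_scores = [sentiment_score(p) for p in posts]
```

Per-post scores of this kind could then be aggregated by topic and fed to a clustering or classification step such as the k-means and SVM models the abstract names.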