• Title/Summary/Keyword: SNS Big Data


An Efficient Damage Information Extraction from Government Disaster Reports

  • Shin, Sungho;Hong, Seungkyun;Song, Sa-Kwang
    • Journal of Internet Computing and Services, v.18 no.6, pp.55-63, 2017
  • One of the purposes of information technology (IT) is to support the human response to natural and social problems, such as natural disasters and the spread of disease, and to improve the quality of human life. Climate change is occurring worldwide, natural disasters threaten the quality of life, and human safety is no longer guaranteed. IT must be able to support tasks related to disaster response and, more importantly, should be used to predict and minimize future damage. In South Korea, damage data is collected by each local government and then aggregated by the central government. This data is included in the disaster reports that the central government discloses for each disaster, but the raw damage data is difficult to obtain, even for research purposes. Information extraction can be applied to the disaster reports to obtain it. In information extraction research, most extraction targets are web documents, commercial reports, SNS text, and similar sources; there is little research on information extraction from government disaster reports. These reports are mostly text, but their sentence structure differs greatly from that of news articles and commercial reports, so the features of government disaster reports must be considered carefully. This paper presents an information extraction method for South Korean government reports in the word format. The method is based on patterns and dictionaries and adds some ideas for tokenizing the damage expressions in the text. In the experiment, it achieves an F1 score of 80.2 on the test set, which is close to the state-of-the-art information extraction performance reported before recent deep learning algorithms were applied.
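As a rough illustration of pattern- and dictionary-based extraction scored with F1, the sketch below may help. It is not the authors' method: the English phrases, the `DAMAGE_TERMS` dictionary, and the function names are all hypothetical, and the real reports are Korean documents with their own tokenization issues.

```python
import re

# Hypothetical dictionary mapping surface phrases to normalized damage types.
DAMAGE_TERMS = {
    "houses flooded": "flooded_houses",
    "roads damaged": "damaged_roads",
}

# Pattern: a number (optionally with thousands separators) followed by a dictionary phrase.
PATTERN = re.compile(
    r"(\d[\d,]*)\s+(" + "|".join(re.escape(term) for term in DAMAGE_TERMS) + r")"
)


def extract_damage(text):
    """Return (damage_type, count) pairs found in the text."""
    return [
        (DAMAGE_TERMS[term], int(number.replace(",", "")))
        for number, term in PATTERN.findall(text)
    ]


def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over exact-match pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, `extract_damage("Heavy rain left 1,204 houses flooded and 37 roads damaged.")` yields one pair per damage phrase, and `f1_score` compares such pairs against a hand-labeled gold list.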

A Study on WT-Algorithm for Effective Reduction of Association Rules (효율적인 연관규칙 감축을 위한 WT-알고리즘에 관한 연구)

  • Park, Jin-Hee;Pi, Su-Young
    • Journal of Korea Society of Industrial Information Systems, v.20 no.5, pp.61-69, 2015
  • We live in a state of information overload, not merely a flood of information, because data pours in every day from mobile devices, online services, and social network services (SNS). A great deal of information already exists, and new information is created from moment to moment. Association analysis finds frequent itemsets, that is, itemsets whose frequency exceeds a minimum support. Its shortcoming is that the number of rules grows geometrically as the number of items increases, which makes it difficult to find the information we want. This paper therefore proposes the WT-algorithm, which represents the transaction data set as Boolean variable items and assigns a weight to each item, building on the Quine-McCluskey method used to simplify logic functions. Because the simplification works regardless of the number of items, the proposed algorithm can improve the efficiency of data mining by removing unnecessary rules.
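To make the minimum-support idea concrete, here is a minimal brute-force sketch of frequent itemset counting. It is not the WT-algorithm and does not use Quine-McCluskey simplification; the function name and the toy transactions are assumptions for illustration only. It also shows why candidates multiply as items are added, since every combination of items is a potential itemset.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets whose support exceeds min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            # Support = fraction of transactions containing every item of the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support > min_support:
                result[candidate] = support
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return result


# Toy data (hypothetical).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(transactions, min_support=0.4)
```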

Word-of-Mouth Effect for Online Sales of K-Beauty Products: Centered on China SINA Weibo and Meipai (K-Beauty 구전효과가 온라인 매출액에 미치는 영향: 중국 SINA Weibo와 Meipai 중심으로)

  • Liu, Meina;Lim, Gyoo Gun
    • Journal of Intelligence and Information Systems, v.25 no.1, pp.197-218, 2019
  • In addition to economic growth and national income increase, China is also experiencing rapid growth in consumption of cosmetics. About 67% of the total trade volume of Chinese cosmetics is made by e-commerce and especially K-Beauty products, which are Korean cosmetics are very popular. According to previous studies, 80% of consumer goods such as cosmetics are affected by the word of mouth information, searching the product information before purchase. Mostly, consumers acquire information related to cosmetics through comments made by other consumers on SNS such as SINA Weibo and Wechat, and recently they also use information about beauty related video channels. Most of the previous online word-of-mouth researches were mainly focused on media itself such as Facebook, Twitter, and blogs. However, the informational characteristics and the expression forms are also diverse. Typical types are text, picture, and video. This study focused on these types. We analyze the unstructured data of SINA Weibo, the SNS representative platform of China, and Meipai, the video platform, and analyze the impact of K-Beauty brand sales by dividing online word-of-mouth information with quantity and direction information. We analyzed about 330,000 data from Meipai, and 110,000 data from SINA Weibo and analyzed the basic properties of cosmetics. As a result of analysis, the amount of online word-of-mouth information has a positive effect on the sales of cosmetics irrespective of the type of media. However, the online videos showed higher impacts than the pictures and texts. Therefore, it is more effective for companies to carry out advertising and promotional activities in parallel with the existing SNS as well as video related information. It is understood that it is important to generate the frequency of exposure irrespective of media type. The positiveness of the video media was significant but the positiveness of the picture and text media was not significant. 
Due to the nature of information types, the amount of information in video media is more than that in text-oriented media, and video-related channels are emerging all over the world. In particular, China has made a number of video platforms in recent years and has enjoyed popularity among teenagers and thirties. As a result, existing SNS users are being dispersed to video media. We also analyzed the effect of online type of information on the online cosmetics sales by dividing the product type of cosmetics into basic cosmetics and color cosmetics. As a result, basic cosmetics had a positive effect on the sales according to the number of online videos and it was affected by the negative information of the videos. In the case of basic cosmetics, effects or characteristics do not appear immediately like color cosmetics, so information such as changes after use is often transmitted over a period of time. Therefore, it is important for companies to move more quickly to issues generated from video media. Color cosmetics are largely influenced by negative oral statements and sensitive to picture and text-oriented media. Information such as picture and text has the advantage and disadvantage that the process of making it can be made easier than video. Therefore, complaints and opinions are generally expressed in SNS quickly and immediately. Finally, we analyzed how product diversity affects sales according to online word of mouth information type. As a result of the analysis, it can be confirmed that when a variety of products are introduced in a video channel, they have a positive effect on online cosmetics sales. The significance of this study in the theoretical aspect is that, as in the previous studies, online sales have basically proved that K-Beauty cosmetics are also influenced by word-of-mouth. 
However, this study focused on media types. As in previous studies, both kinds of media have a positive impact on sales, but video proved to be more informative and influential than text, consistent with media richness. In addition, existing research on information direction holds that negative information has more influence; in this study the correlation was not significant for basic cosmetics, but the effect of negative information was large for color cosmetics. Trend-sensitive products such as color cosmetics are affected by fast word of mouth. In practical terms, the findings are expected to help shape sales and advertising strategies for K-Beauty cosmetics in China by distinguishing basic from color cosmetics. They also underline the importance of video advertising strategies on channels such as YouTube and one-person media. The results can serve as basic data for big data analysis aimed at understanding the Chinese cosmetics market and at establishing appropriate strategies and marketing by related companies.

A Study on the Revitalization of Tourism Industry through Big Data Analysis (한국관광 실태조사 빅 데이터 분석을 통한 관광산업 활성화 방안 연구)

  • Lee, Jungmi;Liu, Meina;Lim, Gyoo Gun
    • Journal of Intelligence and Information Systems, v.24 no.2, pp.149-169, 2018
  • Korea is currently accumulating a large amount of data in public institutions based on the public data open policy and the "Government 3.0". Especially, a lot of data is accumulated in the tourism field. However, the academic discussions utilizing the tourism data are still limited. Moreover, the openness of the data of restaurants, hotels, and online tourism information, and how to use SNS Big Data in tourism are still limited. Therefore, utilization through tourism big data analysis is still low. In this paper, we tried to analyze influencing factors on foreign tourists' satisfaction in Korea through numerical data using data mining technique and R programming technique. In this study, we tried to find ways to revitalize the tourism industry by analyzing about 36,000 big data of the "Survey on the actual situation of foreign tourists from 2013 to 2015" surveyed by the Korea Culture & Tourism Research Institute. To do this, we analyzed the factors that have high influence on the 'Satisfaction', 'Revisit intention', and 'Recommendation' variables of foreign tourists. Furthermore, we analyzed the practical influences of the variables that are mentioned above. As a procedure of this study, we first integrated survey data of foreign tourists conducted by Korea Culture & Tourism Research Institute, which is stored in the tourist information system from 2013 to 2015, and eliminate unnecessary variables that are inconsistent with the research purpose among the integrated data. Some variables were modified to improve the accuracy of the analysis. And we analyzed the factors affecting the dependent variables by using data-mining methods: decision tree(C5.0, CART, CHAID, QUEST), artificial neural network, and logistic regression analysis of SPSS IBM Modeler 16.0. The seven variables that have the greatest effect on each dependent variable were derived. 
As a result of data analysis, it was found that seven major variables influencing 'overall satisfaction' were sightseeing spot attraction, food satisfaction, accommodation satisfaction, traffic satisfaction, guide service satisfaction, number of visiting places, and country. Variables that had a great influence appeared food satisfaction and sightseeing spot attraction. The seven variables that had the greatest influence on 'revisit intention' were the country, travel motivation, activity, food satisfaction, best activity, guide service satisfaction and sightseeing spot attraction. The most influential variables were food satisfaction and travel motivation for Korean style. Lastly, the seven variables that have the greatest influence on the 'recommendation intention' were the country, sightseeing spot attraction, number of visiting places, food satisfaction, activity, tour guide service satisfaction and cost. And then the variables that had the greatest influence were the country, sightseeing spot attraction, and food satisfaction. In addition, in order to grasp the influence of each independent variables more deeply, we used R programming to identify the influence of independent variables. As a result, it was found that the food satisfaction and sightseeing spot attraction were higher than other variables in overall satisfaction and had a greater effect than other influential variables. Revisit intention had a higher ${\beta}$ value in the travel motive as the purpose of Korean Wave than other variables. It will be necessary to have a policy that will lead to a substantial revisit of tourists by enhancing tourist attractions for the purpose of Korean Wave. Lastly, the recommendation had the same result of satisfaction as the sightseeing spot attraction and food satisfaction have higher ${\beta}$ value than other variables. 
From this analysis, we found that 'food satisfaction' and 'sightseeing spot attraction' variables were the common factors to influence three dependent variables that are mentioned above('Overall satisfaction', 'Revisit intention' and 'Recommendation'), and that those factors affected the satisfaction of travel in Korea significantly. The purpose of this study is to examine how to activate foreign tourists in Korea through big data analysis. It is expected to be used as basic data for analyzing tourism data and establishing effective tourism policy. It is expected to be used as a material to establish an activation plan that can contribute to tourism development in Korea in the future.

Analysis of the Landscape Characteristics of Island Tourist Site Using Big Data - Based on Bakji and Banwol-do, Shinan-gun - (빅데이터를 활용한 섬 관광지의 경관 특성 분석 - 신안군 박지·반월도를 대상으로 -)

  • Do, Jee-Yoon;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture, v.49 no.2, pp.61-73, 2021
  • This study aimed to identify users' landscape perception and the landscape characteristics of an island tourist site by utilizing SNS data generated from their experiences. Using online text data and photo data, we analyzed how users perceive the main places and scenery on the islands and what characterizes the main scenery. The text data were analyzed with text mining and network structure analysis, and the photo data with a landscape identification model and color analysis. The results are as follows. First, frequency analysis of Bakji·Banwol-do topics yielded keywords for local landscapes such as 'Purple Bridge' and 'Doori Village', and analyzing them together revealed location, behavior, and landscape images. Second, network structure analysis allowed the connections between key keywords and keywords not otherwise derived to be examined more specifically, indicating that creating landscapes with color is contributing to regional revitalization. Third, the landscape identification model showed that, to create preferred landscapes around the main subjects 'Purple Bridge' and 'Doori Village', artificial elements should be excluded, and that setting viewpoints toward the sea and sky would be effective. Fourth, Bakji·Banwol-do were the first islands to be developed under the theme of color; the colors used in artificial facilities were similar to the surrounding environment and were harmonized through contrasting lightness and saturation values. This study used online data uploaded directly by visitors to identify users' perceptions and objects of the landscape. Furthermore, using both text and photo data to identify landscape perception and characteristics is significant in that it can show specifically which landscapes and resources visitors prefer and perceive. 
In addition, the use of quantitative big data analysis and qualitative landscape identification models in identifying visitors' perceptions of local landscapes will help them understand the landscape more specifically through discussions based on results.

Social Media Bigdata Analysis Based on Information Security Keyword Using Text Mining (텍스트마이닝을 활용한 정보보호 키워드 기반 소셜미디어 빅데이터 분석)

  • Chung, JinMyeong;Park, YoungHo
    • Journal of Korea Society of Industrial Information Systems, v.27 no.5, pp.37-48, 2022
  • With the development of digital technology, social issues are communicated through digital platforms such as SNS and shape public opinion. This study analyzed big data from Twitter, a world-renowned social network service, to identify public opinion. After collecting Twitter data for 14 keywords over the year 2021, we analyzed term frequency and the relationships among the keyword documents with the Pearson correlation coefficient, using data-mining techniques. Furthermore, six main topics at the center of the information security field in 2021 were derived through topic modeling with LDA (Latent Dirichlet Allocation). These results are expected to serve as basic data, especially for identifying key agendas when related industries establish next-step strategies or when government policies are set.
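The term-frequency and Pearson-correlation step can be sketched as follows. This is only an illustration under simple assumptions (whitespace tokenization, one concatenated document per keyword); the function names are hypothetical and the study's actual preprocessing is not described in the abstract.

```python
from collections import Counter
from math import sqrt


def term_frequencies(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def keyword_correlation(doc_a, doc_b):
    """Correlate the term-frequency vectors of two keyword documents."""
    tf_a, tf_b = term_frequencies(doc_a), term_frequencies(doc_b)
    vocabulary = sorted(set(tf_a) | set(tf_b))
    # Counter returns 0 for terms missing from a document.
    return pearson([tf_a[w] for w in vocabulary], [tf_b[w] for w in vocabulary])
```

Two keyword documents that use the same terms in the same proportions correlate near 1, while documents with opposite term profiles correlate near -1.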

User satisfaction analysis for layer-specific differences using the IoT services (IoT 서비스를 사용하는 사용자 계층별 차이에 대한 만족도 분석)

  • Park, Chong-Woon;Kwon, Chang-Hui
    • Journal of the Korea Institute of Information and Communication Engineering, v.21 no.1, pp.90-98, 2017
  • Since 2010, the explosive spread of smartphones has made SNS a popular advertising platform, and the era of the Internet of Things (IoT) is already arriving. This paper focuses on analyzing differences in satisfaction across user groups among people who have used location-based IoT services (beacon, O2O). To develop the paper, we review the concept, types, and overall utilization of typical IoT services currently offered. The hypotheses were built on four factors (ease of use, attractiveness, reliability, and value) reconstructed from Peter Morville's User Experience Honeycomb model and were examined through a survey. The results are expected to serve as basic data that helps companies providing IoT services offer more accurate personalized services according to differences in user satisfaction.

A Study on the Introduction of Data Trusts System to Expand the Rights of Privacy Self-Determination (개인정보 자기결정권 확대를 위한 데이터 신탁제도 도입 방안 연구)

  • Jang, Keunjae;Lee, Seungyong
    • Journal of Intelligence and Information Systems, v.28 no.1, pp.29-43, 2022
  • With the advent of the Internet and the development of mobile digital devices such as smartphones and tablet PCs, the communication service paradigm began to shift from voice services to data services. Recently, as social network services (SNS) have become widespread and Fourth Industrial Revolution technologies centered on ICT (Information and Communication Technologies), such as Big Data, Blockchain, Cloud, and 5G/6G, have developed rapidly, both the types and the amount of shared data are increasing rapidly. As the transition to a digital society accelerates, the use of data and the economic and social value of personal information are becoming increasingly important. As a result, policies to revitalize the data industry and ways to efficiently obtain, analyze, and utilize increasingly diverse and vast data, as well as to protect and guarantee the rights of information subjects (providers), are being actively discussed around the world in fields such as society, culture, economy, and politics. In this paper, a differentiated data trusts system is considered and proposed in order to strengthen the right of information subjects to self-determination over the personal information they produce, and further to expand the safe use of data and the data economy. In addition, the components and procedures necessary to operate the data trusts system efficiently in Korea were considered, and a non-profit data trusts system and a for-profit data trusts system were considered as ways to operate it flexibly. Furthermore, the legal items necessary for implementing the data trusts system were investigated. To propose a domestic data trusts system, cases related to existing data trusts systems in countries such as the United States, Japan, and Korea were reviewed and analyzed. 
In addition, in order to prepare legislation necessary for the data trusts system, data-related laws in major countries and domestic legal and policy trends were reviewed to study the rights that conflict or overlap with existing laws, and differences were investigated and considered. The Data trusts system proposed in this paper is a reasonable system that is expected to recognize the asset value of data in the capitalist market economy system, to provide legitimate compensation for data produced by data subjects, and further to contribute greatly to the use of safe data and creation of a new service market.

Determinants of Mobile Application Use: A Study Focused on the Correlation between Application Categories (모바일 앱 사용에 영향을 미치는 요인에 관한 연구: 앱 카테고리 간 상관관계를 중심으로)

  • Park, Sangkyu;Lee, Dongwon
    • Journal of Intelligence and Information Systems, v.22 no.4, pp.157-176, 2016
  • For a long time, mobile phone had a sole function of communication. Recently however, abrupt innovations in technology allowed extension of the sphere in mobile phone activities. Development of technology enabled realization of almost computer-like environment even on a very small device. Such advancement yielded several forms of new high-tech devices such as smartphone and tablet PC, which quickly proliferated. Simultaneously with the diffusion of the mobile devices, mobile applications for those devices also prospered and soon became deeply penetrated in consumers' daily lives. Numerous mobile applications have been released in app stores yielding trillions of cumulative downloads. However, a big majority of the applications are disregarded from consumers. Even after the applications are purchased, they do not survive long in consumers' mobile devices and are soon abandoned. Nevertheless, it is imperative for both app developers and app-store operators to understand consumer behaviors and to develop marketing strategies aiming to make sustainable business by first increasing sales of mobile applications and by also designing surviving strategy for applications. Therefore, this research analyzes consumers' mobile application usage behavior in a frame of substitution/supplementary of application categories and several explanatory variables. Considering that consumers of mobile devices use multiple apps simultaneously, this research adopts multivariate probit models to explain mobile application usage behavior and to derive correlation between categories of applications for observing substitution/supplementary of application use. 
The research adopts several explanatory variables including sociodemographic data, user experiences of purchased applications that reflect future purchasing behavior of paid applications as well as consumer attitudes toward marketing efforts, variables representing consumer attitudes toward rating of the app and those representing consumer attitudes toward app-store promotion efforts (i.e., top developer badge and editor's choice badge). Results of this study can be explained in hedonic and utilitarian framework. Consumers who use hedonic applications, such as those of game and entertainment-related, are of young age with low education level. However, consumers who are old and have received higher education level prefer utilitarian application category such as life, information etc. There are disputable arguments over whether the users of SNS are hedonic or utilitarian. In our results, consumers who are younger and those with higher education level prefer using SNS category applications, which is in a middle of utilitarian and hedonic results. Also, applications that are directly related to tangible assets, such as banking, stock and mobile shopping, are only negatively related to experience of purchasing of paid app, meaning that consumers who put weights on tangible assets do not prefer buying paid application. Regarding categories, most correlations among categories are significantly positive. This is because someone who spend more time on mobile devices tends to use more applications. Game and entertainment category shows significant and positive correlation; however, there exists significantly negative correlation between game and information, as well as game and e-commerce categories of applications. Meanwhile, categories of game and SNS as well as game and finance have shown no significant correlations. 
This result shows that mobile application usage behavior is clearly distinguishable: the purposes of using mobile devices are polarized into utilitarian and hedonic. This research supports several arguments that can be explained only with secondary real-world data, not survey data, and offers behavioral explanations of mobile application usage from the consumer's perspective. It also shows substitution/supplementary patterns in consumers' application usage, which in turn explain their usage behavior. However, this research has some limitations. The classification of categories is itself disputable, since it differs among studies, so the results may change depending on the classification. Lastly, although the data were collected at the individual application level, we reduced the observations to the individual level. Further research will address these limitations.

Prediction of infectious diseases using multiple web data and LSTM (다중 웹 데이터와 LSTM을 사용한 전염병 예측)

  • Kim, Yeongha;Kim, Inhwan;Jang, Beakcheol
    • Journal of Internet Computing and Services, v.21 no.5, pp.139-148, 2020
  • Infectious diseases have long plagued mankind, and predicting and preventing them has been a major challenge. For this reason, various studies have been conducted to predict infectious diseases. Most early studies relied on epidemiological data from the Centers for Disease Control and Prevention (CDC), but the CDC data was updated only once a week, which made it difficult to predict the number of disease outbreaks in real time. With the emergence of various Internet media driven by recent developments in IT, studies have begun to predict the occurrence of infectious diseases from web data, and most of the studies we surveyed use a single source of web data. However, forecasting from a single web data source makes it difficult to collect large amounts of training data and to make accurate predictions for recent outbreaks such as COVID-19. We therefore demonstrate through experiments that an LSTM model using multiple web data sources predicts the occurrence of infectious diseases more accurately than models using a single web data source, and we suggest a model suitable for predicting infectious diseases. In the experiment, we predicted the occurrence of malaria and epidemic parotitis (mumps) using single web data models and our proposed model. A total of 104 weeks of news, SNS, and search query data were collected, of which 75 weeks were used as training data and 29 weeks as verification data. When predicting the verification data, our proposed model showed the highest Pearson correlation coefficients, 0.94 and 0.86, and the lowest RMSE values, 0.19 and 0.07.
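The evaluation setup described above (a chronological 75/29-week split scored with RMSE) can be sketched as follows. The weekly series here is synthetic placeholder data and the naive last-value forecast merely stands in for a model; neither comes from the paper, and the function names are hypothetical.

```python
from math import sqrt


def chronological_split(series, n_train):
    """Split a time-ordered series without shuffling: earliest weeks train, latest verify."""
    return series[:n_train], series[n_train:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(actual))


# Synthetic placeholder for 104 weekly case counts (not real data).
weekly_counts = [float(week % 10) for week in range(104)]
train, verification = chronological_split(weekly_counts, n_train=75)

# Naive baseline: repeat the last training value for every verification week.
naive_forecast = [train[-1]] * len(verification)
baseline_error = rmse(verification, naive_forecast)
```

Splitting by time rather than at random keeps future weeks out of the training data, which matters when the goal is forecasting.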