• Title/Summary/Keyword: tweet classification (트윗 분류)

Analysis System for SNS Issues per Country based on Topic Model (토픽 모델 기반의 국가 별 SNS 관심 이슈 분석 시스템)

  • Kim, Seong Hoon;Yoon, Ji Won
    • Journal of KIISE / v.43 no.11 / pp.1201-1209 / 2016
  • As the use of SNS continues to increase, various related studies have been conducted. Because topic models have proven effective for theme extraction, many studies on topic-model-based analysis have been introduced. In this research, we propose an automated system that analyzes the topics of each country and their distribution on Twitter by combining world-map visualization with an issue-matching method. The system has three core modules: 1) collection of tweets and classification by nation, 2) extraction of topics and their distribution by country based on a topic model algorithm, and 3) visualization of topics and their distribution based on Google GeoChart. In experiments with the USA and the UK, we identified the issues of the two nations and how they changed. Based on these results, we analyzed the differences in each nation's position on the ISIS problem.
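
The per-country distribution step in module 2 can be illustrated with a minimal sketch. It assumes each tweet has already been tagged with a country code and a topic id by some upstream classifier and topic model; the function name, the sample data, and the use of plain relative frequencies are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict

def topic_distribution_by_country(tweets):
    """tweets: iterable of (country_code, topic_id) pairs.

    Returns {country: {topic_id: share}}, where the shares for each
    country sum to 1. Topic ids are assumed to come from an upstream
    topic model that is not shown here.
    """
    counts = defaultdict(Counter)
    for country, topic in tweets:
        counts[country][topic] += 1
    return {
        country: {t: n / sum(c.values()) for t, n in c.items()}
        for country, c in counts.items()
    }

# Hypothetical sample: (country, topic id) for seven tweets.
sample = [("US", 0), ("US", 0), ("US", 1),
          ("GB", 1), ("GB", 1), ("GB", 2), ("GB", 1)]
dist = topic_distribution_by_country(sample)
```

A dictionary of this shape could then be passed to a map visualization, one country at a time.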

Semi-supervised learning for sentiment analysis in mass social media (대용량 소셜 미디어 감성분석을 위한 반감독 학습 기법)

  • Hong, Sola;Chung, Yeounoh;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.5 / pp.482-488 / 2014
  • This paper aims to analyze users' emotions automatically by analyzing Twitter, a representative social network service (SNS). Creating sentiment analysis models with machine learning techniques requires sentiment labels that represent positive/negative emotions. However, it is very expensive to obtain sentiment labels for tweets. In this paper, we therefore propose a sentiment analysis model that uses a self-training technique in order to utilize "data without sentiment labels" as well as "data with sentiment labels". In self-training, the labels of "data without sentiment labels" are determined by a model built from "data with sentiment labels", and the model is then updated using both the "data with sentiment labels" and the newly labeled data. This technique improves sentiment analysis performance gradually. However, misclassifications of unlabeled data at an early stage affect model updates throughout the whole learning process, because the labels of unlabeled data never change once they are determined. Thus, the labels of "data without sentiment labels" need to be determined carefully. To achieve high performance with self-training, we propose three policies for updating the "data with sentiment labels" and conduct a comparative analysis. The first policy selects, among the newly labeled data, those whose confidence is higher than a given threshold. The second policy chooses the same number of positive and negative examples from the newly labeled data in order to avoid the imbalanced class learning problem. The third policy chooses no more than a given maximum number of newly labeled data, so that the model is updated gradually rather than with a large amount of data at once. Experiments are conducted on the Stanford data set, which is classified into positive and negative. As a result, the learned model performs better than both models learned using "data with sentiment labels" only and self-training with a regular model update policy.
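
The three update policies can be sketched as a single selection function. This is only an illustration under stated assumptions: the abstract compares the policies, whereas this sketch chains all three in one pass. The function name, the default threshold and cap, the "pos"/"neg" label strings, and the sample predictions are hypothetical.

```python
def select_for_update(predictions, threshold=0.9, max_per_round=4):
    """Pick newly labeled examples to add to the labeled set.

    predictions: list of (text, label, confidence), label in {"pos", "neg"}.
    Chaining all three policies in one pass is an assumption of this sketch.
    """
    # Policy 1: keep only predictions above the confidence threshold.
    confident = [p for p in predictions if p[2] > threshold]
    # Most confident first, so truncation below keeps the safest labels.
    confident.sort(key=lambda p: p[2], reverse=True)
    pos = [p for p in confident if p[1] == "pos"]
    neg = [p for p in confident if p[1] == "neg"]
    # Policy 2: take the same number of positive and negative examples.
    # Policy 3: never add more than max_per_round examples in total.
    per_class = min(len(pos), len(neg), max_per_round // 2)
    return pos[:per_class] + neg[:per_class]

# Hypothetical model outputs on unlabeled tweets.
predictions = [
    ("a", "pos", 0.97), ("b", "pos", 0.95), ("c", "pos", 0.92),
    ("d", "neg", 0.93), ("e", "neg", 0.60), ("f", "pos", 0.85),
]
selected = select_for_update(predictions)
```

In a full self-training loop, the selected examples would be moved into the labeled set, the model retrained, and the remaining unlabeled data scored again.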

Measuring Similarity Between Movies Based on Sentiment of Tweets (트위터를 활용한 감성 기반의 영화 유사도 측정)

  • Kim, Kyoungmin;Kim, Dong-Yun;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.3 / pp.292-297 / 2014
  • As Social Network Services (SNS) have become an integral part of our everyday lives, millions of users can express their opinions and share information regardless of time and place. Hence, sentiment analysis using micro-blogs has been studied in various fields to learn people's opinions on particular topics. Most previous research on movie reviews considers only positive and negative sentiment and uses it to predict movie ratings. Because people feel various emotions beyond positive and negative, the sentiment that people feel while watching a movie needs to be classified in more detail to extract more information than personal preference. We measure the sentiment distribution of each movie from tweets according to Thayer's model. Then, we find similar movies by calculating the similarity between sentiment distributions. Through experiments, we verify that our method using micro-blogs performs better than using only the genre information of movies.
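
Comparing movies by their sentiment distributions might look like the sketch below. The abstract does not name the similarity measure, so cosine similarity is an assumption here, as are the four-bin distributions, the movie names, and the helper names.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def most_similar(name, movies):
    """Return the other movie whose distribution is closest to `name`."""
    return max(
        (m for m in movies if m != name),
        key=lambda m: cosine_similarity(movies[name], movies[m]),
    )

# Hypothetical sentiment distributions over four mood categories.
movies = {
    "movie_a": [0.5, 0.3, 0.1, 0.1],
    "movie_b": [0.4, 0.4, 0.1, 0.1],
    "movie_c": [0.1, 0.1, 0.3, 0.5],
}
closest = most_similar("movie_a", movies)
```

Other measures for probability distributions, such as Jensen-Shannon divergence, could be substituted without changing the surrounding code.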

Spatial Clustering Analysis based on Text Mining of Location-Based Social Media Data (위치기반 소셜 미디어 데이터의 텍스트 마이닝 기반 공간적 클러스터링 분석 연구)

  • Park, Woo Jin;Yu, Ki Yun
    • Journal of Korean Society for Geospatial Information Science / v.23 no.2 / pp.89-96 / 2015
  • Location-based social media data have high potential for use in various areas such as big data and location-based services. In this study, we applied a series of analysis methods to determine how the important keywords in location-based social media are spatially distributed by analyzing text information. For this purpose, we collected geo-tagged tweet data in the Gangnam district of Seoul and its environs for the month of August 2013. From this tweet data, principal keywords were extracted. Among these, keywords in three categories (food, entertainment, and work and study) were selected and classified by category. Spatial clustering was then applied to the tweets containing keywords in each category. The clusters of each category were compared with buildings and benchmark POIs in the same position. The comparison showed that clusters in the food category were highly consistent with large-scale commercial areas. Clusters in the entertainment category corresponded to theaters and sports complexes. Clusters in the work and study category were highly consistent with areas where private institutes and office buildings are concentrated.

Airline Customer Satisfaction Analysis using Social Media Sentiment Evaluation: Full Service Carriers vs. Low Cost Carriers (소셜 미디어 감성평가를 활용한 항공사 고객만족도 분석 - 대형항공사와 저비용항공사 비교연구)

  • Lee, Ju-Yang;Jang, Phil-Sik
    • Journal of Digital Convergence / v.15 no.6 / pp.189-196 / 2017
  • This study investigates customer satisfaction with full service carriers (FSC) and low cost carriers (LCC) using social media sentiment evaluation. A total of 77,591 tweets about two FSCs and six LCCs, posted from 2008 to 2016, were aggregated and classified according to airline choice factors. Three appraisers performed sentiment evaluation to assess customer satisfaction. The results showed that customer satisfaction with LCCs was significantly higher (p<0.001) than with FSCs. Furthermore, overall customer satisfaction with both FSCs and LCCs has followed a consistent downward trend over the last seven years. The results also highlighted low customer satisfaction with respect to booking and flight operation factors, and a steep decline in customer satisfaction across booking, onboard services, and marketing factors for FSCs. The results of this study have practical implications for the airline industry, which can use these quantitative data to improve customer satisfaction with FSCs and LCCs.

A Method of Identifying Ownership of Personal Information exposed in Social Network Service (소셜 네트워크 서비스에 노출된 개인정보의 소유자 식별 방법)

  • Kim, Seok-Hyun;Cho, Jin-Man;Jin, Seung-Hun;Choi, Dae-Seon
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.6 / pp.1103-1110 / 2013
  • This paper proposes a method of identifying ownership of personal information in social network services. Specifically, the proposed method automatically decides whether location information mentioned on Twitter indicates the publisher's area of residence. Identifying ownership of personal information is a necessary part of evaluating the risk of personal information exposed online. The proposed method uses a set of decision rules that consider 13 features, which are lexicographic and syntactic characteristics of tweet sentences. In an experiment using real Twitter data, the proposed method shows better performance (F1-score: 0.876) than conventional document classification models, such as naive Bayes, that use n-grams as a feature set.

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.71-89 / 2016
  • Sentiment analysis is used for identifying emotions or sentiments embedded in user-generated data such as customer reviews from blogs, social network services, and so on. Various research fields such as computer science and business management can take advantage of it to analyze customer-generated opinions. In previous studies, the star rating of a review is regarded as equivalent to the sentiment embedded in the text. However, the rating does not always correspond to the sentiment polarity, and this assumption limits the accuracy of previous studies. To solve this issue, the present study uses a supervised sentiment classification model to measure sentiment polarity more accurately. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, Support Vector Machines (SVM) and a Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the sentiment classifier and are analyzed by statistical correlations between movie reviews and box-office success. Movie reviews are collected along with their star ratings. The dataset used in this study consists of 1,258,538 reviews of 175 films gathered from the Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms a Naive Bayes (NB) classifier, with accuracy about 6% higher than NB. Furthermore, the results indicate positive correlations between the star rating and the audience size, which can be regarded as the box-office success of a movie. The study also shows a mild, positive correlation between the sentiment scores estimated by the classifier and the audience size. To verify the applicability of the sentiment scores, an independent-samples t-test was conducted. For this, the movies were divided into two groups using the average of the sentiment scores. The two groups are significantly different in terms of star ratings.
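
The correlation analysis between sentiment scores and audience size can be illustrated with a small sketch. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption, and the per-movie figures below are invented for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical per-movie values: mean sentiment score and audience size.
sentiment_scores = [0.62, 0.71, 0.55, 0.80, 0.67]
audience_sizes = [1.2e6, 2.5e6, 0.9e6, 2.1e6, 3.0e6]
r = pearson(sentiment_scores, audience_sizes)
```

A value of r near 0 indicates no linear relationship, while values closer to 1 indicate a stronger positive one.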

A Comparative Study on Using SentiWordNet for English Twitter Sentiment Analysis (영어 트위터 감성 분석을 위한 SentiWordNet 활용 기법 비교)

  • Kang, In-Su
    • Journal of the Korean Institute of Intelligent Systems / v.23 no.4 / pp.317-324 / 2013
  • Twitter sentiment analysis classifies a tweet (message) into a positive or negative sentiment class. This study deals with SentiWordNet (SWN)-based Twitter sentiment analysis. SWN is a sentiment dictionary in which each sense of an English word has a positive and a negative sentiment strength. There have been a variety of SWN-based sentiment feature extraction methods, which typically first determine the sentiment orientation (SO) of each term in a document and then decide the SO of the document from those terms' SO values. For example, for the SO of a term, some researchers calculated the maximum or average of the sentiment scores of its senses, and others computed the average of the difference between positive and negative sentiment scores. For the SO of a document, many researchers employ the maximum or average of the terms' SO values. In addition, the above procedure may be applied to the whole set of parts of speech (adjective, adverb, noun, and verb) or to a subset. This work provides a comparative study of SWN-based sentiment feature extraction schemes, with performance evaluation on a well-known Twitter dataset.
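
The two-stage scheme (term SO from sense scores, then document SO from term SOs) can be sketched as follows. The lexicon is a hypothetical stand-in with made-up scores, not real SentiWordNet entries. The sketch shows only one term-level variant (average of positive minus negative) and two document-level variants (average and maximum); all function names are illustrative.

```python
def term_so(senses):
    """Average of (positive - negative) over the senses of one term.

    senses: list of (pos_score, neg_score) pairs, one per sense.
    """
    if not senses:
        return 0.0
    return sum(p - n for p, n in senses) / len(senses)

def document_so(tokens, lexicon, how="avg"):
    """Aggregate term SO values into a document SO by average or maximum."""
    values = [term_so(lexicon[t]) for t in tokens if t in lexicon]
    if not values:
        return 0.0
    return max(values) if how == "max" else sum(values) / len(values)

# Hypothetical toy lexicon (NOT real SentiWordNet scores).
lexicon = {
    "good": [(0.75, 0.0), (0.5, 0.0)],
    "bad": [(0.0, 0.625), (0.0, 0.75)],
    "movie": [(0.0, 0.0)],
}
avg_score = document_so(["good", "movie"], lexicon, how="avg")
max_score = document_so(["good", "movie"], lexicon, how="max")
```

A positive document SO would map to the positive class and a negative one to the negative class; restricting `tokens` to certain parts of speech would give the subset variants the abstract mentions.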

The Study on the Public Typology based on Twitter's Political Opinion Analysis: Focusing on 10.26 by-election of Mayor of Seoul (트위터에서 형성된 정치적 의견 분석을 통한 분화된 공중 연구: 10.26 서울시장 재보궐 선거를 중심으로)

  • Hong, Ju-Hyun;Lee, Chang-Hyun
    • Korean journal of communication and information / v.59 / pp.138-161 / 2012
  • This study is designed to explore the function of Twitter as a campaign platform during an election campaign. To explore this function, the form of tweets, the type of information in tweets, and the way opinions are expressed via Twitter were examined through content analysis. This study finds, first, that netizens express their opinions of candidates without foundation and with emotional reactions. Second, they showed somewhat conflicting reactions depending on the candidate they supported. This study conceptualized various kinds of public as the 'blindly support public' and 'blindly opposition public' in the case of Park's supporters, and the 'rational support public' and 'critical opposition public' in the case of Na's supporters. Third, Park's supporters criticized candidate Na's debate attitude and her appearance blindly and without foundation. Na's supporters criticized Park's debate attitude and his ignorance of the Seoul Metropolitan Government's policy blindly and without foundation. Finally, this study discussed the relationship between the political discourse of netizens on Twitter, according to the candidate they supported, and the results of the election. Park, whose supporters attacked the opposing candidate by blaming her appearance and her debate attitude, won the election. Na did not overcome her negative image. For her, Twitter functioned as a medium that spread negative factors about her. In conclusion, Twitter as a campaign platform during elections plays a key role in discussing candidates. However, netizens need to express their opinions with foundation, and candidates have to consider negative issue management. This study highlights the importance of peripheral factors that have a decisive effect on the results of an election. The results of this study are useful for candidates building political campaign strategies.

Personalized Clothing and Food Recommendation System Based on Emotions and Weather (감정과 날씨에 따른 개인 맞춤형 옷 및 음식 추천 시스템)

  • Ugli, Sadriddinov Ilkhomjon Rovshan;Park, Doo-Soon
    • KIPS Transactions on Software and Data Engineering / v.11 no.11 / pp.447-454 / 2022
  • In the era of the 4th industrial revolution, we are living in a flood of information. It is very difficult and complicated to find the information people need in such an environment, so a recommendation system is essential. Many studies have been conducted on separate recommendation systems for movies, music, food, and clothes. To date, most personalized recommendation systems have recommended clothes, books, or movies by checking individual attributes such as age, genre, region, and gender. Future generations will want to be recommended clothes, books, and movies at once by checking age, genre, region, and gender. In this paper, we propose a recommendation system that recommends personalized clothes and food at once according to the user's emotions and the weather. We obtained user data from the social media platform Twitter and classified it into the user's basic emotions according to Paul Ekman's theory. The basic emotions obtained in this way were converted into colors by applying Hayashi's Quantification Method III, and these colors were presented as recommended clothing colors. The type of clothing is recommended using weather information from the visualcrossing.com API. In addition, various foods are recommended according to comfort-food content matched to emotions.