• Title/Summary/Keyword: tweet analysis (트윗 분석)

Search Result 128, Processing Time 0.028 seconds

A Generation and Matching Method of Normal-Transient Dictionary for Realtime Topic Detection (실시간 이슈 탐지를 위한 일반-급상승 단어사전 생성 및 매칭 기법)

  • Choi, Bongjun;Lee, Hanjoo;Yong, Wooseok;Lee, Wonsuk
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.5 / pp.7-18 / 2017
  • The number of SNS users has grown rapidly with the development of the smart device industry, and the amount of data they generate is increasing exponentially. On Twitter, user-generated text is an important research subject because it reflects events, accidents, product reputations, and brand images. Twitter has become a channel through which users receive and exchange information, and one of its important characteristics is its real-time nature. Events such as earthquakes, floods, and suicides need to be analyzed quickly so that they can be responded to immediately. Analyzing an event requires collecting the tweets related to it, but it is difficult to find all related tweets using ordinary keywords alone. To address this problem, this paper proposes a method for generating and matching normal and transient dictionaries for real-time topic detection. A normal dictionary consists of general keywords related to an event (for the event "suicide": death-loop, death, die, hang oneself, etc.), whereas a transient dictionary consists of transient keywords related to the event (for the event "suicide": names and information of celebrities, information on social issues). Experimental results show that matching with the two dictionaries finds more event-related tweets than a simple keyword search.
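
The two-dictionary matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word lists, the `tokenize` helper, and the `match_tweets` function are all hypothetical, and the transient entry is a placeholder for whatever name happens to be trending.

```python
from typing import Iterable, List, Set

# Hypothetical dictionaries for the event "suicide"; the paper's real lists are not given here.
NORMAL_DICT: Set[str] = {"death", "die"}
TRANSIENT_DICT: Set[str] = {"celebrity_x"}  # placeholder for a currently trending name


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation from each token."""
    tokens = {tok.strip(".,!?\"'()#@").lower() for tok in text.split()}
    tokens.discard("")
    return tokens


def match_tweets(tweets: Iterable[str], normal: Set[str], transient: Set[str]) -> List[str]:
    """Return the tweets that contain at least one keyword from either dictionary."""
    keywords = normal | transient
    return [t for t in tweets if tokenize(t) & keywords]


tweets = [
    "Sad news about celebrity_x today",
    "He did not die in vain",
    "Nice weather this morning",
]
with_both = match_tweets(tweets, NORMAL_DICT, TRANSIENT_DICT)
normal_only = match_tweets(tweets, NORMAL_DICT, set())
```

In this toy input the first tweet is found only when the transient dictionary is included, which mirrors the abstract's claim that two dictionaries retrieve more related tweets than general keywords alone.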

Real-time Category Trend Extraction Scheme based on Twitter Analysis (트위터 분석을 이용한 카테고리별 실시간 트렌드 추출 기법)

  • Na, ByeongJin;Kim, YongSung;Hwang, EenJun
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.1581-1584 / 2015
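  • Research on analyzing social network service data in real time to find meaningful information has recently been active. In particular, as many users of smart devices such as smartphones post events as they happen on social networks and share them with one another, topics of public interest tend to spread very quickly. Building on this characteristic of SNS, this paper proposes a scheme that analyzes tweets on Twitter to classify topics from various fields by category, extracts the trend of each category, and visualizes it in real time. To this end, tweets are classified by field using an SVM classification algorithm and Twitter-LDA, and representative keywords that make up each trend are selected and used to extract real-time trends. To evaluate the performance of the proposed scheme, the reliability of the classification feature selection is measured.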

Modeling Twitter Follower's Behavior Analysis (트위터에서 팔로워의 행태분석 모델)

  • Jeong, Kwang-Yong;Seol, Jae-Wook;Lee, Kyung-Soon
    • Proceedings of the Korea Information Processing Society Conference / 2012.11a / pp.604-607 / 2012
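  • Twitter, a social network service, lets users form relationships with one another by following. Twitter users have many kinds of followers. A follower may follow a user out of favor toward that user, follow blindly, or follow in order to observe the user's actions and posts while holding a negative opinion. This paper proposes a model that analyzes follower behavior, that is, the purpose for which followers follow a given user. The model extracts the influential followers of a target user and classifies them as supporters, neutral, or non-supporters through sentiment analysis of each follower's retweet information, profile, and latest tweets. To validate the proposed method, an experiment was conducted on 30,000 followers randomly sampled from the followers of five politicians and journalists in tweet data. The results show that extracting supporting followers by way of influential-user extraction is effective.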

Monitoring Mood Trends of Twitter Users using Multi-modal Analysis method of Texts and Images (텍스트 및 영상의 멀티모달분석을 이용한 트위터 사용자의 감성 흐름 모니터링 기술)

  • Kim, Eun Yi;Ko, Eunjeong
    • Journal of the Korea Convergence Society / v.9 no.1 / pp.419-431 / 2018
  • In this paper, we propose a novel method for monitoring the mood trends of Twitter users by analyzing their daily tweets over a long period. To understand the tweets more accurately, we analyze all types of content in them, i.e., texts, emoticons, and images, and thus develop a multimodal sentiment analysis method. In the proposed method, two single-modal analyses are first performed to extract the users' moods hidden in texts and images: a lexicon-based and learning-based text classifier and a learning-based image classifier. The moods extracted by the respective analyses are then combined into a tweet mood and aggregated into a daily mood. As a result, the proposed method generates a daily mood flow graph for each user, which allows the user's mood trend to be monitored more intuitively. For evaluation, we perform two sets of experiments. First, we collect a data set of 40,447 items and evaluate our method by comparing it with state-of-the-art techniques. The experiments demonstrate that the proposed multimodal analysis method outperforms the other baselines and our own methods using only texts or only images. Furthermore, to evaluate the potential of the proposed method for monitoring users' mood trends, we tested it with 40 depressive users and 40 normal users. The results show that the proposed method can be used effectively to find depressed users.
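
The "combine into a tweet mood, then aggregate into a daily mood" step can be sketched as follows. The abstract does not say how the two modalities are fused, so this sketch assumes a simple weighted average of scores in [-1, 1]; the function names, the `text_weight` parameter, and the sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Optional, Tuple


def tweet_mood(text_score: float, image_score: Optional[float] = None,
               text_weight: float = 0.5) -> float:
    """Fuse text and image mood scores; fall back to text alone if there is no image."""
    if image_score is None:
        return text_score
    return text_weight * text_score + (1.0 - text_weight) * image_score


def daily_mood(records: Iterable[Tuple[str, float, Optional[float]]]) -> Dict[str, float]:
    """Average the fused tweet moods per day.

    Each record is (date string, text score, image score or None).
    """
    by_day = defaultdict(list)
    for day, text_score, image_score in records:
        by_day[day].append(tweet_mood(text_score, image_score))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


records = [
    ("2018-01-01", 0.5, -0.5),   # text positive, image negative
    ("2018-01-01", 1.0, None),   # text-only tweet
    ("2018-01-02", -1.0, -1.0),
]
flow = daily_mood(records)
```

The resulting mapping from date to mean mood is the kind of series a daily mood flow graph would plot; a learned fusion model could replace the fixed weight without changing the aggregation step.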

A Study on the Improvement and Analysis of SNS Operation Status on Disaster Information in Domestic and Foreign Public Institution (국내·외 기관의 재난정보관련 SNS 운용현황 및 개선방안에 관한 연구)

  • Doo, Hyo-Chul;Park, Jun-Hyeong;Kim, Hye-Young;Oh, Hyo-Jung;Kim, Yong
    • Journal of the Korean BIBLIA Society for library and Information Science / v.28 no.2 / pp.57-78 / 2017
  • SNS are a useful tool for delivering information quickly in an emergency, given their speed and reach. In particular, during a disaster or an accident, SNS can offer on-site, accurate, and detailed updates on essential information such as the safety of victims and the development of the situation, serving as a valuable complement to conventional media. This study performs a comparative analysis of how social media are currently used by emergency management authorities in South Korea and other countries and, based on the results, proposes more effective ways to exploit SNS and improve the efficiency of disaster management. To this end, the study collected tweet information from various sources, including FEMA of the U.S., the FDMA and the Central Disaster Council of Japan, and the MPSS of Korea. The collected tweets were analyzed by feedback, time series, and information type. The feedback analysis quantifies the amount of monthly user feedback in order to assess user satisfaction with the tweet information. The time series analysis identifies the number of tweets, the feedback index, and keywords by country over a certain period, examining why certain messages showed high feedback indices and what kind of content the authorities should offer. Finally, the information type analysis reviews the types of information contained in the tweets that drew users' attention, in order to identify the types of information the authorities should deliver to users. Based on these analyses, the study proposes ways to improve the MPSS's use of Twitter.

Analysis of the Time-dependent Relation between TV Ratings and the Content of Microblogs (TV 시청률과 마이크로블로그 내용어와의 시간대별 관계 분석)

  • Choeh, Joon Yeon;Baek, Haedeuk;Choi, Jinho
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.163-176 / 2014
  • Social media is becoming the platform for users to communicate their activities, status, emotions, and experiences to other people. In recent years, microblogs such as Twitter have gained popularity because of their ease of use, speed, and reach. Compared to a conventional web blog, a microblog lowers users' effort and investment in content generation by encouraging shorter posts. There has been a lot of research into capturing social phenomena and analyzing the chatter of microblogs. However, measuring television ratings has been given little attention so far. Currently, the most common method of measuring TV ratings uses an electronic metering device installed in a small number of sampled households. Microblogs allow users to post short messages, share daily updates, and conveniently keep in touch. In a similar way, microblog users interact with each other while watching television or movies, or visiting a new place. For measuring TV ratings, some features are significant during certain hours of the day or days of the week, whereas the same features are meaningless during other time periods. Thus, the importance of features can change during the day, and a model that captures this time-sensitive relevance is required to estimate TV ratings. Modeling the time-related characteristics of features should therefore be key when measuring TV ratings through microblogs. We show that capturing the time dependency of features is vital for improving the accuracy of TV ratings measurement. To explore the relationship between the content of microblogs and TV ratings, we collected Twitter data using the Get Search component of the Twitter REST API from January 2013 to October 2013. There are about 300 thousand posts in our experimental data set. After excluding data such as advertising or promoted tweets, we selected 149 thousand tweets for analysis.
The number of tweets reaches its maximum on the broadcasting day and increases rapidly around the broadcasting time. This result stems from the characteristics of the public channel, which broadcasts the program at a predetermined time. From our analysis, we find that count-based features such as the number of tweets or retweets have a low correlation with TV ratings. This implies that a simple tweet rate does not reflect satisfaction with or response to the TV programs. Content-based features extracted from the content of tweets have a relatively high correlation with TV ratings. Further, some emoticons and newly coined words that are not tagged in the morpheme extraction process have a strong relationship with TV ratings. We find a time dependency in the correlation of features before and after the broadcasting time. Since the TV program is broadcast regularly at a predetermined time, users post tweets expressing their expectations for the program or their disappointment at not being able to watch it. The features that are highly correlated before the broadcast differ from those after it. This result shows that the relevance of words to TV programs can change according to the time of the tweets. Among the 336 words that fulfill the minimum requirements for candidate features, 145 words have their highest correlation before the broadcasting time, whereas 68 words reach their highest correlation after broadcasting. Interestingly, some words that express the impossibility of watching the program show high relevance despite carrying a negative meaning. Understanding the time dependency of features can help improve the accuracy of TV ratings measurement. This research provides a basis for estimating the response to or satisfaction with broadcast programs using the time dependency of words in Twitter chatter.
More research is needed to refine the methodology for predicting or measuring TV ratings.
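
To make the before/after comparison concrete, the sketch below computes a Pearson correlation between a word's per-episode frequency and the episode ratings, separately for tweets posted before and after the broadcast. The abstract does not state which correlation measure the authors used, so Pearson is an assumption here, and the function names and sample numbers are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_window(before: Sequence[float], after: Sequence[float],
                ratings: Sequence[float]) -> Tuple[str, float]:
    """Report the time window in which the word correlates most strongly with ratings."""
    r_before = pearson(before, ratings)
    r_after = pearson(after, ratings)
    if abs(r_before) >= abs(r_after):
        return ("before", r_before)
    return ("after", r_after)


# Hypothetical counts of one word per episode, and that episode's rating.
ratings = [5.0, 6.0, 7.0, 8.0]
before_counts = [1, 2, 3, 4]
after_counts = [4, 1, 3, 2]
window, r = best_window(before_counts, after_counts, ratings)
```

Running this per candidate word and tallying the winning window would give counts comparable in kind to the 145 "before" and 68 "after" words reported above.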

The Study on the Activation of Public Library Services Utilizing Twitter (트위터를 활용한 공공도서관 서비스 활성화 방안 연구)

  • Oh, Eui-Kyung
    • Journal of Information Management / v.43 no.2 / pp.133-150 / 2012
  • This study examines how public library services can be activated using Twitter. A total of 1,373 tweets from the Twitter accounts of the top five American public libraries were collected, analyzed by content type, and examined for their applicability to public library services. Based on the results, the study suggests that public library services can be activated by auto-tweeting information from the library home page, retweeting timely information, generating hashtags, using diverse social media, actively retweeting and replying, and utilizing Twitter programs such as twit-bot. Finally, the study proposes that evaluations of Twitter services, such as satisfaction surveys, should be carried out.

A Method for Detecting Event-location using Relevant Words Clustering in Tweet (트위터에서의 연관어 군집화를 이용한 이벤트 지역 탐지 기법)

  • Ha, Hyunsoo;Woo, Seungmin;Yim, Junyeob;Hwang, Byung-Yeon
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.680-682 / 2015
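  • With the recent spread of smartphones, the number of social network service users has surged. Among these services, Twitter can be used as a tool for detecting real-world events because information propagates and spreads quickly on it. If each Twitter user is treated as a sensor and the tweet text they write is analyzed, Twitter can therefore serve as an event detection tool. Related studies use GPS coordinates to track where an event occurred, but this is a clear limitation given that Twitter users are reluctant to disclose their location information. This paper therefore presents a method that tracks location information from tweet text without using the location information provided by Twitter. The truth of an event is determined by considering the relationships between keywords in tweet text, and experiments showing faster detection than existing media demonstrate the need for the proposed system.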
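
As a rough illustration of inferring an event location from tweet text rather than GPS coordinates, the sketch below counts place names that co-occur with an event keyword. This is a simplified stand-in, not the relevant-word clustering the paper describes; the gazetteer, the `min_mentions` threshold, and the function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical gazetteer of place names to look for in tweet text.
GAZETTEER = {"seoul", "busan", "incheon"}


def detect_event_location(tweets: Iterable[str], event_keyword: str,
                          min_mentions: int = 2) -> Optional[str]:
    """Return the place most often mentioned alongside the event keyword, if any.

    A place must co-occur with the keyword in at least `min_mentions` tweets,
    a crude guard against treating a single stray mention as a real event.
    """
    counts = Counter()
    keyword = event_keyword.lower()
    for text in tweets:
        tokens = {tok.strip(".,!?#@").lower() for tok in text.split()}
        if keyword in tokens:
            counts.update(tokens & GAZETTEER)
    if not counts:
        return None
    place, mentions = counts.most_common(1)[0]
    return place if mentions >= min_mentions else None


tweets = [
    "Fire in Busan right now",
    "Huge fire near Busan station",
    "Seoul is sunny",
    "fire alarm test in Seoul",
]
location = detect_event_location(tweets, "fire")
```

The threshold plays the role of the abstract's "truth of an event" check in the simplest possible form; the paper's approach considers relationships between keywords rather than raw counts.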

Differences in Sentiment on SNS: Comparison among Six Languages (SNS에서의 언어 간 감성 차이 연구: 6개 언어를 중심으로)

  • Kim, Hyung-Ho;Jang, Phil-Sik
    • Journal of Digital Convergence / v.14 no.3 / pp.165-170 / 2016
  • The purpose of this study was to explore differences in sentiment on social networking sites among six languages (English, German, Russian, Spanish, Turkish, and Dutch). A total of 204 million tweets were collected using the Streaming API. The subjective/objective ratio, sentiment strength, positive/negative ratio, number of retweets, and boundary impermeability were analyzed with SentiStrength to estimate trends in emotional expression on Twitter. The results showed that the subjective/objective ratio and the positive/negative ratio of tweets differed significantly by language (p<0.001). Language also had significant effects on sentiment strength, boundary impermeability, and the number of retweets (p<0.001). The results indicate that cross-cultural and language differences should be taken into account in sentiment analysis on SNS.
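
The two ratios named in this abstract can be computed from per-tweet sentiment scores roughly as follows. The sketch assumes SentiStrength-style dual scores (positive strength from 1 to 5, negative strength from -1 to -5) and assumes that a tweet scoring (1, -1) counts as objective; the abstract does not give the authors' exact definitions, and the function name and sample scores are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def sentiment_ratios(scores: Iterable[Tuple[int, int]]) -> Dict[str, float]:
    """Compute subjective/objective and positive/negative ratios.

    Each score is (pos, neg) with pos in 1..5 and neg in -5..-1.
    Tweets with equal positive and negative strength count as subjective
    but as neither positive nor negative.
    """
    subjective = objective = positive = negative = 0
    for pos, neg in scores:
        if pos == 1 and neg == -1:
            objective += 1
            continue
        subjective += 1
        if pos > -neg:
            positive += 1
        elif -neg > pos:
            negative += 1
    return {
        "subjective_objective": subjective / objective if objective else float("inf"),
        "positive_negative": positive / negative if negative else float("inf"),
    }


# Hypothetical scores for five tweets in one language.
scores = [(1, -1), (3, -1), (1, -4), (2, -2), (4, -2)]
ratios = sentiment_ratios(scores)
```

Computing these ratios per language and comparing them is the kind of analysis the reported significance tests (p<0.001) would then be run on.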

Tweets analysis using a Dynamic Topic Modeling : Focusing on the 2019 Koreas-US DMZ Summit (트윗의 타임 시퀀스를 활용한 DTM 분석 : 2019 남북미정상회동 이벤트를 중심으로)

  • Ko, EunJi;Choi, SunYoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.308-313 / 2021
  • In this study, tweets about the 2019 Koreas-US DMZ Summit were collected together with their time sequence and analyzed with a sequential topic modeling method, Dynamic Topic Modeling (DTM). In microblogging services such as Twitter, unstructured data mixing news and opinions about a single event is generated simultaneously and on a large scale, and information and reactions are produced in the same message format. Therefore, to grasp a topic trend, the contextual meaning can be found only by performing pattern analysis that reflects the characteristics of sequential data. After obtaining topic coherence scores and evaluating Latent Dirichlet Allocation (LDA), the DTM was computed; 30 topics related to news reports and opinions were derived, and the occurrence probability and keywords of each topic evolved dynamically. In conclusion, the study found that DTM is a suitable model for analyzing the trend of integrated topics in a specific event over time.