• Title/Summary/Keyword: 트윗 수집 (tweet collection)

Search Result 72, Processing Time 0.027 seconds

Natural Language Processing-based Personalized Twitter Recommendation System (자연어 처리 기반 맞춤형 트윗 추천 시스템)

  • Lee, Hyeon-Chang;Yu, Dong-Pil;Jung, Ga-Bin;Nam, Yong-Wook;Kim, Yong-Hyuk
    • Journal of the Korea Convergence Society / v.9 no.12 / pp.39-45 / 2018
  • Twitter users rely on features such as 'Following' and 'Retweet' to find tweets that interest them. However, on a platform with more than 300 million users, finding tweets of interest is difficult. In this paper, we developed a customized tweet recommendation system to address this problem. First, we gather current trends and collect popular tweets about those trends as candidates worth recommending. Next, to analyze users and recommend customized tweets, both the users' tweets and the collected tweets are categorized. Finally, through a Web service, we recommend tweets that match each user's categories, along with other users whose interests match. As a result, 67.2% of the recommended tweets were appropriate.

A Sentiment Analysis Tool for Korean Twitter (한국어 트위터의 감정 분석 도구)

  • Seo, Hyung-Won;Jeon, Kil-Ho;Choi, Myung-Gil;Nam, Yoo-Rim;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2011.10a / pp.94-97 / 2011
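  • This paper describes a method for automatically analyzing the sentiment contained in Korean Twitter messages (tweets). Tweets collected by the proposed system are classified as positive or negative with respect to a given query. This is generally useful for customers who want to buy a product and for companies that want to collect customers' evaluations of their products. Research on English tweets is already active, but no research on Korean tweets, particularly on sentiment classification, has yet been published. The collected tweets were classified using machine learning (Naive Bayes, Maximum Entropy, and SVM), and, reflecting the characteristics of Korean, experiments were run with the basic unit of feature selection set to two syllables and to three syllables. Whereas existing studies on English report accuracy above 80%, our experiments achieved accuracy of about 60%.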
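
The syllable-based feature units described in this abstract can be illustrated with a minimal sketch. It assumes that one precomposed Hangul character counts as one syllable, that whitespace is dropped before forming n-grams, and that Naive Bayes uses add-one smoothing; the training examples and the function names (`syllable_ngrams`, `train_nb`, `predict`) are hypothetical and are not taken from the paper, which also evaluated Maximum Entropy and SVM.

```python
import math
from collections import Counter, defaultdict


def syllable_ngrams(text, n):
    """Return overlapping n-syllable units (assumption: whitespace is dropped)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


# Hypothetical labeled tweets, for illustration only.
train = [
    ("정말 좋아요", "pos"),
    ("너무 좋다", "pos"),
    ("정말 싫어요", "neg"),
    ("너무 나쁘다", "neg"),
]


def train_nb(examples, n):
    """Count n-gram features per label for a multinomial Naive Bayes model."""
    counts = defaultdict(Counter)
    labels = Counter()
    for text, label in examples:
        labels[label] += 1
        counts[label].update(syllable_ngrams(text, n))
    vocab = {gram for c in counts.values() for gram in c}
    return counts, labels, vocab


def predict(text, model, n):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    counts, labels, vocab = model
    total = sum(labels.values())
    best_label, best_score = None, -math.inf
    for label in labels:
        score = math.log(labels[label] / total)
        denom = sum(counts[label].values()) + len(vocab)
        for gram in syllable_ngrams(text, n):
            if gram in vocab:  # n-grams unseen in training are ignored
                score += math.log((counts[label][gram] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(train, 2)       # two-syllable units; pass 3 for three-syllable units
print(predict("좋아요", model, 2))
```

Switching `n` between 2 and 3 mirrors the two feature-unit settings the abstract compares.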


Hashtag Analysis Scheme for Topic based Tweet Categorization (토픽 기반의 트윗 분류를 위한 해시태그 분석 기법)

  • Kim, Yongsung;Jun, Sanghoon;Rew, Jehyeok;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference / 2014.11a / pp.737-740 / 2014
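  • With the recent surge in SNS users, a vast and highly diverse volume of posts is being generated across many kinds of SNS. Among them, Twitter is used as a very useful tool for delivering and spreading information. Users' tweets appear in diverse forms such as news, music, photos, and travel. Twitter also uses user-defined tags called hashtags, which are an effective means of easily expressing the keywords and main point of a tweet. Despite the large volume of tweets generated recently, little research has been done on classifying them into categories. This paper therefore proposes a scheme that uses hashtags to identify the main point of tweets and to classify large numbers of tweets by topic. First, tweets containing popular hashtags from various categories are collected, and keywords are extracted for each hashtag from the collected tweets. Next, cosine similarity is used to measure the content similarity between hashtags, determining how similar the content of the hashtags within each category is. Finally, when a user's tweet is input, its similarity to every category is compared, and the category with the highest similarity is found and recommended. A prototype is implemented based on the proposed scheme, and its performance is evaluated experimentally.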
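
The final step in this abstract, comparing an input tweet against every category and picking the most similar one, can be sketched as follows. This is only an illustration under stated assumptions: the category keyword lists are hypothetical stand-ins for keywords extracted from hashtagged tweets, raw term counts are used as vector weights, and the helper names `cosine` and `classify` are not from the paper.

```python
import math
from collections import Counter

# Hypothetical keyword lists per category (illustrative data only).
category_keywords = {
    "music": ["album", "concert", "song", "concert", "band"],
    "travel": ["flight", "hotel", "beach", "trip", "hotel"],
    "news": ["election", "government", "report", "policy"],
}


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tweet_tokens):
    """Return the most similar category and the score for every category."""
    tweet_vec = Counter(tweet_tokens)
    scores = {
        category: cosine(tweet_vec, Counter(words))
        for category, words in category_keywords.items()
    }
    return max(scores, key=scores.get), scores


best, scores = classify(["booked", "hotel", "near", "beach"])
print(best, scores)
```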

Dynamic Seed Selection for Twitter Data Collection (트위터 데이터 수집을 위한 동적 시드 선택)

  • Lee, Hyoenchoel;Byun, Changhyun;Kim, Yanggon;Lee, Sang Ho
    • Journal of KIISE:Databases / v.41 no.4 / pp.217-225 / 2014
  • Analysis of social media such as Twitter can yield interesting perspectives on understanding human behavior, detecting hot issues, identifying influential people, or discovering groups and communities. However, it is difficult to gather data relevant to specific topics because of the main characteristics of social media data: the data are large, noisy, and dynamic. This paper proposes a new algorithm that dynamically selects seed nodes to efficiently collect tweets relevant to given topics. The algorithm uses attributes of users to evaluate user influence and dynamically selects the seed nodes during the collection process. We evaluate the proposed algorithm with real tweet data and obtain satisfactory performance results.

Twitter HashTag Recommendation Scheme based on Similar Tweet Analysis (유사 트윗 분석에 기반한 트위터 해시태그 추천기법)

  • Jeon, Mina;Jun, Sanghoon;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.962-963 / 2013
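  • A Twitter hashtag (#) is a user-defined tag used to classify particular keywords or content in tweets by topic and to make search more efficient. Because hashtags are written in diverse forms depending on how users define them, they can actually reduce search efficiency, and users are sometimes unsure which hashtags to add to the tweets they write. To address this problem, this paper proposes a scheme that recommends hashtags suited to a user's tweet. Keywords are extracted from the collected tweets and hashtags, and TF-IDF and cosine similarity are applied to compute tweet similarity, so that hashtags attached to similar tweets are recommended. To validate the proposed scheme, the accuracy of its recommendations was evaluated experimentally.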
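
A rough sketch of how TF-IDF weighting and cosine similarity could drive such a recommendation is given below. The toy corpus, the smoothed IDF formula, and the choice to sum similarity scores per hashtag are all assumptions made for illustration; the abstract does not specify these details, and `tfidf_vectors`, `cosine`, and `recommend` are hypothetical names.

```python
import math
from collections import Counter

# Hypothetical collected tweets: (tokens, hashtags attached to the tweet).
corpus = [
    (["new", "album", "release", "concert"], ["#music"]),
    (["concert", "tickets", "tonight"], ["#music", "#live"]),
    (["election", "debate", "vote"], ["#politics"]),
]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumption: add 1 so terms shared by all documents keep a non-zero weight.
    idf = {term: math.log(n / df[term]) + 1.0 for term in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(tokens, top_k=2):
    """Rank hashtags by the summed similarity of the tweets that carry them."""
    vectors, idf = tfidf_vectors([doc for doc, _ in corpus])
    query = {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}
    scores = Counter()
    for vec, (_, tags) in zip(vectors, corpus):
        similarity = cosine(query, vec)
        for tag in tags:
            scores[tag] += similarity
    return [tag for tag, score in scores.most_common(top_k) if score > 0]


print(recommend(["concert", "tonight"]))
```

Summing per hashtag favors tags that appear on several similar tweets; taking the maximum similarity instead would favor the single closest tweet.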

Characteristics of Interactions between Fan and Celebrities on Twitter (유명인과의 트위터 매개 상호작용 특성 탐색)

  • Hwang, Yoosun
    • The Journal of the Korea Contents Association / v.13 no.8 / pp.72-82 / 2013
  • The present study explored types of Twitter-mediated communication and the emotional responses of Twitter users toward celebrities. Three perspectives were proposed as communication types on Twitter: para-social interaction, information hub, and fandom. Celebrities were classified as entertainers, politicians, specialists, and bloggers, and communication patterns were analyzed for each category. Patterns of emotional response, represented by the use of emoticons and emotional expressions, were also analyzed. The results show that the para-social interaction type was frequently adopted in interactions with politicians and specialists, while the fandom style was salient for entertainers. For power bloggers, users tended to adopt the information hub type of interaction. The use of emoticons and emotional expressions was most frequent in fandom-style communication and in messages to entertainers. Implications are further discussed.

A study on the issue analysis of National Archives of Korea based on SNS(tweet) analysis between 2014~2015 (2014년~2015년 국가기록원 관련 트윗 이슈분석)

  • Seo, Ji-Won;Park, Jun-Hyeong;Oh, Hyo-Jung;Youn, Eunha
    • The Korean Journal of Archival Studies / no.50 / pp.139-175 / 2016
  • This study is a content analysis of the National Archives of Korea as reflected in tweets produced between 2014 and 2015. The study collected all tweets from 2014 and 2015 that used the keyword 'National Archives of Korea'. The contents of the tweets, including their categories and the issues mentioned, were then analyzed. The results were as follows. First, the volume of collected tweets about the National Archives increased over the two years, while their content showed similar types and patterns. Second, the tweets produced by the public reflect current political and social issues more than archival services.

Real-time Spatial Recommendation System based on Sentiment Analysis of Twitter (트위터의 감정 분석을 통한 실시간 장소 추천 시스템)

  • Oh, Pyeonghwa;Hwang, Byung-Yeon
    • The Journal of Society for e-Business Studies / v.21 no.3 / pp.15-28 / 2016
  • This paper proposes a system that recommends the spatial information a user wants by collecting and analyzing tweets around the user's location, using GPS information acquired on a mobile device. The system builds an emotion dictionary and derives a recommendation score for morphologically analyzed tweets, so that it provides recommendations informed by emotion analysis rather than simple information. The system also calculates the distance between the recommended tweets and the user's latitude-longitude coordinates and presents the results in order of proximity. The paper evaluates the emotion analysis in a total of 10 areas with two keywords, 'Restaurants' and 'Performance.' Of the 210 tweets in total, 122 contained positive or negative words. After morphological analysis, the emotion analysis classified 65 tweets as positive or negative, and only 46 of these actually carried a positive or negative meaning. These results show that the system detected tweets containing an emotional element with a recall of 38% and performed emotion analysis with a precision of 71%.
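
The recall and precision figures can be checked from the counts reported in the abstract. The short calculation below assumes that recall is measured against the 122 tweets containing positive or negative words and precision against the 65 tweets the system classified; the variable names are illustrative.

```python
# Counts reported in the abstract.
total_tweets = 210          # all evaluated tweets
with_emotion_words = 122    # tweets containing positive or negative words
classified = 65             # tweets the system labeled positive or negative
correct = 46                # labeled tweets that actually carried that meaning

# Assumption: the 122 tweets with emotion words form the relevant set for recall.
recall = correct / with_emotion_words
precision = correct / classified

print(f"recall = {recall:.1%}, precision = {precision:.1%}")
# prints "recall = 37.7%, precision = 70.8%"
```

Rounded to whole percentages, these values match the 38% and 71% stated in the abstract.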

Tweet Acquisition System by Considering Location Information and Tendency of Twitter User (트위터 사용자의 위치정보와 성향을 고려한 트윗 수집 시스템)

  • Choi, Woosung;Yim, Junyeob;Hwang, Byung-Yeon
    • Spatial Information Research / v.22 no.3 / pp.1-8 / 2014
  • As SNS services such as Twitter and Facebook grow rapidly, research on SNS analysis has attracted increasing attention. In particular, Twitter reacts to social issues in real time, so it is used as a source of experimental data by researchers in social science and information retrieval. However, research on data collection methodology is still lacking. This paper therefore proposes a tweet acquisition system that considers the tendency of Twitter users, targeting location-based events and political and social events. First, the system acquires tweets that include location information and event keywords, and secures user IDs for acquiring tweets about political and social events. We then design an ID-analyzer to classify the tendency of users. In addition, to measure the reliability of the ID-analyzer, the system acquires and analyzes tweets using high-ranked IDs. In the analysis, the top-ranked ID showed 88.8% reliability, the second-ranked ID 76.05%, and the ID-analyzer 77.5%; using a small number of IDs also shortened the collection time.

Twitter Sentiment Analysis for the Recent Trend Extracted from the Newspaper Article (신문기사로부터 추출한 최근동향에 대한 트위터 감성분석)

  • Lee, Gyoung Ho;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.731-738 / 2013
  • We analyze public opinion through sentiment analysis of tweets collected using recent topic keywords extracted from newspaper articles. Newspaper articles collected within a certain period are clustered using the K-means algorithm, and topic keywords for each cluster are extracted using term frequency. A sentiment analyzer trained with a machine learning method classifies the tweets according to their polarity. We assume that tweets collected using these topic keywords deal with the same topics as the newspaper articles when the tweets and the articles are generated around the same time, and we tried to verify the validity of this assumption.
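
The keyword extraction step can be sketched as a simple term-frequency count per cluster. The sketch below assumes the K-means clustering has already been run and only shows how the most frequent terms in each cluster might be taken as topic keywords; the cluster contents and the `topic_keywords` helper are hypothetical.

```python
from collections import Counter

# Hypothetical clusters of tokenized articles (clustering assumed already done).
clusters = {
    0: [["election", "candidate", "vote", "election"], ["vote", "election", "poll"]],
    1: [["typhoon", "rain", "damage"], ["typhoon", "flood", "rain", "typhoon"]],
}


def topic_keywords(articles, top_k=2):
    """Return the top_k most frequent terms across a cluster's articles."""
    term_freq = Counter(token for article in articles for token in article)
    return [term for term, _ in term_freq.most_common(top_k)]


keywords = {cid: topic_keywords(articles) for cid, articles in clusters.items()}
print(keywords)
```

The resulting keywords would then serve as search terms for collecting tweets on the same topics.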