Search | Korea Science

Spatial Distribution Patterns of Twitter Data with Topic Modeling (토픽 모델링을 이용한 트위터 데이터의 공간 분포 패턴 분석)

Woo, Hyun Jee;Kim, Young Hoon
- Journal of the Korean association of regional geographers / v.23 no.2 / pp.376-387 / 2017
This paper analyzes the geographical characteristics of Twitter data and presents its analytical potential for social network analysis in geography. First, the paper proposes a topic modeling-based methodology for identifying the geographical characteristics of tweets, comprising an analysis flow for Twitter data sets, tweet collection and conversion, textual pre-processing and structural analysis, topic discovery, and interpretation of tweet topics. Tweets referencing GPS coordinates (geotweets) were extracted from the sampled Twitter data sets because they contain the place where the tweet was created. The paper identifies a correlation between some specific topics and local places in Jeju. This correlation is closely associated with certain place names and local sites on Jeju Island. We assume that tweeters intend to record the places where they tweet and, in some cases, to share them with and have them retweeted by other tweeters. A surface density map shows tweet hotspots around specific places and sites such as Jeju airport, sightseeing sites, and local places on Jeju Island. The hotspots show patterns similar to those of the floating population of Jeju, especially people in their thirties. In addition, a topic modeling algorithm is applied for geographical topic discovery and comparison of the spatial patterns of tweets. Finally, this empirical analysis shows that Twitter data, as social network data, carry geographical significance, and that the topic modeling approach is useful for analyzing textual features that reflect geographical characteristics in large tweet data sets.

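A Topic Model Reflecting User Behavior and Temporal Analysis for Extracting Disaster Events from Social Data (소셜 데이터에서 재난 사건 추출을 위한 사용자 행동 및 시간 분석을 반영한 토픽 모델)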

;Lee, Gyeong-Sun
- Information and Communications Magazine / v.34 no.6 / pp.43-50 / 2017
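This article reviews a topic modeling technique that reflects user behavior analysis and temporal analysis on social networks, as a method for extracting disaster events that threaten public safety and become social issues from social big data. By analyzing user behavior such as the number of posts, retweet responses, activity cycle, number of followers, and number of followings, active and reliable users are identified, so that spam and advertising tweets are excluded and tweets written by users who are highly reliable on an issue are weighted more heavily. In addition, to detect the emergence of new issues in Twitter data, changes in the distribution of keyword frequencies over time are measured, and event terms for the key issues are extracted through sentiment expression analysis of issue tweets. Because many tweets on several issues can be generated on the same date in social big data, tweets need to be grouped by topic. The article therefore reviews a social event detection method that improves the widely used LDA topic modeling technique so that it reflects temporally important event terms, derived from analysis of temporal and user characteristics, and gives more weight to tweets written by users who are reliable on the issue.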
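
The abstract above mentions measuring changes in keyword frequency over time to detect newly emerging issues, without giving the exact measure. The following is only a minimal sketch of that general idea, not the article's method: the function name `bursty_terms`, the `ratio` and `min_count` parameters, and the "ratio over the historical average" rule are all illustrative assumptions.

```python
from collections import Counter


def bursty_terms(buckets, ratio=3.0, min_count=3):
    """Flag terms whose count jumps in the most recent time window.

    buckets: non-empty list of token lists, one per time window, oldest first.
    A term is flagged when its count in the last window is at least
    `min_count` and at least `ratio` times its average count per earlier
    window (the average is floored at 1.0 so unseen terms need a real spike).
    Both thresholds are hypothetical, not taken from the article.
    """
    *history, current = buckets
    now = Counter(current)

    past = Counter()
    for tokens in history:
        past.update(tokens)
    windows = max(len(history), 1)

    flagged = []
    for term, count in now.items():
        baseline = past[term] / windows
        if count >= min_count and count >= ratio * max(baseline, 1.0):
            flagged.append(term)
    return sorted(flagged)


if __name__ == "__main__":
    hourly = [
        ["rain", "bus"],
        ["rain", "bus", "rain"],
        ["fire"] * 4 + ["rain", "bus"],
    ]
    print(bursty_terms(hourly))
```

In a fuller pipeline the flagged terms could then be passed to a topic model as emphasized vocabulary; how the article weights them inside LDA is not stated in the abstract.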

Development of Restaurant Recommendation System Using K-Pop Hashtag Crawling (K-POP 연관 해시태그 크롤링을 이용한 맛집 추천 시스템 개발)

Kim, Hwa-Seon;Lee, Chae-Yeon;Cho, Seo-Yun;Nah, Jeong-Eun
- Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.878-880 / 2022
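Even during the COVID-19 pandemic, the volume of K-POP-related tweets on Twitter worldwide exceeded 7.8 billion and continues to grow every year. K-POP fans on Twitter write tweets containing artist-related hashtags to share and produce information in real time within the same fandom. These restaurant tweets are used by K-POP fans to obtain trustworthy restaurant information within Twitter. However, to obtain the information, fans must search several restaurant hashtags and find tweets with many retweets themselves. Existing restaurant recommendation systems have a service-provider-centered structure: the provider delivers information one-way, or user reviews are updated at long intervals. In this paper, tweets containing K-POP restaurant hashtags were crawled using the Twitter API and Tweepy. The collected data were filtered based on the numbers of likes and retweets to extract restaurant-related tweets, excluding bot users and advertising accounts. Finally, a website was built that displays the information from the extracted tweets as markers. K-POP fans can visit the website to check restaurant locations without searching restaurant hashtags one by one. The website user's location is shown on the map, which makes it convenient to find nearby restaurants. In this paper, restaurant locations were limited to Seodaemun-gu.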
https://doi.org/10.3745/PKIPS.y2022m11a.878
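
The filtering step described above (keeping tweets based on like and retweet counts) might look roughly like the sketch below. The abstract does not give thresholds or say how the two counts are combined, so the `Tweet` record, the function name `keep_engaged`, the default thresholds, and the "both counts must pass" rule are all assumptions made for illustration. The crawling step with Tweepy is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    likes: int
    retweets: int


def keep_engaged(tweets, min_likes=5, min_retweets=5):
    """Keep tweets whose like AND retweet counts reach the given thresholds.

    The thresholds are hypothetical; the paper only says that filtering
    used like and retweet counts to drop bot and advertising tweets.
    """
    return [
        t for t in tweets
        if t.likes >= min_likes and t.retweets >= min_retweets
    ]


if __name__ == "__main__":
    sample = [
        Tweet("great tteokbokki near the station", likes=40, retweets=12),
        Tweet("BUY NOW!!! promo link", likes=0, retweets=0),
        Tweet("cafe with fan event", likes=9, retweets=2),
    ]
    for tweet in keep_engaged(sample):
        print(tweet.text)
```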

Relationship Between Tweet Frequency and User Velocity on Twitter (트위터에서 트윗 주기와 사용자 속도 사이 관계)

Jeon, So-Young;Lee, Al-Chan;Seo, Go-Eun;Shin, Won-Yong
- Journal of the Korea Institute of Information and Communication Engineering / v.19 no.6 / pp.1380-1386 / 2015
Recently, the importance of users' geographic location information has been highlighted by the rapid growth of online social network services. In this paper, using geo-tagged tweets that provide high-precision location information, we first identify both the exact location of Twitter users and the timestamp at which each tweet was sent. We then analyze the relationship between tweet frequency and average user velocity. Specifically, we introduce a tweet-frequency computing algorithm and show analysis results by country and by city. The main result is that tweet frequency as a function of user velocity follows a power-law distribution (i.e., a Zipf distribution or a Pareto distribution). In addition, a comparison between the United States and Japan shows that the exponent of the distribution in Japan is smaller than that in the United States.
https://doi.org/10.6109/jkiice.2015.19.6.1380
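
As a rough illustration of the quantities named in the abstract above, the sketch below estimates a user's velocity from consecutive geo-tagged tweets and fits a power-law exponent by least squares on log-log axes. It is not the paper's tweet-frequency algorithm, which the abstract does not spell out; the haversine distance, the `(unix_seconds, lat, lon)` input format, and the simple log-log regression are all assumptions chosen for brevity.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def velocities_kmh(tweets):
    """Speeds between consecutive tweets of one user.

    tweets: list of (unix_seconds, lat, lon) tuples (hypothetical format).
    Pairs with identical timestamps are skipped.
    """
    ordered = sorted(tweets)
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(ordered, ordered[1:]):
        hours = (t1 - t0) / 3600.0
        if hours > 0:
            speeds.append(haversine_km(la0, lo0, la1, lo1) / hours)
    return speeds


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x).

    For y = c * x**(-a) the slope equals -a. All inputs must be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


if __name__ == "__main__":
    print(velocities_kmh([(0, 0.0, 0.0), (3600, 0.0, 1.0)]))
    print(loglog_slope([1, 2, 4, 8], [1, 0.25, 0.0625, 0.015625]))
```

A log-log regression is a quick check rather than a rigorous estimator of a power-law exponent; maximum-likelihood fitting is usually preferred when the exponent itself is the result being reported.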

Characteristics of Interactions between Fan and Celebrities on Twitter (유명인과의 트위터 매개 상호작용 특성 탐색)

Hwang, Yoosun
- The Journal of the Korea Contents Association / v.13 no.8 / pp.72-82 / 2013
The present study explored types of Twitter-mediated communication and the emotional responses of Twitter users toward celebrities. Three perspectives were proposed as communication types on Twitter: para-social interaction, information hub, and fandom. Celebrities were classified as entertainers, politicians, specialists, or bloggers, and communication patterns were analyzed for each category. Patterns of emotional response, represented by the use of emoticons and emotional expressions, were also analyzed. The results show that the para-social interaction type was frequently adopted in interactions with politicians and specialists, while the fandom style was salient for entertainers. For power bloggers, users tended to adopt the information hub type of interaction. The use of emoticons and emotional expressions was most frequent in fandom-style communication and in messages to entertainers. Implications were further discussed.
https://doi.org/10.5392/JKCA.2013.13.08.072

Comparative Study of Various Machine-learning Features for Tweets Sentiment Classification (트윗 감정 분류를 위한 다양한 기계학습 자질에 대한 비교 연구)

Hong, Cho-Hee;Kim, Hark-Soo
- The Journal of the Korea Contents Association / v.12 no.12 / pp.471-478 / 2012
Various studies on sentiment classification of documents have been performed, and they have recently been applied to Twitter sentiment classification. However, they did not perform well because they did not consider characteristics of tweets such as tweet structure, emoticons, spelling errors, and newly coined words. In this paper, we perform experiments on various input features (emoticon polarity, retweet polarity, author polarity, and replacement words) that affect a Twitter sentiment classification model based on machine-learning techniques. In experiments with a sentiment classification model based on a support vector machine, we found that the emoticon polarity features and the author polarity features can help improve the performance of a Twitter sentiment classification model. Contrary to our expectations, we also found that the retweet polarity features and the replacement word features do not affect the performance of the model.
https://doi.org/10.5392/JKCA.2012.12.12.471

A study on the issue analysis of National Archives of Korea based on SNS(tweet) analysis between 2014~2015 (2014년~2015년 국가기록원 관련 트윗 이슈분석)

Seo, Ji-Won;Park, Jun-Hyeong;Oh, Hyo-Jung;Youn, Eunha
- The Korean Journal of Archival Studies / no.50 / pp.139-175 / 2016
This study is a content analysis of the National Archives of Korea as reflected in tweets produced between 2014 and 2015. The study collected all tweets that used the keyword 'National Archives of Korea' in 2014 and 2015. The contents of the tweets, including their categories and the issues mentioned, were then analyzed. The results were as follows. First, the collected archives of the National Archives increased in volume over the two years, with similar types and patterns in their content. Second, the tweets produced by the public reflect current political and social issues more than archival services.
https://doi.org/10.20923/kjas.2016.50.139

Personalized Tweet Recommendation based on Ego-Network (이고-네트워크에 기반한 개인화된 트윗 추천 시스템)

Song, Sang-Chul;Hong, Jiwon;Kim, Sang-Wook
- Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.577-579 / 2016
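With the growing number of Twitter users, information overload, in which the number of new tweets posted to a user's timeline each day increases sharply, has long been an important issue. This paper proposes a graph-based tweet recommendation system that uses a classification system trained on ego-network information to predict, for each ego user, the recommendation method best suited to that user, and on that basis prioritizes the tweets the user is likely to prefer. Experiments confirmed that recommendation accuracy improved by up to 11.5% compared with using a single recommendation method.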
https://doi.org/10.3745/PKIPS.y2016m04a.577

A Sentiment Analysis Tool for Korean Twitter (한국어 트위터의 감정 분석 도구)

Seo, Hyung-Won;Jeon, Kil-Ho;Choi, Myung-Gil;Nam, Yoo-Rim;Kim, Jae-Hoon
- Annual Conference on Human and Language Technology / 2011.10a / pp.94-97 / 2011
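This paper describes a method for automatically analyzing the sentiment contained in Korean Twitter messages (tweets). Tweets collected by the proposed system are classified as positive or negative with respect to a query. This is generally useful for customers who want to purchase a product and for companies that want to collect customers' evaluations of their products. Research on English tweets is already active, but no research on Korean tweets, especially on sentiment classification, has yet been published. The collected tweets were classified using machine learning (Naive Bayes, Maximum Entropy, and SVM), and, in view of the characteristics of Korean, experiments were run with the basic unit of feature selection set to two syllables and to three syllables. Whereas previous studies on English achieve an accuracy of over 80%, this experiment obtained an accuracy of about 60%.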
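
To make the "two-syllable and three-syllable" feature units above concrete, here is a minimal sketch of syllable n-gram extraction feeding a small multinomial Naive Bayes classifier. It is an illustration under stated assumptions, not the authors' system: the class name `SyllableNaiveBayes`, the add-one smoothing, the whitespace stripping, and the toy training sentences are all hypothetical. It relies on the fact that each precomposed Hangul character is one syllable, so the text is NFC-normalized first.

```python
import math
import unicodedata
from collections import Counter, defaultdict


def syllable_ngrams(text, n):
    """Return overlapping n-syllable units, ignoring whitespace.

    After NFC normalization each Hangul syllable is a single character,
    so character n-grams coincide with syllable n-grams.
    """
    chars = [c for c in unicodedata.normalize("NFC", text) if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


class SyllableNaiveBayes:
    """Multinomial Naive Bayes over syllable n-grams with add-one smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.gram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = syllable_ngrams(text, self.n)
            self.gram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, docs in self.doc_counts.items():
            score = math.log(docs / total_docs)
            total = sum(self.gram_counts[label].values())
            for gram in syllable_ngrams(text, self.n):
                count = self.gram_counts[label][gram]
                score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = SyllableNaiveBayes(n=2).fit(
        ["정말 좋아요", "너무 좋아요", "정말 싫어요", "너무 싫어요"],
        ["pos", "pos", "neg", "neg"],
    )
    print(syllable_ngrams("좋아요", 2))
    print(model.predict("좋아요"))
```

Switching `n` from 2 to 3 reproduces the other feature unit the abstract mentions; the Maximum Entropy and SVM classifiers it lists would consume the same n-gram features.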

A Method of Classifying Tweet by subject using features (특징추출을 이용한 트위터 메시지 주제 분류 방법)

Song, Ji-min;Kim, Han-woo;Kim, Dong-joo;Jung, Sung-hoon
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.05a / pp.905-907 / 2014
Twitter is a special place where people around the world can freely share their information and opinions. There are attempts to utilize the vast amount of information generated on Twitter, and the classification of tweets by subject is being actively studied. Twitter is a service for sharing information through short text messages of up to 140 characters. Such short messages with brief content make it hard to extract a variety of information. In this paper, we propose a method to classify tweets by subject. The method uses both tweet and subject features. To conduct experiments verifying the proposed method, we collected 10,000 tweet messages with the Twitter API. Through the experimental results, we show that the proposed method performs better than previous methods.
