Search | Korea Science

Term Frequency-Inverse Document Frequency (TF-IDF) Technique Using Principal Component Analysis (PCA) with Naive Bayes Classification

J.Uma;K.Prabha
- International Journal of Computer Science & Network Security / v.24 no.4 / pp.113-118 / 2024
Performing sentiment analysis on Twitter is more difficult than performing it on long reviews, because tweets are extremely short and mostly contain slang, emoticons, and hashtags alongside other tweet-specific words. Feature extraction is the technique of building a feature vector from a given tweet. Each entry in a feature vector is an integer that contributes to assigning a sentiment class to the tweet. The purpose of feature extraction is to extract the relevant features in order to improve the accuracy of the classification models. In this paper we propose a Term Frequency-Inverse Document Frequency (TF-IDF) method combined with Principal Component Analysis (PCA) and Naïve Bayes classifiers. In the classification process, the proposed work can produce different features from the widely valued features of a Twitter dataset.
https://doi.org/10.22937/IJCSNS.2024.24.4.12

A Study on the Spatial Patterns of Tweet Data for Urban Areas by Time - A Case of Busan City - (도시 지역 트윗 데이터의 시간대별 공간분포 특성 - 부산광역시를 사례로 -)

Ku, Cha Yong
- Journal of Cadastre & Land InformatiX / v.46 no.2 / pp.269-281 / 2016
The processing of spatial big data, such as social media data, has received growing attention in the field of spatial information in recent years. As an example of spatial big data analysis, this study analyzed the spatial and temporal distribution of tweet data based on location and time information, and identified the characteristics of its spatial pattern by time period. Tweet data in Busan city were collected, processed, and analyzed to identify the characteristics of the temporal and spatial patterns. The results of the tweet data analysis were then compared with the characteristics of the land type. The study found that the spatial pattern of tweeting in the city was associated with time periods such as daytime and nighttime on both weekdays and weekends. For spatially concentrated areas, the spatial distribution patterns of individual time periods were compared with the characteristics of the land. The results showed that tweet data follow different spatial distributions depending on the time, which potentially reflects, to some extent, the daily patterns and land type characteristics of the urban area. This study demonstrated the possible incorporation of social media data, e.g., tweet data, into the field of spatial information. A variety of social media data are expected to offer further advantages in areas such as land planning and urban planning.
https://doi.org/10.22640/lxsiri.2016.46.2.269 KSCI

Location Inference of Twitter Users using Timeline Data (타임라인데이터를 이용한 트위터 사용자의 거주 지역 유추방법)

Kang, Ae Tti;Kang, Young Ok
- Spatial Information Research / v.23 no.2 / pp.69-81 / 2015
If the residential area of SNS users can be inferred by analyzing SNS big data, it can offer an alternative that addresses the location sparsity and ecological error affecting spatial big data research. In this study, we developed a way of using the daily life activity pattern, which can be found in the timeline data of tweet users, to infer the residential areas of those users. We recognized the daily life activity pattern of tweet users from each user's movement pattern and from the regional cognition words that users write in tweets. The models based on user movement and on text are named the daily movement pattern model and the daily activity field model, respectively. We then selected the variables to be used in each model. We defined the dependent variable as 0 if the area where a user mainly tweets is their home location (HL), and as 1 otherwise. According to our results, obtained by discriminant analysis, the hit ratios of the two models were 67.5% and 57.5%, respectively. We tested both models using the timeline data of stress-related tweets. As a result, we inferred the residential areas of 5,301 users out of 48,235 users and obtained 9,606 stress-related tweets with a residential area. This is about a 44-fold increase compared with the count of geo-tagged tweets. We think the methodology used in this study can serve not only to secure more location data in the study of SNS big data, but also to link SNS big data with regional statistics in order to analyze regional phenomena.
https://doi.org/10.12672/ksis.2015.23.2.069 KSCI

Message Attributes, Consequences, and Values in Retweet Behavior : Based on Laddering Method (메시지 특성, 행위의 결과, 추구 가치에 기반한 리트윗 행위 : 래더링 기법을 이용한 탐색적 연구)

Kim, Hyo
- The Journal of the Korea Contents Association / v.13 no.3 / pp.131-140 / 2013
Assuming that the roles of traditional mass media also appear in Twitter services, the study aims to explore Twitter users' motives and rationales in re-tweet behavior. Based on the laddering interview method, the study gathers data on (1) message attributes (what kinds of messages do you re-tweet?); (2) consequences (what kinds of consequences are you expecting when you re-tweet?); and (3) values (what are the ultimate values in your re-tweet behavior?). The values that recurred most often in participants' retweets were the "feeling sympathy" and "sharing" rationales. For such rationales, participants often use messages with "agenda" and "information" that are relevant to themselves. Messages aimed at "helping" others also showed up frequently in their retweet rationales. Known as liberalists' rationales, "communal consciousness" and "calling for others' action" also appeared, but not as frequently as "feeling sympathy" and "sharing". A total of 48 items from the analyses were used in a subsequent study as variables to identify factors (dimensions) of retweet motivation.
https://doi.org/10.5392/JKCA.2013.13.03.131 KSCI

A Design of Smart Retweet Supporting the Efficient Information Transfer (효과적인 정보전달을 지원하는 스마트 리트윗의 설계)

Jeong, Do-Seong;Cho, Dae-Soo
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.252-255 / 2011
As growing demand for smartphones and data communication reduces usage constraints, Twitter and Facebook have become subjects of interest. Whereas Facebook users must obtain the other party's consent to form a relationship, Twitter's procedure is relatively simple, so its information ripple effect is excellent. Twitter has gone beyond a simple social networking service (SNS) to become one of the popular media, and the retweet is among its most powerful features. A retweet forwards a tweet that the user sympathizes with to the user's own subscribers, so information can spread quickly. In this paper, we propose the smart retweet, in which the system actively extends the existing retweet. To realize the smart retweet, additional criteria for determining the destination of the information are required. The destination is determined based on the region where the tweet was generated or on the local information mentioned in the tweet. The smart retweet is expected to increase the speed and scope of information transmission.

An Analysis of Relationship Between Word Frequency in Social Network Service Data and Crime Occurences (소셜 네트워크 서비스의 단어 빈도와 범죄 발생과의 관계 분석)

Kim, Yong-Woo;Kang, Hang-Bong
- KIPS Transactions on Computer and Communication Systems / v.5 no.9 / pp.229-236 / 2016
In the past, crime prediction methods used previous records to predict crime occurrences accurately. Yet these crime prediction models had difficulty updating immense amounts of data. To enhance crime prediction methods, some approaches used social network service (SNS) data in crime prediction studies, but the relationship between SNS data and crime records has not been studied thoroughly. Hence, in this paper, we analyze the relationship between SNS data and criminal occurrences from the perspective of crime prediction. Using Latent Dirichlet Allocation (LDA), we extract tweets that include any words regarding criminal occurrences and analyze the changes in tweet frequency according to the crime records. We then count the tweets that include crime-related words and examine how that count varies with crime occurrences. Our experimental results demonstrate that there is a difference in crime-related tweet occurrences when criminal activity occurs. Moreover, our results show that SNS data analysis will be helpful in crime prediction models, as there are certain patterns in tweet occurrences before and after a crime.
https://doi.org/10.3745/KTCCS.2016.5.9.229 KSCI

Spread of Negative Word-of-mouth of Manufacturing Companies Via Twitter: From the Supply Chain Risk's Perspective (트위터를 통한 제조 기업의 부정적 구전 확산: 공급사슬 리스크 관점에서)

Jeong, EuiBeom;Yoo, Hanna
- Journal of Korea Society of Industrial Information Systems / v.26 no.5 / pp.79-94 / 2021
Despite the importance of the supply chain risk due to the negative word-of-mouth (NWOM) in social media, related research is insufficient. Thus, this study analyzes how the NWOM of the product is distributed through social media and the characteristics of the distributor based on social exchange theory. For this purpose, we collected information on car recalls from four companies using Twitter from the National Highway Traffic Safety Administration (NHTSA). Based on the Seed Tweet, a Re-Tweet (RT) network was constructed to examine the distribution and spread of NWOM, and regression analysis was performed to test the hypothesis. As a result, it was confirmed that NWOM is a small world network structure that spreads around hub users connected to many users. Moreover, it was found that the more interactive and reciprocal relations the first distributor has, the greater the speed and scale of distribution of NWOM.
https://doi.org/10.9723/jksiis.2021.26.5.007 KSCI

Changes in public recognition of parabens on twitter and the research status of parabens related to toothpaste (트위터(twitter)에서의 파라벤(parabens) 관련 대중의 인식 변화와 치약내 파라벤에 대한 연구 현황)

Oh, Hyo-Jung;Jeon, Jae-Gyu
- Journal of Korean Academy of Oral Health / v.41 no.2 / pp.154-161 / 2017
Objectives: The purpose of this study was to investigate changes in public recognition of parabens on Twitter and the research status of parabens related to toothpaste. Methods: Tweet information between 2010 and October 2016 was collected by an automatic web crawler and examined according to tweet frequency, key words (2012-October 2016), and issue tweet detection analyses to reveal changes in public recognition of parabens on Twitter. To investigate the research status of parabens related to toothpaste, queries such as "paraben," "paraben and toxicity," "paraben and (toothpastes or dentifrices)," and "paraben and (toothpastes or dentifrices) and toxicity" were used. Results: The number of tweets concerning parabens sharply increased when parabens in toothpaste emerged as a social issue (October 2014), and decreased from 2015 onward. However, toothpaste and its related terms were continuously included in the core key words extracted from tweets from 2015. They were not included in key words before 2014, indicating that the emergence of parabens in toothpaste as a social issue plays an important role in public recognition of parabens in toothpaste. The issue tweet analysis also confirmed the change in public recognition of parabens in toothpaste. Despite the expansion of public recognition of parabens in toothpaste, there are only seven research articles on the topic in PubMed. Conclusions: The general public clearly recognized parabens in toothpaste after emergence of parabens in toothpaste as a social issue. Nevertheless, the scientific information on parabens in toothpaste is very limited, suggesting that the efforts of dental scientists are required to expand scientific knowledge related to parabens in oral hygiene measures.
https://doi.org/10.11149/jkaoh.2017.41.2.154

A Content Analysis on the Domestic Public Libraries' Use of Twitter (국내 공공도서관의 트위터 이용에 관한 내용분석)

Shim, Jiyoung
- Journal of the Korean Society for information Management / v.34 no.1 / pp.241-262 / 2017
This study aims to identify and analyze the Twitter use of domestic public libraries. To identify the detailed patterns of Twitter use in library and information services, a content analysis was conducted on 3,038 tweets from the accounts of the top 14 public libraries by Twitter use. An inductive approach was adopted to develop a coding scheme, and open coding was conducted on the entire set of tweets. Additionally, a correspondence analysis was conducted on the results of the content analysis to identify how library accounts correspond to specific types. As a result, 3 main categories and 9 sub-categories of public libraries' Twitter use were developed, and 37 detailed patterns of public libraries' use of Twitter were identified. The identified patterns can serve as guidelines for libraries interested in using Twitter.
https://doi.org/10.3743/KOSIM.2017.34.1.241 KSCI

Dynamic Seed Selection for Twitter Data Collection (트위터 데이터 수집을 위한 동적 시드 선택)

Lee, Hyoenchoel;Byun, Changhyun;Kim, Yanggon;Lee, Sang Ho
- Journal of KIISE:Databases / v.41 no.4 / pp.217-225 / 2014
Analysis of social media such as Twitter can yield interesting perspectives to understanding human behavior, detecting hot issues, identifying influential people, or discovering a group and community. However, it is difficult to gather the data relevant to specific topics due to the main characteristics of social media data; data is large, noisy, and dynamic. This paper proposes a new algorithm that dynamically selects the seed nodes to efficiently collect tweets relevant to topics. The algorithm utilizes attributes of users to evaluate the user influence, and dynamically selects the seed nodes during the collection process. We evaluate the proposed algorithm with real tweet data, and get satisfactory performance results.
KSCI
