• Title/Summary/Keyword: twitter metric

Altmetrics: Factor Analysis for Assessing the Popularity of Research Articles on Twitter

  • Pandian, Nandhini Devi Soundara;Na, Jin-Cheon;Veeramachaneni, Bhargavi;Boothaladinni, Rashmi Vishwanath
    • Journal of Information Science Theory and Practice / v.7 no.4 / pp.33-44 / 2019
  • Altmetrics measure how often an article is referenced on social media platforms such as Twitter. This paper studies factors that affect the popularity of psychology articles on Twitter, where popularity is the number of article mentions. First, we classify Twitter users who mention research articles as academic versus non-academic and as expert versus non-expert, using a machine learning approach. We then build a negative binomial regression model with the number of Twitter mentions of an article as the dependent variable. The independent variables are nine Twitter-related factors (the number of followers, number of friends, number of statuses, number of lists, number of favourites, number of retweets, number of likes, ratio of academic users, and ratio of expert users) and seven article-related factors (the number of authors, title length, abstract length, abstract readability, number of institutions, citation count, and availability of research funding). Our findings show that an article gains more Twitter mentions if it is mentioned by Twitter users with more friends, statuses, favourites, and lists, by tweets with many retweets and likes, and largely by users with academic and expert knowledge of psychology. In addition, articles with more authors, longer titles, longer abstracts, higher citation counts, and research funding get more attention from Twitter users.
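
For readers unfamiliar with the model named in this abstract, a negative binomial regression for count data can be written as follows. This is a sketch under an assumption: the abstract does not state which parameterization the authors used, so the common NB2 form with dispersion parameter \alpha is assumed here. The index j runs over the 16 predictors (nine Twitter-related and seven article-related factors), and Y_i is the number of Twitter mentions of article i.

```latex
\begin{align}
Y_i &\sim \mathrm{NegBin}(\mu_i, \alpha), \\
\log \mu_i &= \beta_0 + \sum_{j=1}^{16} \beta_j x_{ij}, \\
\operatorname{Var}(Y_i) &= \mu_i + \alpha \mu_i^{2}.
\end{align}
```

Under this form, \exp(\beta_j) is the multiplicative change in the expected number of mentions for a one-unit increase in predictor x_j, and \alpha > 0 allows the variance to exceed the mean, which a Poisson model cannot do.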

Who are Tweeting Research Articles and Why?

  • Htoo, Tint Hla Hla;Na, Jin-Cheon
    • Journal of Information Science Theory and Practice / v.5 no.3 / pp.48-60 / 2017
  • The purpose of this paper is to understand the profiles of users who share research articles on Twitter and their motivations for doing so. The goal is to contribute to the understanding of Twitter as a new altmetric measure for assessing the impact of research articles. We extend a previous study of tweet motivations by identifying the profiles of Twitter users. In particular, we examine six characteristics of users: gender, geographic distribution, academic, non-academic, individual, and organization. We highlight three key findings. First, a great majority of users (86%) were from North America and Europe, which suggests that, if tweets about research articles are mainly in English, Twitter as an alternative metric has a Western bias. Second, several previous altmetrics studies suggested that tweets, and altmetrics in general, do not indicate scholarly impact because of their low correlation with citation counts. This study adds detail by revealing that most tweets (77%) were by individual users, 67% of whom were non-academic. Therefore, tweets mostly reflect the impact of research articles on the general public rather than on academia. Finally, analysis of profiles and motivations showed that the majority of tweets (from 42% to 57%) in all user types highlighted the summary or findings of the article, indicating that tweets are a new way of communicating research findings.

SEQUENTIAL MINIMAL OPTIMIZATION WITH RANDOM FOREST ALGORITHM (SMORF) USING TWITTER CLASSIFICATION TECHNIQUES

  • J.Uma;K.Prabha
    • International Journal of Computer Science & Network Security / v.23 no.4 / pp.116-122 / 2023
  • Sentiment classification techniques are commonly divided into three major categories: machine learning (ML) methods, lexicon-based (LB) methods, and hybrid methods. ML methods use linguistic features and apply well-known ML algorithms. In this paper, classification and identification are performed with an optimization technique called sequential minimal optimization with Random Forest (SMORF), with the aim of improving the performance and efficiency of the sentiment classification framework. Three existing classification algorithms are compared with the proposed SMORF algorithm. Simulation results in the experimental setup are reported as precision (P), recall (R), F-measure (F), and accuracy. The proposed SMORF algorithm provides high accuracy.

A Study on Lexicon Integrated Convolutional Neural Networks for Sentiment Analysis (감성 분석을 위한 어휘 통합 합성곱 신경망에 관한 연구)

  • Yoon, Joo-Sung;Kim, Hyeon-Cheol
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.916-919 / 2017
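  • With recent advances in deep learning, a variety of techniques are being applied to sentiment analysis. Convolutional Neural Networks (CNNs), which have shown high performance in image and speech recognition, are now actively studied in natural language processing and are known to be effective for sentiment analysis as well. In conventional machine learning, lexicon-based techniques were actively studied, but such attempts gradually declined after word embeddings emerged. Nevertheless, lexicons still provide useful information for sentiment analysis. In this study, we use the Twitter dataset provided by SemEval 2017 Task 4 and various lexicon corpora to investigate how much model performance improves when lexicons are combined with a CNN. We also analyze the effects of word embeddings and lexicons. The evaluation metric is the macro-averaged F1 score over three classes: positive, negative, and neutral.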
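
The evaluation metric named in this abstract, the macro-averaged F1 score over three classes, can be illustrated with a minimal sketch. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper; the code only shows the general definition, in which F1 is computed per class and the per-class scores are averaged with equal weight.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: the unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    scores = []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        scores.append(2 * precision * recall / pr_sum if pr_sum else 0.0)
    return sum(scores) / len(scores)


LABELS = ["positive", "negative", "neutral"]

# Toy labels, invented for illustration only.
y_true = ["positive", "positive", "negative", "neutral", "neutral", "negative"]
y_pred = ["positive", "neutral", "negative", "neutral", "positive", "negative"]

score = macro_f1(y_true, y_pred, LABELS)
print(f"macro-averaged F1 = {score:.4f}")
```

Because every class contributes equally to the mean, a model that ignores a rare class is penalized more heavily than it would be under plain accuracy.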

Unsupervised Scheme for Reverse Social Engineering Detection in Online Social Networks (온라인 소셜 네트워크에서 역 사회공학 탐지를 위한 비지도학습 기법)

  • Oh, Hayoung
    • KIPS Transactions on Software and Data Engineering / v.4 no.3 / pp.129-134 / 2015
  • Because automated social-engineering spam attacks induce users to click on or accept short message service (SMS) messages, e-mails, and site addresses, or to form relationships with unknown friends, such attacks spread easily in online social networks. Previous spam detection schemes rely only on manual filtering by system managers or on labeled classification, regardless of the features of social networks. In this paper, we propose a spam detection metric that reflects several features of social networks, following an analysis of a real social network data set of Twitter spam. In addition, we provide an unsupervised scheme, based on a self-organizing map (SOM), for detecting automated social-engineering spam in online social networks. Through performance evaluation, we show detection accuracy of up to 90% and the feasibility of real-time training for spam detection without a manager.

Financial Fraud Detection using Text Mining Analysis against Municipal Cybercriminality (지자체 사이버 공간 안전을 위한 금융사기 탐지 텍스트 마이닝 방법)

  • Choi, Sukjae;Lee, Jungwon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.119-138 / 2017
  • Recently, SNS has become an important channel for marketing as well as personal communication. However, cybercrime has also evolved with the development of information and communication technology, and illegal advertising is distributed on SNS in large quantities. As a result, personal information is lost and monetary damage occurs more frequently. In this study, we propose a method to analyze which sentences and documents sent to SNS are related to financial fraud. First, as a conceptual framework, we developed a matrix of conceptual characteristics of cybercriminality on SNS and emergency management. We also suggested an emergency management process that consists of Pre-Cybercriminality (e.g. risk identification) and Post-Cybercriminality steps. Among those, we focus on risk identification in this paper. The main process consists of data collection, preprocessing, and analysis. First, we selected two words, 'daechul (loan)' and 'sachae (private loan)', as seed words and collected data containing these words from SNS such as Twitter. The collected data were given to two researchers, who decided whether each item was related to cybercriminality, particularly financial fraud. We then selected some of the vocabulary as keywords if it consisted of nominals or symbols. With the selected keywords, we searched web materials such as Twitter, news, and blogs, and collected more than 820,000 articles. The collected articles were refined through preprocessing and made into learning data. Preprocessing is divided into three steps: morphological analysis, stop-word removal, and selection of valid parts of speech. In the morphological analysis step, a complex sentence is transformed into morpheme units to enable mechanical analysis. In the stop-word removal step, non-lexical elements such as numbers, punctuation marks, and double spaces are removed from the text.
In the step of selecting valid parts of speech, only two kinds of tokens, nouns and symbols, are considered. Since nouns refer to things, they express the intent of a message better than other parts of speech. Moreover, the more illegal the text is, the more frequently symbols are used. To turn the preprocessed data into learning data, each item must be classified as legitimate or not, so the selected data are labeled 'legal' or 'illegal'. The processed data are then converted into a corpus and a Document-Term Matrix. Finally, the 'legal' and 'illegal' files were mixed and randomly divided into a learning data set and a test data set. In this study, we set the learning data to 70% and the test data to 30%. SVM was used as the discrimination algorithm. Since SVM requires gamma and cost values as its main parameters, we set gamma to 0.5 and cost to 10, based on the optimal value function. The cost is set higher than in general cases. To show the feasibility of the idea proposed in this paper, we compared the proposed method with MLE (Maximum Likelihood Estimation), Term Frequency, and Collective Intelligence methods. Overall accuracy was used as the metric. The overall accuracy of the proposed method was 92.41% for illegal loan advertisements and 77.75% for illegal visit sales, which is superior to that of Term Frequency, MLE, and the other methods compared. Hence, the result suggests that the proposed method is valid and practically usable. In this paper, we propose a framework for managing crises caused by abnormalities in unstructured data sources such as SNS. We hope this study will contribute to academia by identifying what to consider when applying an SVM-like discrimination algorithm to text analysis. Moreover, the study will also contribute to practitioners in the field of brand management and opinion mining.
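
The data-preparation steps described in this abstract (removing non-lexical elements, building a Document-Term Matrix, and randomly splitting the data 70%/30%) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the function names and toy English documents are hypothetical, the tokenizer stands in for the Korean morphological analysis and part-of-speech selection (and, unlike the paper, discards symbols), and the SVM training step is omitted.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only.

    This drops numbers, punctuation marks, and repeated spaces. It is a
    stand-in for morphological analysis and part-of-speech filtering.
    """
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] counts term j in doc i."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def split_train_test(rows, labels, train_ratio=0.7, seed=0):
    """Shuffle the row indices and split them into learning and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(train_ratio * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])


# Toy documents and labels, invented for illustration only.
docs = [
    "Instant loan approved today, call 010 now!!",
    "Private loan with no credit check $$$",
    "Low rate loan, apply in 5 minutes",
    "Fast cash loan for everyone",
    "Private loan offer, reply today",
    "City library opens at 9",
    "New bus route starts Monday",
    "Community festival this weekend",
    "Road repair scheduled for Friday",
    "Recycling pickup moved to Tuesday",
]
labels = ["illegal"] * 5 + ["legal"] * 5

vocab, dtm = document_term_matrix(docs)
x_train, y_train, x_test, y_test = split_train_test(dtm, labels)
print(len(vocab), "terms;", len(x_train), "learning rows;", len(x_test), "test rows")
```

The resulting rows could then be passed to an SVM implementation with the gamma and cost values the abstract reports (0.5 and 10); which library the authors used is not stated in the abstract.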