• Title/Summary/Keyword: Blog Post

Predicting the Popularity of Post Articles with Virtual Temperature in Web Bulletin (웹게시판에서 가상온도를 이용한 게시글의 인기 예측)

  • Kim, Su-Do;Kim, So-Ra;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association / v.11 no.10 / pp.19-29 / 2011
  • A blog provides commentary, news, or content on a particular subject. An important part of many blogs is their interactive format. Sometimes a heated debate arises on a topic, and an article becomes a political or sociological issue. In this paper, we propose a method to predict the popularity of an article in advance. First, we use hit count as the factor for predicting the popularity of an article. We define the saturation point and derive a model that predicts the hit count at the saturation point from the correlation between the early hit count and the hit count at the saturation point. Finally, we predict the virtual temperature of an article, classified into four types (explosive, hot, warm, cold). Using the predicted hit count at the saturation point, we can predict the virtual temperature of Internet discussion articles with more than 70% accuracy from only the first 30 minutes' hit count. In the hot, warm, and cold categories, we achieve more than 86% accuracy from 30 minutes' hit count and more than 90% accuracy from 70 minutes' hit count.
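
The prediction idea in this abstract can be illustrated with a minimal Python sketch. It assumes a simple least-squares line relating early hit count to saturation hit count. The temperature thresholds, the function names, and the toy data are hypothetical and are not taken from the paper, which does not state its exact model or category boundaries here.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical lower bounds on saturation hit count for each category.
TEMPERATURE_THRESHOLDS = [(10000, "explosive"), (3000, "hot"), (1000, "warm")]


def temperature(saturation_hits):
    """Map a (predicted) saturation hit count to a virtual temperature."""
    for lower_bound, label in TEMPERATURE_THRESHOLDS:
        if saturation_hits >= lower_bound:
            return label
    return "cold"


def predict_temperature(early_hits, slope, intercept):
    """Predict the saturation hit count from early hits, then classify it."""
    return temperature(slope * early_hits + intercept)


# Toy training data: hits in the first 30 minutes vs. hits at saturation.
early_hits = [20, 50, 120, 300, 800]
saturation_hits = [200, 500, 1200, 3000, 8000]

slope, intercept = fit_line(early_hits, saturation_hits)
print(predict_temperature(400, slope, intercept))
```

A longer observation window (for example 70 minutes instead of 30) would simply mean fitting the line on a different early-hit column.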

A Study on the Perception Change of Bats after COVID-19 by Social Media Data Analysis (소셜미디어 데이터 분석을 활용한 COVID-19 전후 박쥐의 인식변화 연구)

  • Lee, Jukyung;Kim, Byeori;Kim, Sun-Sook
    • Journal of Environmental Impact Assessment / v.31 no.5 / pp.310-320 / 2022
  • This study aimed to identify the change in the public perception of "bats" after the outbreak of the coronavirus (COVID-19) infection. Text mining and network analysis were conducted on blog posts, blogs being the largest social network in Korea. We collected 9,241 Naver blog posts from 2019 to 2020 just before the outbreak of COVID-19 in Korea. The data were analyzed with Python and NetMiner 4.3.2, and the public's perception of bats was examined through the relationships among keywords by period. Findings indicated that the frequency of bat keywords in 2020 increased more than 25-fold compared to 2019, and the centrality value increased more than threefold. The perception of bats changed before and after the outbreak of the pandemic. Prior to COVID-19, bats were largely recognized as a species of wildlife, whereas in the first half of 2020 they were strongly regarded as a threat to human society in relation to infectious diseases and health. In the second half of 2020, it was confirmed that the area of interest in bats expanded as the proportion of ecological and cultural types of research increased. This study seeks to contribute to the expansion and direction of future research on bats by understanding the public's interest in the potential impact of the species as disease hosts after the COVID-19 pandemic.
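
As a rough illustration of keyword network analysis by period, the following Python sketch builds a keyword co-occurrence network from tokenized posts and computes degree centrality. The abstract does not say which centrality measure was used, so degree centrality is an assumption, and the tokenized documents are invented toy data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count how often each pair of distinct keywords shares a document."""
    edges = Counter()
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Degree centrality: number of neighbors divided by (n - 1)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Toy tokenized posts for two periods (hypothetical data).
docs_2019 = [["bat", "cave", "wildlife"], ["bat", "wildlife", "night"]]
docs_2020 = [["bat", "virus", "infection"], ["bat", "virus", "health"], ["bat", "cave"]]

for year, docs in (("2019", docs_2019), ("2020", docs_2020)):
    centrality = degree_centrality(cooccurrence(docs))
    print(year, sorted(centrality.items(), key=lambda kv: -kv[1]))
```

Comparing the centrality tables of the two periods shows which keywords move toward the center of the network around "bat".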

Outdoor Healing Places Perception Analysis Using Named Entity Recognition of Social Media Big Data (소셜미디어 빅데이터의 개체명 인식을 활용한 옥외 힐링 장소 인식 분석)

  • Sung, Junghan;Lee, Kyungjin
    • Journal of the Korean Institute of Landscape Architecture / v.50 no.5 / pp.90-102 / 2022
  • In recent years, as interest in healing has increased, outdoor spaces built around the concept of healing have been created. To support more professional and in-depth planning and design, the perception and characteristics of outdoor healing places were analyzed from social media posts using named entity recognition (NER). Text mining was conducted on 88,155 blog posts, followed by frequency analysis and clique cohesion analysis. Six elements were derived through a literature review, and two elements were added to analyze the perception and characteristics of healing places. As a result, visitors considered place elements, date and time, social elements, and activity elements more important than personnel, psychological elements, plants and color, and form and shape when visiting healing places. The analysis allowed the perceptions and characteristics of healing places to be derived through keywords. In the clique analysis, keywords such as places, date and time, and relationships were clustered, making it possible to identify where, when, at what time, and with whom people visited places for healing. Through the study, the perception and characteristics of healing places were derived by analyzing large-scale data written by visitors. It was confirmed that specific elements could be used in planning and marketing.
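
The clique step can be sketched as plain maximal-clique enumeration over a keyword graph. The abstract does not describe the exact clique cohesion procedure, so the Bron-Kerbosch algorithm below is a generic stand-in, and the keyword graph is hypothetical.

```python
def maximal_cliques(graph):
    """Return all maximal cliques of an undirected graph (Bron-Kerbosch).

    `graph` maps each node to the set of its neighbors and must be symmetric.
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return sorted(cliques)


# Hypothetical keyword graph: an edge means two keywords co-occur often.
keyword_graph = {
    "park": {"weekend", "family", "walk"},
    "weekend": {"park", "family"},
    "family": {"park", "weekend"},
    "walk": {"park", "forest"},
    "forest": {"walk"},
}

for clique in maximal_cliques(keyword_graph):
    print(clique)
```

In this toy graph a place keyword, a time keyword, and a companion keyword end up in the same clique, which mirrors the where, when, and with-whom reading described in the abstract.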

A Study on the Demand for Cultural Ecosystem Services in Urban Forests Using Topic Modeling (토픽모델링을 활용한 도시림의 문화서비스 수요 특성 분석)

  • Kim, Jee-Young;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture / v.50 no.4 / pp.37-52 / 2022
  • The purpose of this study is to analyze the demand for cultural ecosystem services in urban forests, based on user perception and experience value, using Naver blog posts and LDA topic modeling. Bukhansan National Park was used as the case for analyzing and reviewing the feasibility of spatial assessment. Based on the topic modeling results from blog posts, a review process was conducted considering relevance to Bukhansan National Park's cultural services and suitability as a spatial assessment case, and finally an index for the spatial assessment of urban forests' cultural services was derived. Specifically, 21 topics derived through topic analysis were interpreted, and 13 topics related to cultural ecosystem services were derived based on the Millennium Ecosystem Assessment (MA) classification system for ecosystem services. Of all documents reviewed, 72.7% had data deemed useful for this study. The content of the topics fell into one of seven types of cultural services, including "mountainous recreation activities" (23.7%), "indirect use value linked to tourism and convenience facilities" (12.4%), "inspirational activities" (11.2%), "seasonal recreation activities" (6.2%), and "natural appreciation and static recreation activities" (3.7%). Next, for the 13 cultural service topics derived from the Bukhansan National Park data, the possibility of spatially assessing the characteristics of cultural ecosystem services provided by urban forests was reviewed, and a total of 8 cultural service indicators were derived. The MA's cultural service classification system for ecosystem services, which was widely used in previous studies, has limitations in that it does not reflect the actual user demand for urban forests, but this study is meaningful in that it categorizes cultural service indicators suitable for domestic circumstances. In addition, the study is significant in that it presents a methodology for interpreting and deriving the demand for cultural services from a large amount of user awareness and experience data.

Effect of Online Word of Mouth on Product Sales: Focusing on Communication-Channel Characteristics

  • Jeon, Jaihyun;Lim, Taewook;Kim, Byung-Do;Seok, Junhee
    • Asia Marketing Journal / v.21 no.2 / pp.73-98 / 2019
  • As information and communication technology continues its remarkable development, the exchange of information online has become as prevalent and frequent as face-to-face communication in daily life. The management and application of WOM (word of mouth) practices will therefore become more important than ever to companies. Currently, there are various types of communication channels for online WOM, and each channel has its own unique traits. Most previous research studies online WOM by examining the information inside a single communication channel, but this research chooses two different communication channels and analyzes the effects of online WOM in light of each channel's unique characteristics. More specifically, this research focuses on the expectation that the effects of information from Twitter and blogs on product sales may differ because Twitter and blogs, two different communication channels for online WOM, have their own unique traits. Our particular aim is to perform an in-depth examination of how volume and valence, two important attributes of online WOM, affect product sales in each communication channel. Furthermore, while most empirical research on online WOM analyzes its effect on markets for temporary experience goods, such as movies and books, this research focuses on the automobile market, a durable goods market. The results of our analysis are as follows. First, regarding blogs, a positive valence significantly and positively affects product sales, which indicates that consumers are influenced more by the emotional aspect of a product presented in a post than by the number of blog posts. Second, regarding Twitter, the volume of online WOM significantly and positively affects sales, which indicates that sales increase as the number of posts increases. Through this research, we suggest that even firms that sell durable goods can increase sales through the management and application of online WOM. Moreover, the effects of online WOM on sales differ according to the characteristics of the communication channel. As a practical implication of this research, we suggest that companies can and should create marketing strategies appropriate to their targeted communication channels.

Study on the Science & Technology Information Service Needs Corresponding to the Scientists and Engineers Group Characteristics (사용자 그룹별 과학기술정보 서비스 수요 분석)

  • Jung, Hye-Ju;Yoon, Jungsun
    • Journal of Information Management / v.43 no.4 / pp.143-167 / 2012
  • In this study, a survey was conducted to determine the demand for science & technology information services by user group. The questionnaire covered the need for 20 science & technology information services, the personal information needed for people-to-people exchanges, and the information that can be shared with others. A total of 1,013 KOSEN users participated in the survey, and analysis of variance was conducted by the respondents' institution, profession, final degree, and age. The frequency analysis showed high demand for trend analysis, papers, research reports, patents, knowledge queries, project announcements, jobs, experimental methods, information society, and study abroad/post-doc information, and all services except mentoring, community, and blog showed significant differences among user groups. The personal information deemed necessary for interaction with others was, in order, specialization, thesis/research performance, career, organization, jobs, final degree, and education, with partial differences among user groups. In addition, 97% of respondents had their own scientific and technical information to share with other people, in the order of papers, presentations (ppt), reports, experimental methods, and images. The results of this study can serve as useful information for developing user-centered personalized services for scientists and engineers and are expected to help set the direction of science information services in the future.

Term Mapping Methodology between Everyday Words and Legal Terms for Law Information Search System (법령정보 검색을 위한 생활용어와 법률용어 간의 대응관계 탐색 방법론)

  • Kim, Ji Hyun;Lee, Jong-Seo;Lee, Myungjin;Kim, Wooju;Hong, June Seok
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.137-152 / 2012
  • In the era of Web 2.0, as many users create large amounts of web content known as user-created content, the World Wide Web is overflowing with information. Finding meaningful information among so many resources has therefore become key. Information retrieval is now important in every field, and several types of search services have been developed and are widely used to retrieve the information that users really want. In particular, legal information search is an indispensable service that lets people find the laws relevant to their present situation and serves as a channel for acquiring legal knowledge. The Office of Legislation in Korea has provided the Korean Law Information portal service since 2009 for searching law information such as legislation, administrative rules, and judicial precedents, so people can conveniently find information related to the law. However, this service is limited because current search engine technology basically returns documents according to whether or not they contain the query. Despite the efforts of the Office of Legislation, it is therefore very difficult for general users who are not familiar with legal terms to retrieve law information with a search engine based on simple keyword matching, because there is a large gap between everyday words and legal terms, many of which derive from Chinese words. People generally try to access law information using everyday words, so they have difficulty getting exactly the results they want. In this paper, we propose a term mapping methodology between everyday words and legal terms for general users who lack sufficient background in legal terminology, and we develop a search service that returns law information search results from everyday words. This makes it possible to search law information accurately without knowledge of legal terminology. In other words, our research goal is to build a law information search system with which general users can retrieve law information using everyday words. First, this paper uses the tags of internet blogs, following the concept of collective intelligence, to find the term mapping relationship between everyday words and legal terms. To achieve our goal, we collect tags related to an everyday word from web blog posts. When people write a post on an internet blog, they generally add non-hierarchical keywords or terms, similar to synonyms and called tags, to describe, classify, and manage their posts. Second, the collected tags are clustered using the K-means cluster analysis method. Then, we find a mapping relationship between an everyday word and a legal term, using our estimation measure to select the legal term that best matches the everyday word. The selected legal terms are assigned a definite relationship, and the relations between everyday words and legal terms are described using SKOS, an ontology for describing knowledge related to thesauri, classification schemes, taxonomies, and subject headings. Thus, based on the proposed mapping and searching methodologies, when users try to retrieve law information using an everyday word, our legal information search system finds the legal term mapped to the user query and retrieves law information using the matched legal term. Users can therefore get exact results even if they have no knowledge of legal terms. As a result of our research, we expect that general users without a professional legal background can conveniently and efficiently retrieve legal information using everyday words.
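
The tag clustering step can be sketched with a small K-means implementation in Python. The abstract does not say how tags are turned into vectors or how the estimation measure is defined, so the two-dimensional feature vectors, the tag names, and the first-k initialization below are all assumptions made for illustration, and the mapping to a legal term is left out.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical tag features: (co-occurrence with money-related posts,
# co-occurrence with housing-related posts).
tag_vectors = {
    "loan": (9, 1),
    "deposit": (1, 8),
    "interest": (8, 2),
    "lease": (2, 9),
    "debt": (9, 2),
    "rent": (1, 9),
}

tags = list(tag_vectors)
assignment, _ = kmeans([tag_vectors[t] for t in tags], k=2)

clusters = {}
for tag, cluster_id in zip(tags, assignment):
    clusters.setdefault(cluster_id, []).append(tag)
print(clusters)
```

Each resulting cluster would then be scored against candidate legal terms, and the best-scoring term recorded as the mapping for the everyday word.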

Analysis of the Time-dependent Relation between TV Ratings and the Content of Microblogs (TV 시청률과 마이크로블로그 내용어와의 시간대별 관계 분석)

  • Choeh, Joon Yeon;Baek, Haedeuk;Choi, Jinho
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.163-176 / 2014
  • Social media is becoming the platform for users to communicate their activities, status, emotions, and experiences to other people. In recent years, microblogs such as Twitter have gained in popularity because of their ease of use, speed, and reach. Compared to a conventional web blog, a microblog lowers the effort and investment required for content generation by encouraging shorter posts. There has been a lot of research into capturing social phenomena and analyzing the chatter of microblogs. However, measuring television ratings has been given little attention so far. Currently, the most common method of measuring TV ratings uses an electronic metering device installed in a small number of sampled households. Microblogs allow users to post short messages, share daily updates, and conveniently keep in touch. In a similar way, microblog users interact with each other while watching television or movies, or visiting a new place. When measuring TV ratings, some features are significant during certain hours of the day or days of the week, whereas the same features are meaningless during other time periods. Thus, the importance of features can change during the day, and a model capturing this time-sensitive relevance is required to estimate TV ratings. Modeling the time-related characteristics of features should therefore be key when measuring TV ratings through microblogs. We show that capturing the time-dependency of features is vital for improving the accuracy of TV ratings measurement. To explore the relationship between the content of microblogs and TV ratings, we collected Twitter data using the Get Search component of the Twitter REST API from January 2013 to October 2013. There are about 300 thousand posts in our data set for the experiment. After excluding data such as advertising or promoted tweets, we selected 149 thousand tweets for analysis. The number of tweets reaches its maximum level on the broadcasting day and increases rapidly around the broadcasting time. This result stems from the characteristics of the public channel, which broadcasts the program at a predetermined time. From our analysis, we find that count-based features such as the number of tweets or retweets have a low correlation with TV ratings. This result implies that a simple tweet rate does not reflect the satisfaction with or response to the TV programs. Content-based features extracted from the content of tweets have a relatively high correlation with TV ratings. Further, some emoticons or newly coined words that are not tagged in the morpheme extraction process have a strong relationship with TV ratings. We find that there is a time-dependency in the correlation of features between the periods before and after the broadcasting time. Since the TV program is broadcast regularly at a predetermined time, users post tweets expressing their expectations for the program or their disappointment over not being able to watch it. The highly correlated features before the broadcast are different from those after the broadcast. This result shows that the relevance of words to TV programs can change according to the time of the tweets. Among the 336 words that fulfill the minimum requirements for candidate features, 145 words have their highest correlation before the broadcasting time, whereas 68 words reach their highest correlation after broadcasting. Interestingly, some words that express the impossibility of watching the program show a high relevance, despite carrying a negative meaning. Understanding the time-dependency of features can help improve the accuracy of TV ratings measurement. This research contributes a basis for estimating the response to or satisfaction with broadcast programs using the time dependency of words in Twitter chatter. More research is needed to refine the methodology for predicting or measuring TV ratings.
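
The time-dependency analysis can be illustrated by correlating a word's per-episode frequency with the ratings separately for tweets posted before and after the broadcast. The sketch below assumes Pearson correlation, which the abstract does not name explicitly, and uses invented words, counts, and ratings.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def best_window(counts_by_window, ratings):
    """Return the time window whose word counts correlate most with ratings."""
    return max(
        counts_by_window,
        key=lambda window: abs(pearson(counts_by_window[window], ratings)),
    )


# Hypothetical per-episode ratings and word counts split by time window.
ratings = [10.0, 12.0, 14.0, 16.0]
word_counts = {
    "excited": {"before": [5, 6, 7, 8], "after": [3, 1, 4, 2]},
    "touching": {"before": [2, 2, 1, 2], "after": [1, 3, 4, 7]},
}

for word, windows in word_counts.items():
    print(word, best_window(windows, ratings))
```

Counting how many candidate words peak in each window gives the kind of before/after tally reported in the abstract.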

Financial Fraud Detection using Text Mining Analysis against Municipal Cybercriminality (지자체 사이버 공간 안전을 위한 금융사기 탐지 텍스트 마이닝 방법)

  • Choi, Sukjae;Lee, Jungwon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.119-138 / 2017
  • Recently, SNS has become an important channel for marketing as well as personal communication. However, cybercrime has also evolved with the development of information and communication technology, and illegal advertising is distributed on SNS in large quantities. As a result, personal information is lost and monetary damages occur more frequently. In this study, we propose a method to analyze which sentences and documents sent to SNS are related to financial fraud. First, as a conceptual framework, we developed a matrix of the conceptual characteristics of cybercriminality on SNS and emergency management. We also suggested an emergency management process consisting of Pre-Cybercriminality (e.g. risk identification) and Post-Cybercriminality steps. Among these, we focused on risk identification in this paper. The main process consists of data collection, preprocessing, and analysis. First, we selected two words, 'daechul (loan)' and 'sachae (private loan)', as seed words and collected data containing these words from SNS such as Twitter. The collected data were given to two researchers, who decided whether or not they were related to cybercriminality, particularly financial fraud. We then selected some of the vocabulary as keywords where it was related to nominals and symbols. With the selected keywords, we searched web materials such as Twitter, news, and blogs and collected more than 820,000 articles. The collected articles were refined through preprocessing and made into learning data. The preprocessing is divided into a morphological analysis step, a stop word removal step, and a valid part-of-speech selection step. In the morphological analysis step, a complex sentence is transformed into morpheme units to enable mechanical analysis. In the stop word removal step, non-lexical elements such as numbers, punctuation marks, and double spaces are removed from the text. In the valid part-of-speech selection step, only two kinds, nouns and symbols, are considered. Since nouns refer to things, they express the intent of a message better than other parts of speech. Moreover, the more illegal the text is, the more frequently symbols are used. To turn the selected data into learning data through preprocessing, it is necessary to classify whether each item is legitimate or not, so the selected data are labeled 'legal' or 'illegal'. The processed data are then converted into a corpus and a document-term matrix. Finally, the 'legal' and 'illegal' files were mixed and randomly divided into a learning data set and a test data set. In this study, we set the learning data to 70% and the test data to 30%. SVM was used as the discrimination algorithm. Since SVM requires gamma and cost values as its main parameters, we set gamma to 0.5 and cost to 10, based on the optimal value function. The cost is set higher than in general cases. To show the feasibility of the idea proposed in this paper, we compared the proposed method with MLE (Maximum Likelihood Estimation), Term Frequency, and Collective Intelligence methods. Overall accuracy was used as the metric. As a result, the overall accuracy of the proposed method was 92.41% for illegal loan advertisements and 77.75% for illegal visit sales, which is clearly superior to that of Term Frequency, MLE, and the other methods. Hence, the result suggests that the proposed method is valid and practically usable. In this paper, we propose a framework for managing crises caused by abnormalities in unstructured data sources such as SNS. We hope this study will contribute to academia by identifying what to consider when applying an SVM-like discrimination algorithm to text analysis. Moreover, the study will also contribute to practitioners in the fields of brand management and opinion mining.
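
The data preparation described here (document-term matrix, then a random 70/30 split) can be sketched in standard-library Python. The token lists and labels are invented, the helper names are hypothetical, and the classifier itself is omitted; in a library such as scikit-learn, the stated parameters would roughly correspond to an RBF-kernel `SVC` with `gamma=0.5` and `C=10`, which is an assumption about tooling rather than something the abstract specifies.

```python
import random
from collections import Counter


def document_term_matrix(docs):
    """Build a vocabulary and one term-count row per tokenized document."""
    vocabulary = sorted({token for doc in docs for token in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[token] for token in vocabulary])
    return vocabulary, rows


def split_70_30(items, seed=0):
    """Shuffle reproducibly and split into 70% learning and 30% test data."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


# Hypothetical preprocessed posts (nouns and symbols only) with labels.
labeled_docs = [
    (["loan", "★", "instant", "loan"], "illegal"),
    (["loan", "bank", "interest"], "legal"),
    (["private", "loan", "★", "★"], "illegal"),
    (["bank", "branch", "hours"], "legal"),
    (["instant", "cash", "★"], "illegal"),
    (["interest", "rate", "news"], "legal"),
    (["loan", "call", "★"], "illegal"),
    (["bank", "loan", "policy"], "legal"),
    (["cash", "★", "private"], "illegal"),
    (["rate", "policy", "news"], "legal"),
]

vocabulary, matrix = document_term_matrix([doc for doc, _ in labeled_docs])
labels = [label for _, label in labeled_docs]
learning, test = split_70_30(list(zip(matrix, labels)))
print(len(vocabulary), len(learning), len(test))
```

The learning rows and labels would then be passed to the SVM, and overall accuracy computed on the held-out test rows.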