• Title/Summary/Keyword: news text

Linking Findings from Text Analyses to Online Sales Strategies (온라인상의 기업 및 소비자 텍스트 분석과 이를 활용한 온라인 매출 증진 전략)

  • Kim, Jeeyeon;Jo, Wooyong;Choi, Jeonghye;Chung, Yerim
    • Journal of the Korean Operations Research and Management Science Society / v.41 no.2 / pp.81-100 / 2016
  • Much effort has been exerted to analyze online texts and understand how empirical results can help improve sales performance. In this research, we aim to extend this stream of research by decomposing online texts based on text sources, namely, companies and consumers. Specifically, we investigate how online texts driven by companies differ from those generated by consumers, and the extent to which the two types of online texts have different effects on online sales. We obtained sales data from one of the biggest game publishers and merged them with online texts provided by companies through news articles and those created by consumers in user communities. The empirical analyses yield the following findings. Word visualization and topic analyses show that firms and consumers generate different contexts. Specifically, companies spread the word to promote their own events, whereas consumers produce online texts to share winning strategies. Moreover, online sales are influenced by consumer-generated community topics, whereas firm-driven topics in news articles have little to no effect. These findings suggest that companies should focus more on online texts generated by consumers than on spreading their own words. Moreover, online sales strategies should take advantage of specific topics that have been proven to increase online sales. In particular, these findings give startup companies and small business owners in a variety of industries an advantage when they use the online channel for distribution and as a marketing platform.

Trend Analysis of News Articles Regarding Sungnyemun Gate using Text Mining (텍스트마이닝을 활용한 숭례문 관련 기사의 트렌드 분석)

  • Kim, Min-Jeong;Kim, Chul Joo
    • The Journal of the Korea Contents Association / v.17 no.3 / pp.474-485 / 2017
  • Sungnyemun Gate, Korea's National Treasure No. 1, was destroyed by fire on February 10, 2008 and was reopened to the public on May 4, 2013 after reconstruction. The gate became a national issue and drew public attention, making it a major topic in news and research. In this research, text mining and association rule mining techniques were applied to the keywords of newspaper articles about Sungnyemun Gate as a cultural heritage from 2002 to 2016 to find major keywords and keyword association rules. Next, we analyzed typical keywords that appear frequently and specific keywords that appear only partially, depending on the period (before or after the fire) and the newspaper company. Through this research, the trends and keywords of newspaper articles related to Sungnyemun Gate can be understood, and the results can serve as fundamental data about Sungnyemun Gate for information producers and consumers.
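
As a rough illustration of the keyword association rules mentioned in this abstract, the sketch below computes support, confidence, and lift for keyword pairs. The sample keyword lists, the thresholds, and the function name `association_rules` are hypothetical and are not taken from the paper, which does not state which tool or settings were used.

```python
from itertools import combinations


def association_rules(docs, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) for keyword pairs.

    Each document is a list of keywords; only one-to-one rules are considered.
    """
    n = len(docs)
    item_count = {}
    pair_count = {}
    for keywords in docs:
        unique = set(keywords)
        for k in unique:
            item_count[k] = item_count.get(k, 0) + 1
        for a, b in combinations(sorted(unique), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), together in pair_count.items():
        support = together / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = together / item_count[x]
            lift = confidence / (item_count[y] / n)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence, lift))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical keyword sets, one per article.
articles = [
    ["fire", "restoration", "heritage"],
    ["fire", "restoration"],
    ["fire", "arson"],
    ["restoration", "heritage"],
]
rules = association_rules(articles)
for x, y, support, confidence, lift in rules:
    print(f"{x} -> {y}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two keywords co-occur more often than their individual frequencies would predict.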

A Topic Modeling Analysis for Online News Article Comments on Nurses' Workplace Bullying (간호사의 직장 내 괴롭힘 관련 온라인 뉴스기사 댓글에 대한 토픽 모델링 분석)

  • Kang, Jiyeon;Kim, Soogyeong;Roh, Seungkook
    • Journal of Korean Academy of Nursing / v.49 no.6 / pp.736-747 / 2019
  • Purpose: This study aimed to explore public opinion on workplace bullying in the nursing field by analyzing the keywords and topics of online news comments. Methods: This was a text-mining study that collected, processed, and analyzed text data. A total of 89,951 comments on 650 online news articles, reported between January 1, 2013 and July 31, 2018, were collected via web crawling. The collected unstructured text data were preprocessed, and keyword analysis and topic modeling were performed using R. Results: The 10 most important keywords were "work" (37121.7), "hospital" (25286.0), "patients" (24600.8), "woman" (24015.6), "physician" (20840.6), "trouble" (18539.4), "time" (17896.3), "money" (16379.9), "new nurses" (14056.8), and "salary" (13084.1). The 22,572 preprocessed keywords were categorized into four topics: "poor working environment", "culture among women", "unfair oppression", and "society-level solutions". Conclusion: Public interest in workplace bullying among nurses has continued to increase. The public agreed that a negative work environment and a nursing shortage could cause workplace bullying. They also considered nurse bullying a problem that should be resolved at a societal level. Further research is needed on nurse workplace bullying from a gender discrimination perspective and on the social value of nursing work.

Features of the Rural Revitalization Projects in Jang-su County Using LDA Topic Analysis of News Data - Focused on Keyword of Tourism and Livelihood - (뉴스데이터의 LDA 토픽 분석을 통한 장수군 농촌지역 활성화 사업의 특징 - 관광·생활 키워드를 중심으로 -)

  • Kim, Young-Jin;Son, Yong-hoon
    • Journal of Korean Society of Rural Planning / v.24 no.4 / pp.69-80 / 2018
  • In this study, we classified rural revitalization projects into types through text analysis of news data, and analyzed the main direction and characteristics of the projects. To examine which factors are emphasized among the issues related to rural revitalization, we used news data related to 'tourism' and 'livelihood', the main keywords of projects to promote rural areas. Text mining techniques were used in the analysis: topic modeling with the LDA technique was conducted on major projects under the 'tourism' and 'livelihood' keywords. On this basis, the projects carried out to revitalize rural areas were classified by topic. The analysis found that the topics were distributed across 11 sub-types (Tourism Promotion, Regional Specialization, Local Festival, Development of Regional Scale, Urban and Rural Exchange, Agricultural Support, Community Forest Management, Improve the Settlement Environment, General Welfare Service, Low Class Support, Others). Examining the characteristics of the rural revitalization projects confirmed that domestic projects are mainly tourism-oriented. To summarize, the government is creating projects to revitalize rural areas through the related ministries, and many projects are being carried out within a structure in which projects spread to the regions. It is understood that tourism- and welfare-oriented projects are being carried out in domestic rural revitalization. Therefore, to achieve the goal of rural revitalization, carrying out balanced projects that improve residents' settlement environment is expected to be effective.
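
To make the LDA step in this abstract more concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the standard library only. The toy documents, the hyperparameters `alpha` and `beta`, the topic count, and the function name `lda_gibbs` are all assumptions for illustration; the paper does not describe its implementation, and a real analysis would normally rely on an established library.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return doc-topic counts and top words."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    doc_topic = [[0] * num_topics for _ in docs]          # tokens per topic in each doc
    topic_word = [[0] * V for _ in range(num_topics)]     # word counts per topic
    topic_total = [0] * num_topics                        # total tokens per topic
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    top_words = [
        [vocab[i] for i in sorted(range(V), key=lambda i: -topic_word[k][i])[:3]]
        for k in range(num_topics)
    ]
    return doc_topic, top_words


# Hypothetical tokenized news snippets.
docs = [
    ["festival", "tourism", "visitor", "festival"],
    ["tourism", "visitor", "trail"],
    ["welfare", "housing", "resident"],
    ["housing", "resident", "welfare", "welfare"],
]
doc_topic, top_words = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

The sampler is stochastic, so the topics it finds depend on the seed and the number of iterations; labeling topics as project types remains a manual interpretation step.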

Topic Analysis of the "Right to be Forgotten" Using Text Mining (텍스트마이닝을 활용한 "잊힐 권리"의 토픽 분석)

  • Lee, So-Hyun;Koo, Bon-Jin
    • Journal of the Korean Society for Information Management / v.39 no.2 / pp.275-298 / 2022
  • This study examined the issues and characteristics that appeared in news and journal articles related to the 'right to be forgotten' using text mining. Data for analysis were collected from 2010 to 2020 with the keyword 'right to be forgotten', and keyword analysis and topic modeling were performed on the collected data. The results show that, over the last 10 years, the issues around the 'right to be forgotten' have not differed much between news and journal articles, and the approaches have also been similar. However, the comparison confirmed both common issues and partial differences between news and journal articles. Therefore, Archives and Records Management Studies needs to discuss the issues derived in this study. In particular, common issues should be considered first, and where the issues differ, they need to be discussed in various ways. This study is meaningful in that it helps to understand the meaning of the 'right to be forgotten' and to draw out issues that may arise in the future. The results will contribute to varied discussion of the 'right to be forgotten' in Archives and Records Management Studies.

Big Data Analysis on the Perception of Home Training According to the Implementation of COVID-19 Social Distancing

  • Hyun-Chang Keum;Kyung-Won Byun
    • International Journal of Internet, Broadcasting and Communication / v.15 no.3 / pp.211-218 / 2023
  • Due to the implementation of COVID-19 social distancing, interest in and the number of users of 'home training' are rapidly increasing. Therefore, the purpose of this study is to identify the perception of 'home training' through big data analysis of social media channels and to provide basic data to the related business sector. Big data were collected from various news and social content provided on Naver and Google. Data were collected for three years from March 22, 2020, when COVID-19 distancing was implemented in Korea. The collected data included 4,000 Naver blogs, 2,673 news, 4,000 cafes, 3,989 Knowledge iN, and 953 Google channel news. TF and TF-IDF were analyzed through text mining, and semantic network analysis was then conducted on 70 keywords. Big data analysis programs such as Textom and Ucinet were used for social big data analysis, and NetDraw was used for visualization. In the text mining analysis, 'home training' had the highest TF, at 4,045 occurrences, followed by 'exercise', 'Homt', 'house', 'apparatus', 'recommendation', and 'diet'. Regarding TF-IDF, the main keywords were 'exercise', 'apparatus', 'home', 'house', 'diet', 'recommendation', and 'mat'. Based on these results, 70 high-frequency keywords were extracted, and semantic indicators and centrality analysis were then conducted. Finally, CONCOR analysis clustered the keywords into a 'purchase cluster', 'equipment cluster', 'diet cluster', and 'execute method cluster'. From these four clusters, basic data for the 'home training' business sector were presented, based on consumers' main perceptions of 'home training' and the semantic network analysis.
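
The TF and TF-IDF measures named in this abstract can be sketched as follows. The exact weighting used by Textom is not stated in the abstract, so the formula here (corpus-wide term count multiplied by the log of inverse document frequency) is an assumption, as are the sample documents and the function name `tf_and_tfidf`.

```python
import math
from collections import Counter


def tf_and_tfidf(docs):
    """Return corpus-wide term frequency and a simple TF-IDF score per term.

    Assumed weighting: tfidf(t) = tf(t) * ln(N / df(t)), where N is the number
    of documents and df(t) is the number of documents containing t.
    """
    tf = Counter()
    df = Counter()
    for tokens in docs:
        tf.update(tokens)        # every occurrence counts toward TF
        df.update(set(tokens))   # each document counts once toward DF
    n = len(docs)
    tfidf = {term: tf[term] * math.log(n / df[term]) for term in tf}
    return tf, tfidf


# Hypothetical tokenized posts.
docs = [
    ["home", "training", "exercise", "mat"],
    ["home", "training", "diet"],
    ["home", "training", "exercise", "exercise"],
]
tf, tfidf = tf_and_tfidf(docs)
for term, score in sorted(tfidf.items(), key=lambda kv: -kv[1]):
    print(f"{term}: tf={tf[term]} tfidf={score:.3f}")
```

Under this particular formula, a term that occurs in every document receives a TF-IDF of zero regardless of how often it appears, which shows how the TF and TF-IDF rankings of the same corpus can diverge.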

A Study on News Graphic Design in Social Media (온라인 인포그래픽 뉴스의 커뮤니케이션에 관한 연구)

  • Won, Jongyoun
    • The Journal of the Korea Contents Association / v.19 no.12 / pp.57-67 / 2019
  • The way people read news is changing, from print to screen. In this study, we aimed to understand the impact on readers of using infographics in news. According to a study conducted by Reuters Research Institute at the University of Oxford in 2017, the proportion of online consumers of news is steadily increasing, with over 51 per cent of Americans receiving news via social media. Additionally, newspaper subscription rates are rapidly declining. According to previous studies, text information is understood better in print media than on screen. To compensate for this weakness in the comprehension of online news, online news media are providing infographic news services to deliver news effectively. To examine the impact of using infographics in the news, three experiments were conducted. The findings indicate that the use of infographics in news has a positive effect on users in terms of the variables measured, including cognitive effect and acceptance of news. Compared with print news, on-screen news was not as effective in terms of comprehension. We therefore propose interactive infographics, along with improved design, to enhance the communication effect.

An XML-based Multimedia News Management System (XML 기반 멀티미디어 뉴스 관리 시스템)

  • Kim Hyon Hee;Park Seung Soo
    • The KIPS Transactions:PartB / v.11B no.7 s.96 / pp.785-792 / 2004
  • With recent progress in multimedia computing technologies, it has become necessary to retrieve diverse types of multimedia data based on multimedia content and their relationships. However, unlike alphanumeric data, it is difficult to provide relevant multimedia information, because multimedia contents and their relationships are implicit in multimedia data. As a result, in multimedia news service systems, a representative multimedia application, most news services provide related news only for text articles, while retrieval of multimedia news such as video news or image news is provided independently. In this paper, we present an XML-based multimedia news management system, which provides integration, retrieval, and delivery of relevant multimedia news. Our data model, composed of media objects, relationship objects, and view objects, represents diverse types of multimedia news content and semantically related multimedia news. In addition, the proposed view mechanism makes it possible to customize multimedia news, and therefore provides multimedia news efficiently.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by traders' expectations, studies have sought to predict stock price movements by analyzing various sources of text data. This research has covered not only the relationship between text data and stock price fluctuations but also stock trading based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms to a term-document matrix, as in other text mining approaches. Because documents contain many words, it is better to select the words that contribute most when building the term-document matrix. Words with too low a frequency or importance are removed based on word frequency. Words are also selected according to their contribution, measured as the degree to which a word helps classify a document correctly. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words surrounding each selected neutral word and use them to generate the term-document matrix. The underlying idea is that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then fed to an algorithm that classifies stock price fluctuations. We first removed stop words and selected neutral words for each stock, and then excluded from the selected words those that also appear in news articles for other stocks.
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of news articles as training data and applied the model to the remaining month of articles to predict the next day's stock price movements. We used SVM, boosting, and random forest to build the models and predict stock price movements. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); the first 60 days served as the training set and the remaining 20 days as the test set. The proposed word-based algorithm showed better classification performance than the word selection method based on sparsity. This study predicted stock price fluctuations by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with that of the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine the words to extract. In other words, it removed not only the words that appeared in both the rise and fall categories but also the words that commonly appeared in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. The limitations of this study are that the prediction was set up only to classify rises and falls, and that the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance because stock price fluctuations and rates of return may differ.
Therefore, further research is needed using more stocks and predicting yields through trading simulation.
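
The idea of building the term-document matrix from the words around neutral words, together with the 60/20 chronological split, might look roughly like the sketch below. The window size, the sample sentences, the neutral word set, and the helper names `context_terms` and `term_document_matrix` are hypothetical; the abstract does not say how wide the context is or how neutral words are scored, so that selection step is not reproduced here.

```python
def context_terms(tokens, neutral_words, window=2):
    """Collect words within `window` positions of any neutral word.

    The neutral words themselves are dropped. Overlapping windows may add
    the same token more than once, which this sketch leaves as-is.
    """
    kept = []
    for i, token in enumerate(tokens):
        if token in neutral_words:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            kept.extend(t for t in tokens[start:end] if t not in neutral_words)
    return kept


def term_document_matrix(docs):
    """Return the sorted vocabulary and one row of term counts per document."""
    vocab = sorted({t for doc in docs for t in doc})
    matrix = [[doc.count(t) for t in vocab] for doc in docs]
    return vocab, matrix


# Hypothetical articles and a hypothetical neutral word set.
articles = [
    "samsung shares today surged after strong earnings report".split(),
    "analysts said today profit fell sharply amid weak demand".split(),
]
neutral_words = {"today"}

reduced = [context_terms(a, neutral_words, window=2) for a in articles]
vocab, matrix = term_document_matrix(reduced)
print(vocab)
print(matrix)

# Chronological split as described in the abstract: 80 trading days,
# the first 60 for training and the remaining 20 for testing.
trading_days = list(range(80))
train_days, test_days = trading_days[:60], trading_days[60:]
```

The rows of `matrix` would then serve as feature vectors for a classifier such as an SVM; splitting by date rather than at random keeps future articles out of the training set.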

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.93-107 / 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed with access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence, and compare these with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network. After that, we compared the results of issue clustering from SAS with those of network analysis. The experimental dataset came from a website ranking site and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles of 5,000 panels over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results applied the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), and are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data.
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
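
The network-merging phase described in this abstract, where a user-article network and an article-issue network are combined into a user-issue quasi-network, can be sketched as a sparse matrix product. Weighting each link by visit count times topic weight is an assumption made for illustration, as are the sample data and the function name `merge_two_mode`; the paper performs this step in NetMiner and the abstract does not give the weighting.

```python
def merge_two_mode(user_article, article_issue):
    """Merge two two-mode networks that share the article layer.

    user_article:  {user: {article: visit_count}}
    article_issue: {article: {issue: weight}}
    Returns {user: {issue: summed visit_count * weight}}.
    """
    user_issue = {}
    for user, visits in user_article.items():
        row = {}
        for article, count in visits.items():
            # Articles with no extracted issue contribute nothing.
            for issue, weight in article_issue.get(article, {}).items():
                row[issue] = row.get(issue, 0.0) + count * weight
        user_issue[user] = row
    return user_issue


# Hypothetical access logs and topic assignments.
user_article = {
    "u1": {"a1": 2, "a2": 1},
    "u2": {"a2": 3},
}
article_issue = {
    "a1": {"economy": 1.0},
    "a2": {"economy": 0.5, "sports": 0.5},
}
user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)
```

Issues could then be compared by their columns in `user_issue`, that is, by which users reach them and how strongly, which is the kind of input a clustering method such as PAM would take.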