• Title/Summary/Keyword: news text


Algorithm Design to Judge Fake News based on Bigdata and Artificial Intelligence

  • Kang, Jangmook;Lee, Sangwon
    • International Journal of Internet, Broadcasting and Communication / v.11 no.2 / pp.50-58 / 2019
  • The objective of this study is to design an algorithm that discriminates fake news in text-based news articles, together with an architecture that builds it into a system (an H/W configuration with Hadoop-based in-memory technology, and a Deep Learning S/W design for big data and SNS linkage). Based on learning data on actual news, the government will submit advanced "fake news" test data as a result and complete theoretical research based on it. The research is motivated by the social cost of rumors (including malicious comments) and written false news, amid a flood of fake news, false reports, rumors and stabbings, among other social challenges. In addition, fake news can distort normal communication channels, undermine mutual trust, and reduce social capital at the same time. The final purpose of the study is to extend it to cases where it is difficult to distinguish between false and exaggerated, fake and hypocritical, sincere and false, fraud and error, and truth and falsehood.

Big Data Analysis of News on Purchasing Second-hand Clothing and Second-hand Luxury Goods: Identification of Social Perception and Current Situation Using Text Mining (중고의류와 중고명품 구매 관련 언론 보도 빅데이터 분석: 텍스트마이닝을 활용한 사회적 인식과 현황 파악)

  • Hwa-Sook Yoo
    • Human Ecology Research / v.61 no.4 / pp.687-707 / 2023
  • This study aimed to inform the development of the future second-hand fashion market by examining the current situation through unstructured text data distributed as news articles related to 'purchase of second-hand clothing' and 'purchase of second-hand luxury goods'. Text-based unstructured data was collected on a daily basis from Naver news from January 1st to December 31st, 2022, using 'purchase of second-hand clothing' and 'purchase of second-hand luxury goods' as collection keywords. The data was analyzed using text mining, and the results are as follows. First, in terms of frequency, the volume of collected data on the purchase of second-hand luxury goods was almost four times that on the purchase of second-hand clothing, indicating that the purchase of second-hand luxury goods is receiving more social attention. Second, the data obtained with the two collection keywords shared some words but also contained distinct ones. Regarding second-hand clothing, words related to donations, sharing, and compensation sales were mainly mentioned, indicating that the purchase of second-hand clothing tends to be recognized as an eco-friendly transaction. For second-hand luxury goods, resale and controversies over genuine products in second-hand luxury transactions, second-hand trading platforms, and luxury brands were frequently mentioned. Third, as a result of clustering, data related to the purchase of second-hand clothing were divided into five groups, and data related to the purchase of second-hand luxury goods were divided into six groups.

Development of Education Materials as a Card News Format for Nutrition Management of Pregnant and Lactating Women (임신·수유부의 올바른 영양관리를 위한 카드뉴스 형식의 교육자료 개발)

  • Han, Young-Hee;Kim, Jung Hyun;Lee, Min Jun;Yoo, Taeksang;Hyun, Taisun
    • Korean Journal of Community Nutrition / v.22 no.3 / pp.248-258 / 2017
  • Objectives: The purpose of the study was to develop a series of education materials in a card news format to provide nutrition information for pregnant and lactating women. Methods: The materials were developed in seven steps. As a first step, the needs of pregnant and lactating women were assessed by reviewing scientific papers and existing education materials, and by interviewing a focus group. The second step was to construct the main categories and the topics of information. In step 3, a draft of the contents of each topic was developed based on the scientific evidence. In step 4, a draft of the card news was created by editors and designers, who edited the text and embedded images in the card news. In step 5, the text, images and sequences were reviewed by the members of the project team and nutrition experts to improve readability. In step 6, parts of the text, images or sequences of the card news were revised based on the reviews. In step 7, the card news was finalized and released online to the public. Results: A series of 26 card news items for pregnant and lactating women was developed. The series covered five categories: nutrition management, healthy food choices, food safety, favorites to avoid, and nutrition management in special conditions for pregnant and lactating women. Satisfaction with 7 topics of the card news was evaluated by 140 pregnant women, and more than 70% of the women were satisfied with the materials. Conclusions: The card news format materials developed in this study are innovative nutrition education tools, and can be downloaded from the homepage of the Ministry of Food and Drug Safety. These materials can be easily shared on social media by nutrition educators or by pregnant and lactating women.

Comparison of Industrial Mathematics Issues between Korea and the US Using Topic Modeling (토픽모델링을 활용한 한국과 미국의 산업수학 이슈 비교)

  • Kim, Sung-Yeun
    • The Journal of the Korea Contents Association / v.22 no.7 / pp.30-45 / 2022
  • This study used text mining to explore the issues of industrial mathematics in online news articles and online forums in Korea and the US, and compared the results. Text data about industrial mathematics were collected from news articles on Naver, a major portal site, and postings and replies on Clien as Korean sources, and from news articles by the New York Times and CNN as well as postings and replies on Reddit as US sources. Structural topic modeling analyses were performed, and the major results were as follows. First, news articles in Korea mainly dealt with the necessity of industrial mathematics and government support. In contrast, the news articles in the US focused more on the various fields in which industrial mathematics was utilized. Second, in Korea, the same number of issues with different topics were discussed in news articles and online forums, whereas in the US more issues were covered in news articles than in online forums. The study suggests academic implications for researchers and practical implications for the government for establishing industrial mathematics in Korea.

Neural Text Categorizer for Exclusive Text Categorization

  • Jo, Tae-Ho
    • Journal of Information Processing Systems / v.4 no.2 / pp.77-86 / 2008
  • This research proposes a new neural network for text categorization that uses a representation of documents other than numerical vectors. Since the proposed neural network is originally intended only for text categorization, it is called NTC (Neural Text Categorizer) in this research. Numerical vectors representing documents for text mining tasks inherently have two main problems: huge dimensionality and sparse distribution. Although many feature selection methods have been developed to address the first problem, the reduced dimension still remains large. If the dimension is reduced excessively by a feature selection method, the robustness of text categorization is degraded. Although SVM (Support Vector Machine) tolerates huge dimensionality, it does not tolerate the second problem. The goal of this research is to address the two problems at the same time by proposing a new representation of documents and a new neural network that takes this representation as its input.
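
The two problems named in this abstract, huge dimensionality and sparse distribution, can be seen even on a toy corpus. The following is a minimal sketch, not taken from the paper: the three example documents, the whitespace tokenization, and the function names are all assumptions made for illustration.

```python
def bag_of_words_vectors(documents):
    """Encode each document as a term-count vector over the shared vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    index = {term: i for i, term in enumerate(vocabulary)}
    vectors = []
    for tokens in tokenized:
        vector = [0] * len(vocabulary)
        for token in tokens:
            vector[index[token]] += 1
        vectors.append(vector)
    return vocabulary, vectors


def sparsity(vectors):
    """Fraction of entries that are zero across all vectors."""
    total = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for x in v if x == 0)
    return zeros / total if total else 0.0


# Hypothetical documents, invented for this illustration.
documents = [
    "stocks fall as oil prices rise",
    "team wins final after extra time",
    "new vaccine trial shows promise",
]
vocabulary, vectors = bag_of_words_vectors(documents)
print(len(vocabulary), round(sparsity(vectors), 2))
```

Each vector has one dimension per distinct term in the whole corpus, while a single document uses only a few of them, so both the dimension and the share of zeros grow as documents are added.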

Table based Matching Algorithm for Soft Categorization of News Articles in Reuter 21578

  • Jo, Tae-Ho
    • Journal of Korea Multimedia Society / v.11 no.6 / pp.875-882 / 2008
  • This research proposes an alternative to machine learning based approaches for text categorization. To use machine learning based approaches for any text mining task, documents must be encoded into numerical vectors, which causes two problems: huge dimensionality and sparse distribution. Although there are various text mining tasks such as text categorization, text clustering, and text summarization, the scope of this research is restricted to text categorization. The idea of this research is to avoid the two problems by encoding a document or documents into a table instead of numerical vectors. Therefore, the goal of this research is to improve the performance of text categorization by proposing approaches that are free from the two problems.


The Image of Ruralism in Korea through a Text Mining for Online News Media analysis (인터넷 뉴스 데이터 텍스트 분석을 통해 본 우리나라 농촌다움에 대한 이미지 연구)

  • Son, Yong-hoon;Kim, Young-jin
    • Journal of Korean Society of Rural Planning / v.25 no.4 / pp.13-26 / 2019
  • The rural areas in South Korea have changed rapidly in the process of national land development. Rural landscapes have become discoloured, and their attractiveness has decreased as cities have expanded. But the attractiveness and multifunctional values of rural areas have become more important in contemporary society around the world. In line with this social demand, conserving the rural landscape is of high priority, and the recovery of ruralism is required. This study tried to understand how the public image of ruralism in South Korea has been influenced by the news media. The study retrieved news articles from a web search portal using six keywords commonly used to refer to ruralism: 'rural landscape', 'rural community', 'rural tourism', 'rural life', 'rural amenity', and 'rural environment'. News data for each of the six keywords were collected for the periods 2004-05, 2007-08, 2012-13, and 2016-17. In the text mining analysis, the nouns with high degree centrality were identified, and the changes by period were examined. Then, LDA topic analysis was performed on the text datasets of the six keywords. As a result, the study found that the news articles focused on only a handful of issues such as 'poor rural living condition', 'regional or village improvement projects', 'rural tourism promotion projects', and 'other government support projects'. On the other hand, nouns related to virtues and values in the rural landscape appeared less often in news articles. These results have become more apparent in recent years. In the topic analysis, 35 topics were identified. 'Village development projects', 'rural tourism', and 'urban-rural exchange projects' appeared repeatedly across several keywords. Among the topics, there are also topics closely related to ruralism such as 'rural landscape conservation', 'eco-friendly rural areas', 'local amenity resources', 'public interest values of agriculture', and 'rural life and communities'. The study presented an image map showing ruralism in South Korea using a network map between all topics and keywords. At the end of the study, implications for Korean rural area policy and research directions were discussed.
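
The degree centrality step mentioned in this abstract can be sketched as follows. This is an illustrative sketch only: the abstract does not say how the noun network was built, so the article-level co-occurrence rule, the function name, and the toy noun lists below are all assumptions.

```python
from itertools import combinations


def degree_centrality(articles):
    """Degree centrality of each term in an article-level co-occurrence graph.

    Two terms are linked if they appear in the same article (an assumption);
    centrality is the number of distinct neighbours divided by (n - 1).
    """
    neighbours = {}
    for terms in articles:
        unique = set(terms)
        for term in unique:
            neighbours.setdefault(term, set())
        for a, b in combinations(sorted(unique), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {term: 0.0 for term in neighbours}
    return {term: len(adj) / (n - 1) for term, adj in neighbours.items()}


# Hypothetical, already-extracted nouns per article (not data from the study).
articles = [
    ["village", "tourism", "project"],
    ["village", "landscape"],
    ["tourism", "festival"],
]
scores = degree_centrality(articles)
for term, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
    print(term, round(score, 2))
```

Terms that co-occur with many different nouns rank highest, which is the sense in which a noun has "high degree centrality" in such a network.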

A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model (문서 요약 기법이 가짜 뉴스 탐지 모형에 미치는 영향에 관한 연구)

  • Shim, Jae-Seung;Won, Ha-Ram;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.201-220 / 2019
  • Fake news has emerged as a significant issue over the last few years, igniting discussions and research on how to solve this problem. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection research entails a form of document classification; thus, document classification techniques have been widely used in this type of research. However, document summarization techniques have been inconspicuous in this field. At the same time, automatic news summarization services have become popular, and a recent study found that the use of news summarized through abstractive summarization has strengthened the predictive performance of fake news detection models. Therefore, the need to study the integration of document summarization technology in the domestic news data environment has become evident. In order to examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we created a summarized news-based detection model. Finally, we compared our model with the full-text-based detection model. The study found that BPN (Back Propagation Neural Network) and SVM (Support Vector Machine) did not exhibit a large difference in performance; however, for DT (Decision Tree), the full-text-based model demonstrated a somewhat better performance. In the case of LR (Logistic Regression), our model exhibited superior performance. Nonetheless, the results did not show a statistically significant difference between our model and the full-text-based model. Therefore, when the summary is applied, at least the core information of the fake news is preserved, and the LR-based model can confirm the possibility of performance improvement. This study features an experimental application of extractive summarization in fake news detection research by employing various machine-learning algorithms. The study's limitations are, essentially, the relatively small amount of data and the lack of comparison between various summarization technologies. Therefore, an in-depth analysis that applies various analytical techniques to a larger data volume would be helpful in the future.

Analyzing Quotations in News Reporting from Western Foreign Press: Focusing on Evaluative Language

  • Ban, Hyun;Noh, Bokyung
    • International Journal of Advanced Culture Technology / v.4 no.3 / pp.62-68 / 2016
  • This study explores evaluative linguistic expressions in news reporting about the 2016 general election outcome in Korean newspapers. In particular, we examined, both quantitatively and qualitatively, the evaluative linguistic expressions quoted in Korean news stories from three Western news media (the New York Times, the Washington Post, and the BBC) in order to understand how journalists frame news stories to persuade news consumers to accept their ideologies. This is based on the assumption that quotation can be a tool for conveying ideologies to news consumers (van Dijk, 1988; Jullian, 2011). To achieve this purpose, we selected ten Korean newspapers that included quotations from the news stories of the three Western media and then analyzed the quoted expressions quantitatively and qualitatively. For the qualitative analysis, evaluative linguistic expressions were analyzed to examine the journalistic stances of the Western news stories, following Martin's (2003) appraisal theory. For the quantitative analysis, a word frequency analysis was conducted to determine the ratio of quoted words to the whole news texts in Korean newspapers. As a result, it was found that the news stories of the BBC and the Washington Post were quoted more frequently than those of the New York Times when journalists conveyed a neutral or positive attitude toward the election outcome, thus confirming that evaluative linguistic expressions were functionally employed to convey journalists' ideologies or stances to news readers.
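
The quantitative measure described here, the ratio of quoted words to the whole text, can be illustrated with a short sketch. It assumes that quotations are marked with straight double quotes and that words are whitespace-separated; the function name and the example sentence are invented for illustration and are not from the study.

```python
import re


def quoted_word_ratio(text):
    """Share of whitespace-separated words that fall inside double-quoted spans."""
    total = len(text.split())
    quoted = sum(len(span.split()) for span in re.findall(r'"([^"]*)"', text))
    return quoted / total if total else 0.0


# Hypothetical sentence, invented for this illustration.
article = 'The paper called the result "a stunning setback" for the ruling party.'
print(round(quoted_word_ratio(article), 2))
```

Real newspaper text would need handling for typographic quotation marks and for Korean tokenization, which this sketch leaves out.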

An Innovative Approach of Bangla Text Summarization by Introducing Pronoun Replacement and Improved Sentence Ranking

  • Haque, Md. Majharul;Pervin, Suraiya;Begum, Zerina
    • Journal of Information Processing Systems / v.13 no.4 / pp.752-777 / 2017
  • This paper proposes an automatic method to summarize Bangla news documents. In the proposed approach, pronoun replacement is accomplished for the first time to minimize dangling pronouns in the summary. After replacing pronouns, sentences are ranked using term frequency, sentence frequency, numerical figures and title words. If two sentences have at least 60% cosine similarity, the frequency of the larger sentence is increased, and the smaller sentence is removed to eliminate redundancy. Moreover, the first sentence is always included in the summary if it contains any title word. In Bangla text, numerical figures can be presented both in words and digits in a variety of forms. All these forms are identified to assess the importance of sentences. We have used a rule-based system in this approach together with a hidden Markov model and a Markov chain model. To explore the rules, we analyzed 3,000 Bangla news documents and studied some Bangla grammar books. A series of experiments was performed on 200 Bangla news documents and 600 summaries (3 summaries for each document). The evaluation results demonstrate the effectiveness of the proposed technique over the four latest methods.
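
The redundancy rule in this abstract (drop the smaller of two sentences whose cosine similarity is at least 60%, and raise the frequency of the larger one) can be sketched as below. This is a rough sketch under stated assumptions, not the authors' implementation: it uses whitespace token counts on English toy sentences, measures sentence size in tokens, and increments the frequency by one, none of which is specified in the abstract.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sentences using whitespace token counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def drop_redundant(sentences, threshold=0.6):
    """Return (sentence, frequency) pairs after removing near-duplicates.

    When a pair reaches the threshold, the shorter sentence is removed and the
    longer one's frequency is incremented by one (the increment is an assumption).
    """
    frequency = [1] * len(sentences)
    removed = set()
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if i in removed or j in removed:
                continue
            if cosine_similarity(sentences[i], sentences[j]) >= threshold:
                len_i = len(sentences[i].split())
                len_j = len(sentences[j].split())
                longer, shorter = (i, j) if len_i >= len_j else (j, i)
                frequency[longer] += 1
                removed.add(shorter)
    return [
        (sentences[k], frequency[k])
        for k in range(len(sentences))
        if k not in removed
    ]


# Hypothetical sentences, invented for this illustration.
sentences = [
    "the river flooded the town on monday",
    "the river flooded the town",
    "schools will reopen next week",
]
for sentence, freq in drop_redundant(sentences):
    print(freq, sentence)
```

In this toy input the first two sentences exceed the threshold, so the shorter one is dropped and the longer one is kept with a frequency of 2, while the unrelated third sentence is left untouched.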