• Title/Abstract/Keywords: media text

Search results: 825 items; processing time: 0.275 s

Text Extraction in HIS Color Space by Weighting Scheme

  • Le, Thi Khue Van;Lee, Gueesang
    • Smart Media Journal (스마트미디어저널) / Vol. 2, No. 1 / pp.31-36 / 2013
  • Robust and efficient text extraction is very important for the accuracy of Optical Character Recognition (OCR) systems. Natural scene images with degradations such as uneven illumination, perspective distortion, complex backgrounds, and multicolored text pose many challenges to computer vision tasks, especially text extraction. In this paper, we propose a method for extracting text from signboard images that combines the mean shift algorithm with a weighting scheme for hue and saturation in the HSI color space for clustering. The number of clusters is determined automatically by mean shift-based density estimation, in which local clusters are estimated by repeatedly searching for higher-density points in the feature vector space. The weighting scheme for hue and saturation is used to formulate a new distance measure in cylindrical coordinates for text extraction. Experimental results on various natural scene images are presented to demonstrate the effectiveness of our approach.

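The abstract above does not give the formula for its distance measure, so the following is only a minimal sketch of what a weighted hue/saturation distance in cylindrical coordinates could look like. The function name `weighted_hs_distance`, the weights `w_hue` and `w_sat`, and the choice to scale the hue difference by the mean saturation are all assumptions made for illustration, not the authors' method.

```python
import math


def weighted_hs_distance(h1, s1, h2, s2, w_hue=1.0, w_sat=1.0):
    """Hypothetical weighted distance between two (hue, saturation) points.

    Hue is an angle in degrees; saturation is assumed to lie in [0, 1].
    """
    # Smallest angular difference between the two hues, converted to radians,
    # so that 350 degrees and 10 degrees are treated as 20 degrees apart.
    dh = abs(h1 - h2) % 360.0
    dh = math.radians(min(dh, 360.0 - dh))
    # Assumed design choice: scale the hue difference by the mean saturation,
    # because hue carries little information near the achromatic axis.
    hue_term = w_hue * dh * (s1 + s2) / 2.0
    sat_term = w_sat * (s1 - s2)
    return math.hypot(hue_term, sat_term)
```

Raising `w_hue` relative to `w_sat` makes a clustering step separate pixels mainly by hue, and the reverse favors saturation; which balance suits signboard images is a question the abstract leaves to the paper itself.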

Text Mining and Visualization of Papers Reviews Using R Language

  • Li, Jiapei;Shin, Seong Yoon;Lee, Hyun Chang
    • Journal of information and communication convergence engineering / Vol. 15, No. 3 / pp.170-174 / 2017
  • Nowadays, people share and discuss scientific papers on social media such as Web 2.0, big data, online forums, blogs, Twitter, Facebook, and scholarly communities. In addition to a variety of metrics such as the numbers of citations, downloads, and recommendations, paper review text is also an effective resource for studying scientific impact. Social media tools improve the research process by recording a series of online scholarly behaviors. This paper studies the huge number of paper reviews generated on social media platforms in order to explore implicit information about research papers. We implemented text mining on review texts using the R language and present the results. We found that Zika virus was the research hotspot and that association research methods were widely used in 2016. We also mined the news reviews of one paper and derived the public opinion about it.

소셜 미디어에서 사용되는 한국어 정서 단어의 정서가, 활성화 차원 측정 (Measuring a Valence and Activation Dimension of Korean Emotion Terms using in Social Media)

  • 이신영;고일주
    • Science of Emotion and Sensibility (감성과학) / Vol. 16, No. 2 / pp.167-176 / 2013
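  • With the rapid development of social media, user-generated text data is increasing sharply. Opinion mining analyzes such user text to extract users' opinions. In particular, sentiment analysis, a subfield of opinion mining, mainly aims to extract users' emotions from text, and this requires building an emotion word list. In this paper, we built an emotion word list for sentiment analysis of social media using text from Facebook, a representative social media platform. After collecting data from Facebook text, we selected emotion words and measured their valence and activation dimensions through a survey. As a result, we built a list of 267 emotion words that includes the valence and activation dimensions.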

Media coverage of the conflicts over the 4th Industrial Revolution in the Republic of Korea from 2016 to 2020: a text-mining approach

  • Yang, Jiseong;Kim, Byungjun;Lee, Wonjae
    • Asian Journal of Innovation and Policy / Vol. 11, No. 2 / pp.202-221 / 2022
  • The media has depicted an abrupt socio-technological change in the Republic of Korea with the 4th Industrial Revolution. Because technologies cannot realize their potential without social acceptance, studying the conflicts incurred by such a change is imperative. However, little literature has focused on conflicts caused by technologies. Therefore, the current study investigated media coverage of conflicts related to the 4th Industrial Revolution from 2016 to 2020 in the Republic of Korea, applying text-mining techniques. We found that the overall amount and pattern of coverage conform to the issue attention cycle. Also, the three major topics ("SMEs & Startups," "Mobility Conflict," and "Human & Technology") indicate quarrels between conflicting social entities. Moreover, the temporal change in media coverage implies a political rather than technological use of the term. However, we also found deliberative discussion in the media of the socio-technological impact. This study is significant because it expands the discussion of media coverage of technologies to the realm of social conflicts. Furthermore, we explored the news articles of the recent five years with a text-mining approach that enhanced the objectivity of the research.

대불호텔의 건축사적 고찰 (A Study on architectural historic of Hotel DIABUTSU)

  • 손장원;조희라
    • Journal of the Korean Digital Architecture Interior Association (한국디지털건축인테리어학회논문집) / Vol. 11, No. 3 / pp.27-34 / 2011
  • The DIABUTSU hotel was the first hotel built in Korea, and it is commonly held to have been built in 1888. However, this account raises many questions, and this study was conducted to uncover the truth. Non-text media are useful sources for such research, but they are rarely used in Korea, so this study relies on non-text media. The findings show that the DIABUTSU hotel was built in 1884 as a Japanese-style two-story wooden building. HORI provided hospitality there, and many foreigners stayed. Underwood, Appenzeller, and Carles stayed at this hotel and wrote about it in 1885. The three-story building has been regarded as the first hotel, but this is incorrect: the first hotel was the Japanese-style wooden building built in 1884.

Machine Printed and Handwritten Text Discrimination in Korean Document Images

  • Trieu, Son Tung;Lee, Guee Sang
    • Smart Media Journal (스마트미디어저널) / Vol. 5, No. 3 / pp.30-34 / 2016
  • Nowadays, there are many Korean documents whose text often needs to be identified as either printed or handwritten. Early identification methods use structural features, which are simple and easy to apply to text of a specific font, but their performance depends on the font type and characteristics of the text. Recently, the bag-of-words model has been used for this identification, as it can be invariant to changes in font size, distortions, or modifications to the text. A method based on the bag-of-words model includes three steps: word segmentation using connected component grouping, feature extraction, and finally classification using an SVM (Support Vector Machine). In this paper, a bag-of-words method using SURF (Speeded Up Robust Features) is proposed for identifying machine-printed and handwritten text in Korean documents. The experiment shows that the proposed method outperforms methods based on structural features.
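
The abstract above names the bag-of-words model but does not say how local features are encoded, so the following is only a minimal sketch of the usual quantization step: each local descriptor is assigned to its nearest codeword and the word image is summarized as a normalized histogram. The function name `bow_histogram` and the toy 2-dimensional vectors are assumptions for illustration (real SURF descriptors have 64 or 128 dimensions), and the codebook would normally come from clustering training descriptors.

```python
def bow_histogram(descriptors, codebook):
    """Encode local descriptors as a normalized bag-of-visual-words histogram.

    descriptors: list of equal-length numeric tuples (one per keypoint).
    codebook: list of codewords with the same dimensionality.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the codeword with the smallest squared Euclidean distance.
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    # Normalizing makes the histogram independent of the number of keypoints.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy example with a hypothetical two-word codebook.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.8), (0.2, 0.1)]
histogram = bow_histogram(descriptors, codebook)
```

Because every word image maps to a vector of the same length regardless of how many keypoints it contains, such histograms can be passed directly to a classifier such as an SVM.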

텍스트 마이닝을 활용한 매스 미디어와 소셜 미디어 의제 분석 : '마스크 5부제'를 중심으로 (Mass Media and Social Media Agenda Analysis Using Text Mining : focused on '5-day Rotation Mask Distribution System')

  • 이새미;유승의;안순재
    • The Journal of the Korea Contents Association (한국콘텐츠학회논문지) / Vol. 20, No. 6 / pp.460-469 / 2020
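  • This study analyzed online news articles and cafe posts about the '5-day rotation mask distribution system,' which recently emerged as an issue due to the COVID-19 outbreak, to identify the mass media and social media agendas that reflect the reactions of the press and the public, and examined the differences between them. For the analysis, 2,096 full-text Naver news articles and 1,840 cafe posts were collected and, after data preprocessing and cleaning, analyzed using word frequency analysis, word clouds, and LDA topic modeling. The results showed that, compared with mass media, social media featured topics related to everyday life, such as 'proxy purchase,' 'postponement of school opening,' 'mask use,' and 'mask purchase.' This reflects the characteristics of personal media and indicates that social media serves to exchange personal opinions, emotions, and information rather than to deliver information. By enabling the analysis of various media, the research method applied in this study can serve as a reference in the policy agenda-setting process, in which social issues become public agendas and evolve into government agendas.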

Major concerns regarding food services based on news media reports during the COVID-19 outbreak using the topic modeling approach

  • Yoon, Hyejin;Kim, Taejin;Kim, Chang-Sik;Kim, Namgyu
    • Nutrition Research and Practice / Vol. 15, Suppl. 1 / pp.110-121 / 2021
  • BACKGROUND/OBJECTIVES: Coronavirus disease 2019 (COVID-19) cases were first reported in December 2019, in China, and an increasing number of cases have since been detected all over the world. The purpose of this study was to collect significant news media reports on food services during the COVID-19 crisis and identify public communication and significant concerns regarding COVID-19 for suggesting future directions for the food industry and services. SUBJECTS/METHODS: News articles pertaining to food services were extracted from the home pages of major news media websites such as BBC, CNN, and Fox News between March 2020 and February 2021. The retrieved data was sorted and analyzed using Python software. RESULTS: The results of text analytics were presented in the format of the topic label and category for individual topics. The food and health category presented the effects of the COVID-19 pandemic on food and health, such as an increase in delivery services. The policy category was indicative of a change in government policy. The lifestyle change category addressed topics such as an increase in social media usage. CONCLUSIONS: This study is the first to analyze major news media (i.e., BBC, CNN, and Fox News) data related to food services in the context of the COVID-19 pandemic. Text analytics research on the food services domain revealed different categories such as food and health, policy, and lifestyle change. Therefore, this study contributes to the body of knowledge on food services research, through the use of text analytics to elicit findings from media sources.

조현병 관련 주요 일간지 기사에 대한 텍스트 마이닝 분석 (Text-Mining Analyses of News Articles on Schizophrenia)

  • 남희정;류승형
    • Korean Journal of Schizophrenia Research (대한조현병학회지) / Vol. 23, No. 2 / pp.58-64 / 2020
  • Objectives: In this study, we conducted an exploratory analysis of current media trends on schizophrenia using text-mining methods. Methods: First, web-crawling techniques extracted text data from 575 news articles in 10 major newspapers between 2018 and 2019, which were selected by searching for "schizophrenia" in Naver News. We developed a document-term matrix (DTM) and/or term-document matrix (TDM) through pre-processing techniques. Using the DTM and TDM, frequency analysis, co-occurrence network analysis, and topic model analysis were conducted. Results: Frequency analysis showed that keywords such as "police," "mental illness," "admission," "patient," "crime," "apartment," "lethal weapon," "treatment," "Jinju," and "residents" were frequently mentioned in news articles on schizophrenia. Within the article text, many of these keywords were highly correlated with the term "schizophrenia" and were also interconnected with each other in the co-occurrence network. The latent Dirichlet allocation model presented 10 topics comprising a combination of keywords: "police-Jinju," "hospital-admission," "research-finding," "care-center," "schizophrenia-symptom," "society-issue," "family-mind," "woman-school," and "disabled-facilities." Conclusion: The results of the present study highlight that in recent years, the media has been reporting violence in patients with schizophrenia, thereby raising an important issue of hospitalization and community management of patients with schizophrenia.
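
As a rough illustration of the document-term matrix and co-occurrence counting mentioned in the abstract above, the following sketch builds both from already-tokenized documents. The function names and the three toy documents are invented for illustration; they reuse a few keywords from the abstract but are not data from the study, and the study's own preprocessing is not described here.

```python
from collections import Counter
from itertools import combinations


def build_dtm(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = [Counter(doc) for doc in docs]
    return vocab, [[c[t] for t in vocab] for c in counts]


def cooccurrence(docs):
    """Count, for each unordered term pair, the documents containing both."""
    pairs = Counter()
    for doc in docs:
        # Sorting the unique terms gives each pair one canonical ordering.
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs


# Toy, made-up documents (already tokenized).
docs = [
    ["police", "patient", "crime"],
    ["patient", "treatment"],
    ["police", "crime"],
]
vocab, dtm = build_dtm(docs)
pairs = cooccurrence(docs)
```

Transposing `dtm` gives the term-document matrix, and the pair counts can serve as edge weights in a co-occurrence network.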

Analysis of Social Media Utilization based on Big Data-Focusing on the Chinese Government Weibo

  • Li, Xiang;Guo, Xiaoqin;Kim, Soo Kyun;Lee, Hyukku
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 8 / pp.2571-2586 / 2022
  • The rapid rise in popularity of government social media has generated huge amounts of text data, and the analysis of these data has gradually become a focus of digital government research. This study uses the Python language to analyze big data from the Chinese provincial government Weibo accounts. First, this study uses a web crawler to collect and statistically describe over 360,000 data items from 31 provincial government microblogs in China, covering the period from January 2018 to April 2022. Second, a word segmentation engine is constructed, and these text data are analyzed using word cloud word frequencies as well as semantic relationships. Finally, the text data are analyzed for sentiment using natural language processing methods, and the text topics are studied using the LDA algorithm. The results of this study show that, first, the number and scale of posts on the Chinese government Weibo have grown rapidly. Second, government Weibo has certain social attributes, and epidemics, people's livelihood, and services have become its focus. Third, negative sentiment accounts for more than 30% of government Weibo content. The classified topics show that epidemics and epidemic prevention and control overshadowed the other topics, which inhibits the diversification of government Weibo.