• Title/Summary/Keyword: Korean news articles

Coreference Resolution for Korean Using Random Forests (랜덤 포레스트를 이용한 한국어 상호참조 해결)

  • Jeong, Seok-Won;Choi, MaengSik;Kim, HarkSoo
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.535-540 / 2016
  • Coreference resolution identifies mentions in documents and groups the mentions that refer to the same entity. It is an essential step for natural language processing applications such as information extraction, event tracking, and question answering. Recently, various coreference resolution models based on machine learning (ML) have been proposed. As is well known, these ML-based models need large training data sets that are manually annotated with coreferent mention tags. Unfortunately, we cannot find usable open data for training ML-based models in Korean. Therefore, we propose an efficient coreference resolution model that needs less training data than other ML-based models. The proposed model identifies coreferent mentions using random forests based on sieve-guided features. In experiments with baseball news articles, the proposed model achieved a CoNLL F1-score of 0.6678, higher than that of other ML-based models.

An Exploratory Study on the Policy for Facilitating of Health Behaviors Related to Particulate Matter: Using Topic and Semantic Network Analysis of Media Text (미세먼지 관련 건강행위 강화를 위한 정책의 탐색적 연구: 미디어 정보의 토픽 및 의미연결망 분석을 활용하여)

  • Byun, Hye Min;Park, You Jin;Yun, Eun Kyoung
    • Journal of Korean Academy of Nursing / v.51 no.1 / pp.68-79 / 2021
  • Purpose: This study aimed to analyze the content and structure of mass and social media related to particulate matter before and after the enforcement of the comprehensive countermeasures for particulate matter, derive nursing implications, and provide a basis for designing health policies. Methods: After crawling online news articles and posts on social networking sites from before and after policy enforcement with particulate matter as keywords, we conducted topic and semantic network analysis using TEXTOM, R, and UCINET 6. Results: In the topic analysis, 'behavior tips' was the common main topic in both media before and after policy enforcement. After policy enforcement, 'influence on health' disappeared from the main topics in mass media owing to increased reports about reduction measures and the government, whereas 'influence on health' appeared as a main topic in social media. However, semantic network analysis confirmed that social media had a larger number of nodes and links and lower centrality than mass media, leaving substantial information that was unstructured and not organically connected. Conclusion: An understanding of particulate matter policy and its implications for health, as well as of gaps in the needs and use of health information, should be integrated with leadership and support in nurses' care of vulnerable patients and in public health promotion.

Complaint-based Data Demands for Advancement of Environmental Impact Assessment (환경영향평가 고도화를 위한 평가항목별 민원기반 데이터 수요 도출 연구)

  • Choi, Yu-Young;Cho, Hyo-Jin;Hwang, Jin-Hoo;Kim, Yoon-Ji;Lim, No-Ol;Lee, Ji-Yeon;Lee, Jun-Hee;Sung, Min-Jun;Jeon, Seong-Woo;Sung, Hyun-Chan
    • Journal of the Korean Society of Environmental Restoration Technology / v.24 no.6 / pp.49-65 / 2021
  • Although the Environmental Impact Assessment (EIA) is continuously being advanced, the number of environmental disputes regarding it is still on the rise. To address this, it is necessary to analyze the accumulated complaint cases. By analyzing complaint cases, this study identifies matters that need to be improved in the existing EIA stages as well as various damages and conflicts that were not previously considered or predicted. In the process, we derived 'complaint-based data demands' that should be additionally examined to improve the EIA. To this end, a total of 348 news articles were collected by searching with combinations of 'environmental impact assessment' and a keyword for each of the six assessment groups. From the analysis of the collected data, a total of 54 complaint-based data demands were suggested. Among those were 15 items including 'impact of changes in seawater flow on water quality' in the category of water environment; 13 items including 'area of green buffer zone' in atmospheric environment; 10 items including 'impact of soundproof wall on wind corridor' in living environment; 8 items including 'expected number of users' in socioeconomic environment; 4 items including 'feasibility assessment of development site in terms of environmental and ecological aspects' in natural ecological environment; and 4 items including 'prediction of sediment runoff and damaged areas according to the increase in intensity and frequency of torrential rain' in land environment. Future research should pursue more systematic complaint collection and analysis, as well as specific provision methods regarding the stages, subjects, and forms of use, in order to apply the derived data demands in the actual EIA process. This study is expected to help advance the prediction and assessment stages of the EIA and to minimize environmental impact as well as social conflict in advance.

Analysis of news bigdata on 'Gather Town' using the Bigkinds system

  • Choi, Sui
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.53-61 / 2022
  • Generation MZ and the Metaverse have drawn great attention in recent years, owing to the Fourth Industrial Revolution and the development of a digital environment that blurs the boundary between reality and virtual reality. Generation MZ approaches information very differently from earlier generations and uses distinctive communication methods. In terms of learning, its members have different motivations, learning types, and skills, and they build relationships differently. Meanwhile, the Metaverse is drawing great attention as a teaching method that fits the traits of Generation MZ. Thus, the current research aimed to investigate how to increase the use of the Metaverse in educational technology. Specifically, this research examined the antecedents of the popularity of Gather Town, a Metaverse platform. Big data of news articles were collected and analyzed using the Bigkinds system provided by the Korea Press Foundation. The analysis revealed, first, a rapidly increasing trend in media exposure of Gather Town since July 2021. This suggests greater utilization of Gather Town in the field of education after the COVID-19 pandemic. Second, Word Association Analysis and Word Cloud Analysis showed high weights on education-related words such as 'remote', 'university', and 'freshman', while words like 'Metaverse', 'Metaverse platform', 'Covid19', and 'Avatar' were also emphasized. Third, Network Analysis extracted 'COVID19', 'Avatar', 'University student', 'career', and 'YouTube' as keywords. The findings also suggest the potential value of Gather Town as an educational tool during the COVID-19 pandemic. Therefore, this research will contribute to the application and utilization of Gather Town in the field of education.

Study on the Analysis of National Paralympics by Utilizing Social Big Data Text Mining (소셜 빅데이터 텍스트 마이닝을 활용한 전국장애인체육대회 분석 연구)

  • Kim, Dae kyung;Lee, Hyun Su
    • 한국체육학회지인문사회과학편 / v.55 no.6 / pp.801-810 / 2016
  • The purpose of the study was to conduct text mining of keywords related to the National Paralympics and to provide fundamental information that can be used to change the perception of people without disabilities toward disabilities and to promote the social participation of people with and without disabilities in the National Paralympics. Social big data regarding the National Paralympics were retrieved from news articles and blog postings identified by the search engines Naver, Daum, and Google. The data were then analyzed using R version 3.3.1. The analysis techniques were cloud analysis, correlation analysis, and social network analysis. The results were as follows. First, the news was mainly related to game results, sports events, team participation, and the host venue of the 33rd to 36th National Paralympics. Second, the search results about the 33rd to 36th National Paralympics from Naver, Daum, and Google were similar to one another. Third, the keywords 'National Paralympics', 'sports for the disabled', and 'sports' demonstrated high closeness centrality. Further, degree centrality and betweenness centrality were associated in keywords such as sports for all, participation, research, development, sports-disabled, research-disabled, sports for all-participation, disabled-participation, sports for all-disabled, and host-paralympics.

Corporate Social Responsibility (CSR) of Small Enterprises in Hospitality and Tourism Industry (환대관광산업 소규모기업 사회적 책임활동(CSR): 회사 홈페이지 커뮤니케이션 분석을 중심으로)

  • Ahn, Young-Joo
    • Journal of Distribution Science / v.15 no.7 / pp.73-83 / 2017
  • Purpose - The purpose of this paper is to explore the CSR activities of small enterprises in the hospitality and tourism industry in South Korea. Since previous research on CSR activities has focused largely on large enterprises, whereas small enterprises have received relatively less attention, this study aims to explore the characteristics of small enterprises in the hospitality and tourism industry and their CSR activities. Research design, data, and methodology - The population of interest for this study was social enterprises registered with the Korea Social Enterprise Promotion Agency (2016), and this registry was used to verify which enterprises hold social enterprise certification. From 1672 companies in total, the sampling frame was a database of 117 companies in the hospitality and tourism industry. This study investigates social enterprises' CSR activities on the companies' official websites (e.g., company reports, magazines, news articles, and interviews). The websites of the selected enterprises in the hospitality and tourism industry were analyzed for CSR activities using quantitative content analysis. All of the CSR activities of small social enterprises were classified into six dimensions based on stakeholder theory. Results - The findings of this study describe the characteristics of the 117 small social enterprises and their specific CSR initiatives. A total of eight main business lines were identified: 1) fair travel, 2) leisure/sports, 3) accommodation/camping, 4) medical tourism, 5) exhibitions/art events/cultural events, 6) leisure activities for vulnerable social groups, 7) Korean traditional culture, and 8) ecotourism/agricultural tourism. The CSR initiatives were classified into six dimensions: 1) environment, 2) employment, 3) multicultural families and vulnerable social groups, 4) local community, 5) economic prosperity, and 6) product. Conclusions - This study revealed distinctive examples of CSR initiatives by small enterprises in the hospitality and tourism industry. Small social enterprises participate in CSR activities mainly related to their own business lines. Moreover, these enterprises are more closely embedded in local community development, job creation and education for local residents and vulnerable social groups, and traditional heritage preservation. The findings of this study provide theoretical and practical implications, and they can contribute to enriching the CSR literature on small enterprises in the hospitality and tourism industry.

Trend Analysis using Spatial-Temporal Visualization of Event Information based on Social Media (소셜 미디어에 기반한 이벤트 정보의 시공간적 시각화를 통한 추이 분석)

  • Oh, Hyo-Jung;Yun, Bo-Hyun;Yoo, Cheol-Jung;Kim, Yong
    • Journal of Internet Computing and Services / v.15 no.6 / pp.65-75 / 2014
  • The main focus of this paper is to analyze trends in event information across a variety of mass media through graphical visualization along the axes of time and location. In particular, continuity analysis based on user-generated social media can reflect the social impact of a given event as time and location change, along with the direction of those changes. To reveal the characteristics of continuous events, we survey a data set collected from news articles and tweets over two years. Based on case studies on 'disease' and 'leisure', we verify the effectiveness and usefulness of our proposed method. Even for events that occurred during the same period, we show directional changes with high impact in social media, which reflect user interest, in comparison with fact-based continuous visualization results.

Dietary Education Support Act and Middle School Dietary Education - Focusing on the Dietary Section of the Revised 2007 Home Economics Textbooks (식생활교육지원법과 중학교 식생활교육 - 2007 개정 가정 교과서의 식생활 영역을 중심으로)

  • Kim, Ji-Hyun;Kim, Yoo-Kyung
    • Journal of Korean Home Economics Education Association / v.22 no.4 / pp.1-13 / 2010
  • The purpose of this study was to examine how the basic directions proposed in the Dietary Education Support Act were reflected in the dietary section of home economics textbooks for middle schools. Eleven different 2007 revised textbooks were considered in the study. It was found that, in terms of their organization and description, all of the textbooks generally reflected the basic directions of the Dietary Education Support Act well: formation of healthy dietary habits, promotion of dietary activities, practice of a green diet, preservation of the traditional diet, utilization of local food products, and so on. However, it was also revealed that the textbooks differed greatly in their treatment of visual materials such as figures, photos, and graphs, as well as news articles and interesting anecdotes.

Named Entity Recognition and Dictionary Construction for Korean Title: Books, Movies, Music and TV Programs (한국어 제목 개체명 인식 및 사전 구축: 도서, 영화, 음악, TV프로그램)

  • Park, Yongmin;Lee, Jae Sung
    • KIPS Transactions on Software and Data Engineering / v.3 no.7 / pp.285-292 / 2014
  • Named entity recognition is used to improve the performance of information retrieval systems, question answering systems, machine translation systems, and so on. The targets of named entity recognition are usually PLOs (persons, locations, and organizations). They are usually proper nouns or unregistered words, and traditional named entity recognizers use these characteristics to find named entity candidates. The titles of books, movies, and TV programs have different characteristics from PLO entities. They sometimes consist of multiple phrases, a whole sentence, or special characters. This makes it difficult to find the named entity candidates. In this paper, we propose a method to quickly extract title named entities from news articles and automatically build a named entity dictionary for the titles. For candidate identification, word phrases enclosed in special symbols in a sentence are first extracted and then verified by an SVM using feature words and their distances. For the classification of the extracted title candidates, an SVM is used with the mutual information of word contexts.
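
The first step described in this abstract (extracting phrases enclosed in special symbols as title candidates) can be sketched roughly as follows. This is only an illustration: the symbol inventory `SYMBOL_PAIRS` and the function name `extract_title_candidates` are assumptions, since the abstract does not list the symbols the authors used, and the SVM verification and classification steps are not reproduced here.

```python
import re

# Hypothetical set of paired symbols that often enclose titles in news text.
# The abstract does not state which symbols the paper actually uses.
SYMBOL_PAIRS = [("<", ">"), ("《", "》"), ("「", "」"), ("『", "』"), ('"', '"')]


def extract_title_candidates(sentence):
    """Return phrases enclosed in any of the paired symbols, in order of appearance."""
    found = []
    for open_sym, close_sym in SYMBOL_PAIRS:
        # Non-greedy match so that each pair encloses the shortest possible phrase.
        pattern = re.escape(open_sym) + r"(.+?)" + re.escape(close_sym)
        for match in re.finditer(pattern, sentence):
            text = match.group(1).strip()
            if text:
                found.append((match.start(), text))
    found.sort()  # order candidates by their position in the sentence
    return [text for _, text in found]


if __name__ == "__main__":
    print(extract_title_candidates("He read 《Toji》 and watched <Parasite> yesterday."))
```

In the pipeline the abstract describes, each candidate returned here would still need to be verified by a classifier, because enclosed phrases are not always titles.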

An Efficient Damage Information Extraction from Government Disaster Reports

  • Shin, Sungho;Hong, Seungkyun;Song, Sa-Kwang
    • Journal of Internet Computing and Services / v.18 no.6 / pp.55-63 / 2017
  • One of the purposes of Information Technology (IT) is to support the human response to natural and social problems, such as natural disasters and the spread of disease, and to improve the quality of human life. Climate change is occurring worldwide, natural disasters threaten the quality of life, and human safety is no longer guaranteed. IT must be able to support tasks related to disaster response, and more importantly, it should be used to predict and minimize future damage. In South Korea, data related to damage are collected by each local government and then aggregated by the federal government. These data are included in the disaster reports that the federal government discloses for each disaster case, but it is difficult to obtain the raw damage data even for research purposes. To obtain the data, information extraction may be applied to the disaster reports. In the field of information extraction, most extraction targets are web documents, commercial reports, SNS text, and so on. There is little research on information extraction from government disaster reports. They are mostly text, but the structure of each sentence is very different from that of news articles and commercial reports. The features of government disaster reports should therefore be carefully considered. In this paper, an information extraction method for South Korean government reports in the word format is presented. The method is based on patterns and dictionaries and provides some additional ideas for tokenizing the damage expressions in the text. The experiment yielded an F1 score of 80.2 on the test set. This is close to the state-of-the-art information extraction performance achieved before recent deep learning algorithms were applied.
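
As a rough illustration of what a pattern-and-dictionary approach to damage extraction can look like, consider the minimal sketch below. The dictionary `DAMAGE_TERMS`, the regular expression, and the English example sentence are all hypothetical. The abstract does not give the paper's patterns, dictionaries, or tokenization rules, which target Korean government reports.

```python
import re

# Hypothetical dictionary mapping damage expressions to coarse categories.
DAMAGE_TERMS = {
    "dead": "casualty",
    "missing": "casualty",
    "injured": "casualty",
    "flooded": "property",
    "destroyed": "property",
}

# Hypothetical pattern: <count> <affected target> <damage term>.
DAMAGE_PATTERN = re.compile(
    r"(\d[\d,]*)\s+(\w+)\s+(" + "|".join(DAMAGE_TERMS) + r")\b"
)


def extract_damage(text):
    """Return one record per '<count> <target> <damage term>' match in the text."""
    records = []
    for count, target, term in DAMAGE_PATTERN.findall(text):
        records.append(
            {
                "count": int(count.replace(",", "")),  # drop thousands separators
                "target": target,
                "damage": term,
                "category": DAMAGE_TERMS[term],  # dictionary lookup
            }
        )
    return records


if __name__ == "__main__":
    for record in extract_damage("Heavy rain left 3 people dead and 1,200 houses flooded."):
        print(record)
```

The pattern finds candidate spans and the dictionary normalizes what was found. A real system would need many more patterns and a tokenizer suited to the report format.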