• Title/Summary/Keyword: Verification of News

The Study on Speaker Change Verification Using SNR based weighted KL distance (SNR 기반 가중 KL 거리를 활용한 화자 변화 검증에 관한 연구)

  • Cho, Joon-Beom;Lee, Ji-eun;Lee, Kyong-Rok
    • Journal of Convergence for Information Technology / v.7 no.6 / pp.159-166 / 2017
  • This paper reports experiments aimed at improving the verification performance of speaker change detection on broadcast news. The approach enhances the noisy input speech and applies a KL distance $D_s$ that uses an SNR-based weighting function $w_m$. The baseline (Experiment 0) is a speaker change verification system using the GMM-UBM based KL distance $D$. Experiment 1 adds enhancement of the noisy input speech using MMSE Log-STSA. Experiment 2 applies the new KL distance $D_s$ to the system of Experiment 1. All experiments were run at a 0% missed detection rate (MDR) so that no speaker change information would be lost. The false alarm rate (FAR) of Experiment 0 was 71.5%. The FAR of Experiment 1 was 67.3%, 4.2 percentage points lower than that of Experiment 0. The FAR of Experiment 2 was 60.7%, 10.8 percentage points lower than that of Experiment 0.
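
The operating point described in this abstract (FAR measured at 0% MDR) can be illustrated with a minimal Python sketch. It is not the authors' code. The function name, the toy scores, and the definition of FAR as "false candidates accepted divided by all false candidates" are assumptions made for illustration; the abstract does not give the exact formula.

```python
def far_at_zero_mdr(scores, is_change):
    """Return (threshold, FAR) at the operating point where no true change is missed.

    scores    : distance score per candidate change point (higher = more likely a change)
    is_change : True where the candidate is a real speaker change
    Hypothetical helper; the FAR definition used here is an assumption.
    """
    true_scores = [s for s, c in zip(scores, is_change) if c]
    false_scores = [s for s, c in zip(scores, is_change) if not c]
    if not true_scores or not false_scores:
        raise ValueError("need at least one true and one false candidate")
    # Largest threshold that still accepts every true change (MDR = 0%).
    threshold = min(true_scores)
    # False candidates that pass the threshold are false alarms.
    false_alarms = sum(1 for s in false_scores if s >= threshold)
    return threshold, false_alarms / len(false_scores)


# Toy data, invented for the example.
scores = [0.9, 0.2, 0.6, 0.4, 0.8, 0.1]
labels = [True, False, True, False, False, False]
threshold, far = far_at_zero_mdr(scores, labels)
print(threshold, far)
```

Fixing MDR at 0% forces the threshold down to the weakest true change, which is why the reported FAR values are high and why a better-separated distance measure lowers them.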

Quantitative Text Mining for Social Science: Analysis of Immigrant in the Articles (사회과학을 위한 양적 텍스트 마이닝: 이주, 이민 키워드 논문 및 언론기사 분석)

  • Yi, Soo-Jeong;Choi, Doo-Young
    • The Journal of the Korea Contents Association / v.20 no.5 / pp.118-127 / 2020
  • This paper introduces trends and methodological challenges in quantitative Korean text analysis, using as case studies academic and news media articles on "migration" and "immigration" published in 2017-2019. Quantitative text analysis, which is based on natural language processing (NLP) technology, has become an essential tool for social science. It is a part of data science that converts documents into structured data, performs hypothesis discovery and verification on that data, and visualizes the results. We also examined the statistical models commonly applied in social-scientific quantitative text analysis, using NLP with R programming and Quanteda.
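
The step of converting documents into structured data can be sketched as a document-feature matrix. The paper works in R with Quanteda; the Python below is only an illustrative stand-in, and the function name, the ASCII-only tokenizer, and the sample sentences are assumptions. Korean text would need a morphological analyzer instead of this regular expression.

```python
import re
from collections import Counter


def document_feature_matrix(docs):
    """Return (vocabulary, rows), where rows[i][j] counts vocabulary[j] in docs[i]."""
    # Naive tokenizer (assumption): lowercase ASCII letter runs only.
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[t] for t in vocab])
    return vocab, rows


# Invented sample documents.
docs = ["Migration policy and migration news", "Immigration news"]
vocab, dfm = document_feature_matrix(docs)
print(vocab)
for row in dfm:
    print(row)
```

Once the text is in this matrix form, standard statistical models can be applied to the counts.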

Analyzing the Effect of Characteristics of Dictionary on the Accuracy of Document Classifiers (용어 사전의 특성이 문서 분류 정확도에 미치는 영향 연구)

  • Jung, Haegang;Kim, Namgyu
    • Management & Information Systems Review / v.37 no.4 / pp.41-62 / 2018
  • As the volume of unstructured data from social media, Internet news articles, and blogs increases, text analysis is growing in importance and attracting more research. Since text analysis is mostly performed on a specific domain or topic, constructing and applying a domain-specific dictionary has become more important. The quality of the dictionary has a direct impact on the results of unstructured data analysis, and it matters all the more because the dictionary determines the perspective of the analysis. In the literature, most studies on text analysis have emphasized the importance of dictionaries for obtaining clean, high-quality results. However, the effects of dictionaries have not been rigorously verified, even though dictionaries are regarded as the most essential factor in text analysis. In this paper, we generate three dictionaries from 39,800 news articles in three ways: (1) a batch construction method that builds a dictionary from term frequencies over the entire document set; (2) a method that extracts terms by category and integrates them; and (3) a method that extracts features for each category and integrates them. We then analyze and verify the effect of each dictionary on document classification accuracy by defining the concept of Intrinsic Rate. We compared the accuracy of three artificial neural network-based document classifiers to evaluate the quality of the dictionaries. In the experiment, accuracy tended to increase when the Intrinsic Rate was high, which suggests that document classification accuracy can be improved by increasing the intrinsic rate of the dictionary.
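
The difference between the first two construction methods can be sketched in Python. This is a hedged illustration, not the authors' procedure. Ranking terms by raw frequency within each category is an assumption, since the abstract does not state the extraction criterion. The function names and toy documents are hypothetical. Method (3) and the Intrinsic Rate are not sketched because the abstract does not define them.

```python
from collections import Counter


def batch_dictionary(docs, size):
    """Method (1): top `size` terms by frequency over all documents."""
    counts = Counter(t for tokens, _category in docs for t in tokens)
    return {term for term, _ in counts.most_common(size)}


def per_category_dictionary(docs, size_per_category):
    """Method (2): top terms within each category, then merged (assumed ranking)."""
    per_category = {}
    for tokens, category in docs:
        per_category.setdefault(category, Counter()).update(tokens)
    merged = set()
    for counts in per_category.values():
        merged |= {term for term, _ in counts.most_common(size_per_category)}
    return merged


# Invented (tokens, category) pairs.
docs = [
    (["market", "stock", "market", "today"], "economy"),
    (["stock", "market", "rate"], "economy"),
    (["match", "goal", "match"], "sports"),
]
print(batch_dictionary(docs, 1))
print(per_category_dictionary(docs, 1))
```

With the toy data, the batch dictionary is dominated by the larger category, while the per-category dictionary keeps a term from each one.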

A Proposed Private Blockchain System for Preserving Evidence of False Internet Communications (인터넷 허위통신 신고의 증거물 보존을 위한 프라이빗 블록체인 시스템 제안)

  • Bae, Suk-Min;Yang, Seong-Ryul;Jung, Jai-Jin
    • Journal of Convergence for Information Technology / v.9 no.11 / pp.15-21 / 2019
  • Because it allows only authorized users to write to and query the ledger, private blockchain technology is attracting attention from institutions and companies. Because it is based on distributed ledger technology, its records are immutable. News on the Internet, by contrast, can be modified easily, so the possibility of manipulation is high. Some false communication report systems are designed to prevent such harm. However, in the gap between the time a false communication is reported and the time it is verified, the contents of the website can be modified, or false evidence can be submitted intentionally. We propose a system that collects evidence with a headless browser for more accurate false communication management and preserves that evidence securely in a private blockchain, preventing manipulation. The proposed system downloads the original HTML, captures the website as an image, stores both in a transaction along with the report, and records the transaction in a private blockchain to ensure integrity from acquisition to preservation of the evidence.
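
The integrity idea in this abstract (HTML, screenshot, and report bound together and chained) can be sketched with plain hash linking. This is a minimal illustration under stated assumptions, not the proposed system. A real private blockchain also handles access control and consensus, the headless-browser capture step is omitted, and all names and the byte strings below are hypothetical placeholders.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first block's predecessor


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_block(prev_hash, html: bytes, screenshot: bytes, report: str):
    """Bind evidence digests and the report to the previous block's hash."""
    record = {
        "prev_hash": prev_hash,
        "html_sha256": sha256_hex(html),
        "screenshot_sha256": sha256_hex(screenshot),
        "report": report,
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["block_hash"] = sha256_hex(payload)
    return record


def verify_chain(chain):
    """Recompute every hash; any edited field or broken link returns False."""
    prev = GENESIS
    for block in chain:
        body = {k: v for k, v in block.items() if k != "block_hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if block["prev_hash"] != prev or sha256_hex(payload) != block["block_hash"]:
            return False
        prev = block["block_hash"]
    return True


first = make_block(GENESIS, b"<html>original</html>", b"image-bytes-1", "report 1")
second = make_block(first["block_hash"], b"<html>other</html>", b"image-bytes-2", "report 2")
print(verify_chain([first, second]))
```

If the page is edited after the report, its new digest no longer matches the stored `html_sha256`, and altering the stored record itself invalidates the chain.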

Prediction of infectious diseases using multiple web data and LSTM (다중 웹 데이터와 LSTM을 사용한 전염병 예측)

  • Kim, Yeongha;Kim, Inhwan;Jang, Beakcheol
    • Journal of Internet Computing and Services / v.21 no.5 / pp.139-148 / 2020
  • Infectious diseases have long plagued mankind, and predicting and preventing them has been a major challenge. For this reason, many studies have tried to predict infectious diseases. Most early studies relied on epidemiological data from the Centers for Disease Control and Prevention (CDC), but because the CDC data was updated only once a week, it was difficult to predict the number of disease outbreaks in real time. With the emergence of various Internet media, studies have begun to predict the occurrence of infectious diseases from web data, and most of the studies we reviewed use a single source of web data. However, forecasting from a single web data source makes it difficult to collect large amounts of learning data and to make accurate predictions for recent outbreaks such as "COVID-19". We therefore aim to demonstrate experimentally that LSTM models using multiple web data sources predict the occurrence of infectious diseases more accurately than models using a single web data source, and to suggest models suitable for predicting infectious diseases. In this experiment, we predicted the occurrence of "Malaria" and "Epidemic-parotitis" using single web data models and the model we propose. A total of 104 weeks of NEWS, SNS, and search query data were collected, of which 75 weeks were used as learning data and 29 weeks as verification data. When the verification data was predicted with our proposed model and with the single web data models, the proposed model showed the highest Pearson correlation coefficients, 0.94 and 0.86, and the lowest RMSE values, 0.19 and 0.07.
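
The two evaluation measures named in this abstract, and the 75/29-week split, can be sketched in plain Python. This is an illustrative sketch only. The LSTM model is not reproduced, the helper names are hypothetical, and splitting the weeks in chronological order is an assumption the abstract does not confirm.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(predicted, observed):
    """Root mean squared error between predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# 104 weekly indices; the first 75 for learning, the last 29 for verification
# (chronological order assumed).
weeks = list(range(104))
learning_weeks, verification_weeks = weeks[:75], weeks[75:]
print(len(learning_weeks), len(verification_weeks))
```

A higher Pearson coefficient means the predicted curve follows the shape of the observed one, while a lower RMSE means the predicted values are closer in magnitude.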

A Study on Strategic Management of Native Advertisement (네이티브 광고의 전략적 관리방안에 관한 연구)

  • Son, Jeyoung;Kang, Inwon
    • Management & Information Systems Review / v.38 no.1 / pp.63-81 / 2019
  • Native ads are actively used to overcome the disadvantages of existing web advertisement forms such as banner ads, pop-up ads, and interstitial ads. Native advertising is considered a useful technique because it can reduce users' rejection and attract attention. In recent years, however, there has been a great deal of fake news and fake content in which articles or videos are turned into advertisements. The purpose of this study is to understand how firms can coordinate and control native advertisements in a rational way. For this analysis, we surveyed 308 social media users using quota sampling. The verification showed that the more negatively an advertisement is evaluated, the less persuasive it is and the more negative its impact on the website where it appears. In addition, this study examined the influence of negative stimulus factors on the qualitative performance of the firm. Source non-expert was found to have the highest effect on skepticism toward the ad. Platform overflow also has a direct effect on the evaluation of the website as well as on the negative evaluation of the advertisement. Moreover, this study provides concrete implications for market segmentation by verifying how the paths differ according to the level of website involvement.

Drought evaluation using unstructured data: a case study for Boryeong area (비정형 데이터를 활용한 가뭄평가 - 보령지역을 중심으로 -)

  • Jung, Jinhong;Park, Dong-Hyeok;Ahn, Jaehyun
    • Journal of Korea Water Resources Association / v.53 no.12 / pp.1203-1210 / 2020
  • Drought is caused by a combination of hydrological and meteorological factors, so it is difficult to assess drought events accurately, and various drought indices have been developed to interpret them quantitatively. However, the drought indices currently in use are calculated from the shortage of a single variable, so they cannot accurately identify drought events with complex causes: a shortage in a single variable may be judged a drought even when it is not one. Meanwhile, research on developing indices from unstructured data, which is widely used in big data analysis, is being carried out in other fields and has proven effective. In this study, we therefore calculate a drought index by combining unstructured data (news data) with the weather and hydrologic information (rainfall and dam inflow) used for existing drought indices, and we evaluate its usefulness for drought interpretation by verifying the calculated index. The Clayton Copula function was used to calculate the joint drought index, and its parameter was estimated by the calibration method. The analysis showed that the drought index combining unstructured data expresses the drought period properly compared to the existing drought indices (SPI, SDI). In addition, its ROC scores were higher than those of the existing drought indices, making it more useful for drought interpretation. The joint drought index calculated in this study is considered highly useful in that it complements the analytical limits of existing single-variable drought indices and makes good use of unstructured data.
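
The Clayton copula named in this abstract has the standard form C(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta) for theta > 0. A minimal Python sketch of that formula follows. The choice of marginals (for example, non-exceedance probabilities of rainfall and of a news-based series), the parameter value, and the function name are assumptions; the abstract gives neither the margins nor the calibrated parameter.

```python
def clayton_copula(u, v, theta):
    """Joint probability C(u, v) under a Clayton copula with parameter theta > 0."""
    if theta <= 0:
        raise ValueError("this sketch covers theta > 0 only")
    if not (0 < u <= 1 and 0 < v <= 1):
        raise ValueError("u and v must lie in (0, 1]")
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


# Illustrative inputs only: two marginal probabilities and an assumed theta.
print(clayton_copula(0.5, 0.5, 1.0))
```

The Clayton family puts its dependence in the lower tail, so jointly low values of the two variables raise the joint probability well above what independence would give.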

A Study on the Design Diagnostic Guideline in Crowdfunding for Makers (메이커스(Makers)를 위한 크라우드 펀딩 디자인 진단 가이드라인에 관한 연구)

  • Oh, In Kyun;Lee, Jang Woo
    • Korea Science and Art Forum / v.35 / pp.281-292 / 2018
  • Crowdfunding, also called social funding because it relies on SNS, helps early-stage start-up founders and makers raise money to produce their product ideas. Funding platforms have recently recorded high growth rates, and the Korean government has introduced various support policies for crowdfunding. Against this background, the purpose of this study is to develop a diagnostic design guideline for product-design-oriented makers. The research methods were a literature survey and expert interviews. The literature survey focused on Internet news and previous research. The expert interviews involved 10 specialists and were conducted in two rounds. The literature survey showed that the current guideline lacks design content and detail, and previous papers indicate that informational text and images are important factors in funding success. In the first interview, with seven experts, we drafted a design guideline for social funding with a two-step process and nine themes. In the second interview, the researchers and three professionals with evaluation experience verified and supplemented the draft, establishing a design guideline with a three-step process and eight themes. The design guideline for crowdfunding can be used by funding managers as well as by design makers. The results of this paper are expected to help prevent problems and contribute to the development of a healthy crowdfunding ecosystem.