
Major concerns regarding food services based on news media reports during the COVID-19 outbreak using the topic modeling approach

  • Yoon, Hyejin; Kim, Taejin; Kim, Chang-Sik; Kim, Namgyu
    • Nutrition Research and Practice / v.15 no.sup1 / pp.110-121 / 2021
  • BACKGROUND/OBJECTIVES: Coronavirus disease 2019 (COVID-19) cases were first reported in December 2019 in China, and an increasing number of cases have since been detected all over the world. The purpose of this study was to collect significant news media reports on food services during the COVID-19 crisis and to identify public communication and significant concerns regarding COVID-19, in order to suggest future directions for the food industry and services. SUBJECTS/METHODS: News articles pertaining to food services were extracted from the home pages of major news media websites such as BBC, CNN, and Fox News between March 2020 and February 2021. The retrieved data were sorted and analyzed using Python software. RESULTS: The results of the text analytics were presented as a topic label and category for each topic. The food and health category presented the effects of the COVID-19 pandemic on food and health, such as an increase in delivery services. The policy category was indicative of changes in government policy. The lifestyle change category addressed topics such as an increase in social media usage. CONCLUSIONS: This study is the first to analyze major news media (i.e., BBC, CNN, and Fox News) data related to food services in the context of the COVID-19 pandemic. Text analytics research on the food services domain revealed distinct categories such as food and health, policy, and lifestyle change. This study therefore contributes to the body of knowledge on food services research through the use of text analytics to elicit findings from media sources.
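
The abstract notes only that the retrieved articles were analyzed with Python; the exact pipeline is not described. A minimal sketch of the kind of topic-modeling step involved, assuming an LDA model and scikit-learn (the sample articles below are hypothetical stand-ins for the scraped news text):

```python
# Minimal sketch of topic modeling on news text, assuming an LDA-style
# pipeline; the paper's exact preprocessing is not given.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

articles = [  # hypothetical stand-ins for the collected news articles
    "restaurants shift to delivery services during lockdown",
    "government policy restricts indoor dining hours",
    "social media usage rises as consumers cook at home",
]

# Bag-of-words representation of the corpus
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(articles)

# Fit LDA and print the top terms of each topic as an aid for labeling
lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(X)
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {k}: {', '.join(top)}")
```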

Generating a Korean Sentiment Lexicon Through Sentiment Score Propagation (감정점수의 전파를 통한 한국어 감정사전 생성)

  • Park, Ho-Min; Kim, Chang-Hyun; Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.9 no.2 / pp.53-60 / 2020
  • Sentiment analysis is the automated process of understanding attitudes and opinions about a given topic from written or spoken text. One sentiment analysis approach is the dictionary-based approach, in which a sentiment dictionary plays a very important role. In this paper, we propose a method to automatically generate a Korean sentiment lexicon from the well-known English sentiment lexicon called VADER (Valence Aware Dictionary and sEntiment Reasoner). The proposed method consists of three steps. The first step is to build a Korean-English bilingual lexicon using a Korean-English parallel corpus; the bilingual lexicon is a set of pairs between VADER sentiment words and Korean morphemes, the latter serving as candidate Korean sentiment words. The second step is to construct a bilingual word graph using the bilingual lexicon. The third step is to run a label propagation algorithm over the bilingual graph: a new Korean sentiment lexicon is generated by repeatedly applying the propagation algorithm until the values of all vertices converge. Empirically, the dictionary-based sentiment classifier using the Korean sentiment lexicon outperforms machine learning-based approaches on the KMU sentiment corpus and the Naver sentiment corpus. In the future, we will apply the proposed approach to generate multilingual sentiment lexica.
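
A minimal sketch of the propagation step described above, assuming a simple neighbor-averaging label propagation with clamped seed scores; the words, edges, and VADER valences below are illustrative, not the paper's data:

```python
# Minimal sketch of sentiment-score propagation over a bilingual word
# graph; node names, edges, and seed scores are hypothetical.
import numpy as np

nodes = ["good", "bad", "좋다", "나쁘다"]   # VADER words + Korean morphemes
seed = {"good": 1.9, "bad": -2.5}           # seed valence scores (assumed)

# Edges from a (hypothetical) Korean-English bilingual lexicon
edges = [("good", "좋다"), ("bad", "나쁘다")]
idx = {w: i for i, w in enumerate(nodes)}
W = np.zeros((len(nodes), len(nodes)))
for a, b in edges:
    W[idx[a], idx[b]] = W[idx[b], idx[a]] = 1.0

scores = np.array([seed.get(w, 0.0) for w in nodes])
clamped = np.array([w in seed for w in nodes])

# Propagate neighbor averages until scores converge; seeds stay clamped
for _ in range(100):
    new = W @ scores / np.maximum(W.sum(axis=1), 1e-12)
    new[clamped] = scores[clamped]
    if np.allclose(new, scores, atol=1e-6):
        break
    scores = new

print(dict(zip(nodes, scores.round(3))))
```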

A Circle Labeling Scheme without Re-labeling for Dynamically Updatable XML Data (동적으로 갱신가능한 XML 데이터에서 레이블 재작성하지 않는 원형 레이블링 방법)

  • Kim, Jin-Young; Park, Seog
    • Journal of KIISE:Databases / v.36 no.2 / pp.150-167 / 2009
  • XML has become the new standard for storing, exchanging, and publishing data over both the internet and the ubiquitous data stream environment. As demand for efficient handling of XML documents grows, labeling schemes have become an important topic in data storage. Recently proposed labeling schemes reflect the dynamic XML environment, which itself motivates the search for an efficient labeling scheme. However, previously proposed labeling schemes have several problems: 1) inserting a new node into the XML document triggers re-labeling of pre-existing nodes, and 2) they need large memory space to store the full labels. In this paper, we introduce a new labeling scheme called the Circle Labeling Scheme (CLS). In CLS, XML documents are represented in a circular form, and efficient storage of labels is supported by the concepts of Rotation Number and Parent Circle/Child Circle. The concept of Radius is applied to support the insertion of new nodes at arbitrary positions in the tree. This eliminates the need to re-label existing nodes and to increase label length, and mitigates conflicts with existing labels. A detailed experimental study demonstrates the efficiency of CLS.
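
The abstract does not spell out the label encoding. As a toy illustration (not the paper's exact scheme) of how circle-style labels can avoid re-labeling, a newly inserted sibling can take a rotation number strictly between its neighbors', so existing labels never change:

```python
# Toy illustration (not the paper's exact scheme): a node inserted
# between two siblings takes the midpoint rotation number, so the
# labels of existing nodes are never rewritten.
from fractions import Fraction

def insert_between(left_rotation, right_rotation):
    """Return a rotation number strictly between two sibling rotations."""
    return (left_rotation + right_rotation) / 2

# Siblings at rotations 1 and 2 on their parent circle (hypothetical)
a, b = Fraction(1), Fraction(2)
c = insert_between(a, b)   # new sibling, no re-labeling of a or b
print(a, c, b)             # 1 3/2 2 -- document order preserved
```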

IPC Multi-label Classification based on Functional Characteristics of Fields in Patent Documents (특허문서 필드의 기능적 특성을 활용한 IPC 다중 레이블 분류)

  • Lim, Sora; Kwon, YongJin
    • Journal of Internet Computing and Services / v.18 no.1 / pp.77-88 / 2017
  • Recently, with the advent of a knowledge-based society in which information and knowledge create value, patents, the representative form of intellectual property, have become important, and their number keeps growing. It is therefore necessary to classify patents appropriately according to the technological topic of the invention in order to use the vast amount of patent information effectively. The IPC (International Patent Classification) is widely used for this purpose. Research on automatic IPC classification using data mining and machine learning algorithms has been conducted to improve the current practice of categorizing patent documents by hand. However, most previous studies have focused on applying various existing machine learning methods to patent documents rather than considering the characteristics of the data or the structure of the documents. In this paper, we therefore propose to use two structural fields, the technical field and the background, which are considered to influence patent classification; the two fields were selected based on the characteristics of patent documents and the roles of the structural fields. We also construct a multi-label classification model to reflect the fact that a patent document can have multiple IPCs. Furthermore, we classify patent documents at the IPC subclass level, which comprises 630 categories, to investigate the applicability of the IPC multi-label classification model in practice. The effect of the structural fields is examined using 564,793 registered patents in Korea, and 87.2% precision is obtained when using the title, abstract, claims, technical field, and background. From these results, we verify that the technical field and background play an important role in improving the precision of IPC multi-label classification at the IPC subclass level.
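
A minimal sketch of the multi-label setup described above, using a generic TF-IDF plus one-vs-rest classifier from scikit-learn rather than the paper's exact model; the documents and IPC subclasses below are hypothetical:

```python
# Minimal sketch of multi-label classification over concatenated patent
# fields (title/abstract/claims/technical field/background); documents
# and IPC subclass labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

docs = [
    "battery electrode coating method lithium cell background",
    "wireless channel estimation antenna array technical field",
]
labels = [["H01M"], ["H04B", "H04L"]]  # a patent may carry several IPCs

# Encode label sets as a binary indicator matrix
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

X = TfidfVectorizer().fit_transform(docs)

# One binary classifier per IPC subclass (one-vs-rest)
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
pred = clf.predict(X)
print(mlb.inverse_transform(pred))
```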

Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics

  • Chen, YongHeng; Zhang, Fuquan; Zuo, WanLi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.392-412 / 2018
  • Due to the semantic gap across different modalities, automatic retrieval of multimedia information still faces a major challenge. It is desirable to provide an effective joint model to bridge the gap and organize the relationships between modalities. In this work, we develop a deep image annotation and classification by fusing multi-modal semantic topics (DAC_mmst) model, which can discover visual and non-visual topics by jointly modeling an image and its loosely related text for deep image annotation, while simultaneously learning and predicting the class label. More specifically, DAC_mmst relies on a non-parametric Bayesian model to estimate the number of visual topics that best explains the image. To evaluate the effectiveness of our proposed algorithm, we collect a real-world dataset and conduct various experiments. The experimental results show that DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared with several state-of-the-art methods.
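
The non-parametric Bayesian estimation of the topic number mentioned above can be pictured with a hierarchical Dirichlet process. A minimal sketch using gensim's HdpModel, which infers the number of topics from the data; the toy tag documents are hypothetical, and this stands in for, rather than reproduces, DAC_mmst:

```python
# Minimal sketch of non-parametric topic-number estimation with an HDP
# model (gensim); the toy "documents" of image tags are hypothetical.
from gensim.corpora import Dictionary
from gensim.models import HdpModel

docs = [["beach", "sea", "sand"], ["city", "street", "car"],
        ["sea", "boat", "beach"], ["car", "road", "city"]]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]

# HDP infers how many topics the data support rather than fixing K
hdp = HdpModel(corpus, id2word=dictionary, random_state=0)
for topic_id, words in hdp.print_topics(num_topics=5, num_words=3):
    print(topic_id, words)
```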

Graphene Coated Optical Fiber SPR Biosensor

  • Kim, Jang Ah; Hwang, Taehyun; Dugasani, Sreekantha Reddy; Kulkarni, Atul; Park, Sung Ha; Kim, Taesung
    • Proceedings of the Korean Vacuum Society Conference / 2014.02a / pp.401-401 / 2014
  • In this study, graphene, one of the most attractive materials today, has been applied to a wavelength-modulated surface plasmon resonance (SPR) sensor. Optical fiber sensor technology is a fascinating topic because of its several benefits. In addition, the SPR phenomenon enables label-free, highly sensitive, and accurate detection of biomaterials. Therefore, the optical fiber SPR sensor has powerful advantages for detecting biomaterials. Meanwhile, graphene shows superior mechanical, electrical, and optical characteristics, giving it tremendous potential for a wide range of applications. In particular, graphene supports tighter plasmon confinement and relatively long propagation distances, so it can enhance light-matter interactions (F. H. L. Koppens, et al., Nano Lett., 2011). Accordingly, we coated graphene on the optical fiber probe that we fabricated to compose the wavelength-modulated SPR sensor (Figure 1). The graphene film was synthesized via a thermal chemical vapor deposition (CVD) process. The synthesized graphene was transferred onto the core-exposed region of the optical fiber by a lift-off method. The detected analytes were a biotinylated double-crossover DNA structure (DXB) and streptavidin (SA), serving as a ligand-receptor binding model. The preliminary results showed SPR signal shifts for DXB and SA binding rather than for the concentration change.

Learning Discriminative Fisher Kernel for Image Retrieval

  • Wang, Bin; Li, Xiong; Liu, Yuncai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.3 / pp.522-538 / 2013
  • Content-based image retrieval has become an increasingly important research topic owing to its wide application. It is highly challenging when facing a large-scale database with large variance. Retrieval systems rely on a key component, the predefined or learned similarity measure over images. We note that similarity measures can potentially be improved if the data distribution information is exploited in a more sophisticated way. In this paper, we propose a similarity measure learning approach for image retrieval. The similarity measure, the so-called Fisher kernel, is derived from the probabilistic distribution of images and is a function over observed data, hidden variables, and model parameters, where the hidden variables encode high-level information that is powerful for discrimination but was not exploited in previous methods. We further propose a discriminative learning method for the similarity measure, i.e., encouraging the learned similarity to take a large value for a pair of images with the same label and a small value for a pair of images with distinct labels. The learned similarity measure, fully exploiting the data distribution, is well adapted to the dataset and improves the retrieval system. We evaluate the proposed method on the Corel-1000, Corel5k, Caltech101, and MIRFlickr 25,000 databases. The results show the competitive performance of the proposed method.
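
For reference, the Fisher kernel the abstract builds on has a standard textbook form; the paper's variant additionally involves hidden variables and discriminative learning, which are not shown here:

```latex
% Fisher score of an image x under a generative model p(x | \theta),
% and the Fisher kernel built from it (standard form).
U_x = \nabla_{\theta} \log p(x \mid \theta), \qquad
K(x, y) = U_x^{\top} I^{-1} U_y, \qquad
I = \mathbb{E}_x\!\left[ U_x U_x^{\top} \right]
```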

Regulatory innovation for expansion of indications and pediatric drug development

  • Park, Min Soo
    • Translational and Clinical Pharmacology / v.26 no.4 / pp.155-159 / 2018
  • For regulatory approval of a new drug, the most preferred and reliable source of evidence is the randomized controlled trial (RCT). However, a great number of drugs, both under development and already marketed and in use, usually lack proper indications for children. It is imperative to develop properly evaluated drugs for children, and expanding the use of already approved drugs to other indications will benefit patients and society. Nevertheless, for approval of expanded indications, often based on off-label experience with approved drugs, or for the development of pediatric indications either during or after the main drug development, conducting RCTs may not be the only, or even the right, way to proceed. Extrapolation strategies and modeling and simulation for pediatric drug development are paving the road to a better approval scheme. Making use of data sources other than RCTs, such as EHR and claims data, in ways that improve the efficiency and validity of the results (e.g., randomized pragmatic trials and randomized registry trials) has been a topic of great interest all around the world. Regulatory authorities should adopt new methodologies for regulatory approval processes to adapt to the changes brought about by the increasing availability of big and real-world data, utilizing new tools of technological advancement.

A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami; Kim, Jaeseok; Kim, Gi-Nam; Heo, Jong-Uk; On, Byung-Won; Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues such as unemployment, economic crisis, and social welfare that urgently need to be solved in modern society, the existing approach is for researchers to collect opinions from professional experts and scholars through online or offline surveys. However, such a method is not always effective. Due to the expense, a large number of survey replies are seldom gathered, and in some cases it is hard to find professionals dealing with specific social issues. Thus, the sample set is often small and may be biased. Furthermore, regarding a given social issue, several experts may reach totally different conclusions because each has his or her own subjective point of view and background; in this case, it is considerably hard to figure out what the current social issues are and which are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects social issue keywords representing social issues and problems from about 1.3 million news articles issued by about 10 major domestic presses in Korea from June 2009 to July 2012. Our proposed system consists of (1) collecting and extracting texts from the collected news articles, (2) identifying only news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models; its goal is to best match paragraphs to each topic. Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA yields a set of topic clusters, and each topic cluster is labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 "Unemployment Problem". In this example, it is non-trivial to understand what happened to the unemployment problem in our society; in other words, looking only at social keywords, we have no idea of the detailed events occurring in society. To tackle this matter, we develop a matching algorithm that computes the probability of a paragraph given a topic, relying on (i) the topic terms and (ii) their probability values. For instance, given a set of text documents, we segment each document into paragraphs; meanwhile, using LDA, we extract a set of topics from the documents. Based on our matching process, each paragraph is assigned to the topic it best matches, so each topic ends up with several best-matched paragraphs. Furthermore, suppose there are a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., "Up to 300 workers lost their jobs in XXX company in Seoul"). In this case, we can grasp detailed information about the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, most researchers will be able to detect social issues easily and quickly. Through this prototype system, we have detected various social issues appearing in our society and have shown the effectiveness of our proposed methods in our experimental results. Note that you can also use our proof-of-concept system at http://dslab.snu.ac.kr/demo.html.
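
A minimal sketch of the paragraph-to-topic matching idea: score each paragraph by the average log-probability of its tokens under a topic's term distribution, smoothing for out-of-vocabulary words. The topic and paragraphs below are hypothetical, and this scoring is a simplification of the paper's generative matching algorithm:

```python
# Minimal sketch of matching paragraphs to an LDA topic by the
# length-normalized log-probability of their tokens under the topic's
# term distribution; topic and paragraphs are hypothetical.
import math

topic = {"unemployment": 0.4, "layoff": 0.3, "business": 0.3}
paragraphs = [
    "up to 300 workers lost their jobs in a layoff amid unemployment",
    "the festival drew large crowds to the riverside park",
]

def log_score(paragraph, topic, smoothing=1e-6):
    """Average log term probability under the topic (smoothed for OOV)."""
    tokens = paragraph.split()
    return sum(math.log(topic.get(t, smoothing)) for t in tokens) / len(tokens)

# The paragraph with the highest score is the topic's best match
best = max(paragraphs, key=lambda p: log_score(p, topic))
print(best)
```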

A Review of SERS for Biomaterials Analysis Using Metal Nanoparticles (바이오 물질 분석을 위한 금속 나노입자를 이용한 SERS 분석 연구동향)

  • Jang, Eue-Soon
    • Ceramist / v.22 no.3 / pp.281-300 / 2019
  • Surface-enhanced Raman scattering (SERS) was first discovered in 1974 by M. Fleischmann's group, through an unexpected increase in the Raman signal from pyridine adsorbed on rough Ag electrode surfaces. M. Moskovits' group suggested that this phenomenon could be caused by surface plasmon resonance (SPR), a collective oscillation of free electrons at the surface of metal nanostructures driven by an external light source. About 40 years on, SERS has attracted great attention as a biomolecule analysis technology, and in recent years more than 2,500 new papers and 500 review papers related to SERS have been published each year. The advantages of biomaterial analysis using SERS are as follows: ① molecular-level analysis is possible based on the unique fingerprint information of a biomolecule; ② there is no photo-bleaching of the Raman reporters, allowing long-term monitoring of biomaterials compared with fluorescence microscopy; ③ the SERS peak bandwidth is approximately 10 to 100 times narrower than the fluorescence emission of organic phosphors or quantum dots, resulting in higher analysis accuracy; ④ a single excitation wavelength allows analysis of various biomaterials; ⑤ by utilizing near-infrared (NIR) SERS-active nanostructures and NIR excitation lasers, auto-fluorescence noise in the visible wavelength range can be avoided in in vivo experiments, and light damage to living cells can be minimized compared with visible lasers; and ⑥ the weak Raman signal of the water molecule makes it easy to analyze biomaterials in aqueous solutions. For these reasons, SERS is attracting attention as a next-generation non-invasive medical diagnostic technique as well as a substance analysis method. In this review, the principles of SERS and the principles of various biomaterial analyses using SERS will be introduced through recent research papers.