• Title/Summary/Keyword: Pre-Processing

Search Result 2,013, Processing Time 0.023 seconds

CComparative evaluation of the methods of producing planar image results by using Q-Metrix method of SPECT/CT in Lung Perfusion Scan (Lung Perfusion scan에서 SPECT-CT의 Q-Metrix방법과 평면영상 결과 산출방법에 대한 비교평가)

  • Ha, Tae Hwan;Lim, Jung Jin;Do, Yong Ho;Cho, Sung Wook;Noh, Gyeong Woon
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.22 no.1
    • /
    • pp.90-97
    • /
    • 2018
  • Purpose The lung segment ratio which is obtained through quantitative analyses of lung perfusion scan images is calculated to evaluate the lung function pre and post surgery. In this Study, the planar image production methods by using Q-Metrix (GE Healthcare, USA) program capable of not only quantitative analysis but also computation of the segment ratio after having performed SPECT/CT are comparatively evaluated. Materials and Methods Lung perfusion scan and SPECT/CT were performed on 50 lung cancer patients prior to surgery who visited our hospital from May 1, 2015 to September 13, 2016 by using Discovery 670(GE Healthcare, USA) equipment. AP(Anterior Posterior)method that uses planar image divided the frontal and rear images into three rectangular portions by means of ROI tool while PO(Posterior Oblique)method computed the segment ratio by dividing the right lobe into three parts and the left lobe into two parts on the oblique image. Segment ratio was computed by setting the ROI and VOI in the CT image by using Q-Metrix program and statistically analysis was performed with SPSS Ver. 23. Results Regarding the correlation concordance rate of Q-Metrix and AP methods, RUL(Right upper lobe), RML(Right middle lobe) and RLL(Right lower lobe) were 0.224, 0.035 and 0.447. LUL(Left upper lobe) and LLL(Left lower lobe) were found to be 0.643 and 0.456, respectively. In the PO method, the right lobe were 0.663, 0.623 and 0.702, respectively, while the left lobe were 0.754 and 0.823. When comparison was made by using the Paired sample T-test, Right lobe were $11.6{\pm}4.5$, $26.9{\pm}6.2$ and $17.8{\pm}4.2$, respectively in the AP method. Left lobe were $28.4{\pm}4.8$ and $15.4{\pm}5.6$. The right lobe of PO had values of $17.4{\pm}5.0$, $10.5{\pm}3.6$ and $27.3{\pm}6.0$, while the left lobe had values of $21.6{\pm}4.8$ and $23.1{\pm}6.6$, thereby having statistically significant difference in comparison to the Q-Metrix method for each of the lobes (P<0.05). However, there was no statistically significant difference in Right middle lobe (P>0.05). Conclusion The AP method showed low concordance rate in correlation with the Q-Metrix method. However, PO method displayed high concordance rate overall. although AP method had significant differences in all lobes, there was no significant difference in Right middle lobe of PO method. Therefore, at the time of production of lung perfusion scan results, utilization of Q-Metrix method of SPECT/CT would be useful in computation of accurate resultant values. Moreover, it is deemed possible to expect obtain more practical sectional computation result values by using PO method at the time of planar image acquisition.

Aspect-Based Sentiment Analysis Using BERT: Developing Aspect Category Sentiment Classification Models (BERT를 활용한 속성기반 감성분석: 속성카테고리 감성분류 모델 개발)

  • Park, Hyun-jung;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.4
    • /
    • pp.1-25
    • /
    • 2020
  • Sentiment Analysis (SA) is a Natural Language Processing (NLP) task that analyzes the sentiments consumers or the public feel about an arbitrary object from written texts. Furthermore, Aspect-Based Sentiment Analysis (ABSA) is a fine-grained analysis of the sentiments towards each aspect of an object. Since having a more practical value in terms of business, ABSA is drawing attention from both academic and industrial organizations. When there is a review that says "The restaurant is expensive but the food is really fantastic", for example, the general SA evaluates the overall sentiment towards the 'restaurant' as 'positive', while ABSA identifies the restaurant's aspect 'price' as 'negative' and 'food' aspect as 'positive'. Thus, ABSA enables a more specific and effective marketing strategy. In order to perform ABSA, it is necessary to identify what are the aspect terms or aspect categories included in the text, and judge the sentiments towards them. Accordingly, there exist four main areas in ABSA; aspect term extraction, aspect category detection, Aspect Term Sentiment Classification (ATSC), and Aspect Category Sentiment Classification (ACSC). It is usually conducted by extracting aspect terms and then performing ATSC to analyze sentiments for the given aspect terms, or by extracting aspect categories and then performing ACSC to analyze sentiments for the given aspect category. Here, an aspect category is expressed in one or more aspect terms, or indirectly inferred by other words. In the preceding example sentence, 'price' and 'food' are both aspect categories, and the aspect category 'food' is expressed by the aspect term 'food' included in the review. If the review sentence includes 'pasta', 'steak', or 'grilled chicken special', these can all be aspect terms for the aspect category 'food'. As such, an aspect category referred to by one or more specific aspect terms is called an explicit aspect. On the other hand, the aspect category like 'price', which does not have any specific aspect terms but can be indirectly guessed with an emotional word 'expensive,' is called an implicit aspect. So far, the 'aspect category' has been used to avoid confusion about 'aspect term'. From now on, we will consider 'aspect category' and 'aspect' as the same concept and use the word 'aspect' more for convenience. And one thing to note is that ATSC analyzes the sentiment towards given aspect terms, so it deals only with explicit aspects, and ACSC treats not only explicit aspects but also implicit aspects. This study seeks to find answers to the following issues ignored in the previous studies when applying the BERT pre-trained language model to ACSC and derives superior ACSC models. First, is it more effective to reflect the output vector of tokens for aspect categories than to use only the final output vector of [CLS] token as a classification vector? Second, is there any performance difference between QA (Question Answering) and NLI (Natural Language Inference) types in the sentence-pair configuration of input data? Third, is there any performance difference according to the order of sentence including aspect category in the QA or NLI type sentence-pair configuration of input data? To achieve these research objectives, we implemented 12 ACSC models and conducted experiments on 4 English benchmark datasets. As a result, ACSC models that provide performance beyond the existing studies without expanding the training dataset were derived. In addition, it was found that it is more effective to reflect the output vector of the aspect category token than to use only the output vector for the [CLS] token as a classification vector. It was also found that QA type input generally provides better performance than NLI, and the order of the sentence with the aspect category in QA type is irrelevant with performance. There may be some differences depending on the characteristics of the dataset, but when using NLI type sentence-pair input, placing the sentence containing the aspect category second seems to provide better performance. The new methodology for designing the ACSC model used in this study could be similarly applied to other studies such as ATSC.

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.155-174
    • /
    • 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) have been published. The rapid increase in the number of papers related to COVID-19 is putting time and technical constraints on healthcare professionals and policy makers to quickly find important research. Therefore, in this study, we propose a method of extracting useful information from text data of extensive literature using LDA and Word2vec algorithm. Papers related to keywords to be searched were extracted from papers related to COVID-19, and detailed topics were identified. The data used the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic, updated weekly. The research methods are divided into two main categories. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of publications related to COVID-19 by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. LDA and Word2vec algorithm were used to derive research topics related to COVID-19, and after analyzing related words, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, and a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment' were extracted. did For each collected paper, detailed topics were analyzed using LDA and Word2vec algorithms, and a clustering method through PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy point from the results of this study is that the topics that were not derived from the topics derived for all papers being researched in relation to COVID-19 (

    ) were the topic modeling results for each research topic (
    ) was found to be derived from For example, as a result of topic modeling for papers related to 'vaccine', a new topic titled Topic 05 'neutralizing antibodies' was extracted. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and is said to play an important role in the production of therapeutic agents and vaccine development. In addition, as a result of extracting topics from papers related to 'treatment', a new topic called Topic 05 'cytokine' was discovered. A cytokine storm is when the immune cells of our body do not defend against attacks, but attack normal cells. Hidden topics that could not be found for the entire thesis were classified according to keywords, and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using the Skip-gram method that predicts the similar words as the central word among the Word2vec models. The combination of the LDA model and the Word2vec model tried to show better performance by identifying the relationship between the document and the LDA subject and the relationship between the Word2vec document. In addition, as a clustering method through PCA dimension reduction, a method for intuitively classifying documents by using the t-SNE technique to classify documents with similar themes and forming groups into a structured organization of documents was presented. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of academic papers related to COVID-19, it will reduce the precious time and effort of healthcare professionals and policy makers, and rapidly gain new insights. We hope to help you get It is also expected to be used as basic data for researchers to explore new research directions.


  • (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.