• Title/Summary/Keyword: Recall and Precision

Search results: 715

The Blog Polarity Classification Technique using Opinion Mining (오피니언 마이닝을 활용한 블로그의 극성 분류 기법)

  • Lee, Jong-Hyuk;Lee, Won-Sang;Park, Jea-Won;Choi, Jae-Hyun
    • Journal of Digital Contents Society
    • /
    • v.15 no.4
    • /
    • pp.559-568
    • /
    • 2014
  • Previous polarity classification based on sentiment analysis relies on sentence rules derived from the rating points of product reviews. This approach is difficult to apply to blogs, which carry no review ratings; moreover, product reviews can be fabricated by paid commenters and site managers, so it is hard to find reliable product and store reviews. Given these problems, analyzing blogs, which contain personal and frank opinions, and classifying their polarity makes it possible to correctly understand opinions about a product or store. This paper proposes extracting high-frequency vocabulary from blogs across several domains, selecting topic words, and then applying sentiment analysis to classify the polarity of blog content. To evaluate the sentiment analysis, we use Precision, Recall, and F-Score, measurement indexes from the information retrieval field. The evaluation shows that the proposed sentiment analysis classifies polarity better than previous techniques based on sentence rules from product reviews.

Implementation of Content Based Color Image Retrieval System using Wavelet Transformation Method (웨블릿 변환기법을 이용한 내용기반 컬러영상 검색시스템 구현)

  • 송석진;이희봉;김효성;남기곤
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.40 no.1
    • /
    • pp.20-27
    • /
    • 2003
  • In this paper, we implement a content-based image retrieval system in which a user can choose a query region containing a desired object and retrieve similar objects from an image database. The query image is divided into hue components and gray components and then wavelet-transformed: hue features are extracted from the hue components through a color autocorrelogram and dispersion, and texture features are extracted from the gray components through an autocorrelogram and the GLCM. Retrieval then proceeds by comparing the similarity of these two feature sets against each database image, with a weight applied to each similarity value. Deriving features from two components rather than one compensates for the weaknesses of each, and the experimental results confirm improved recall and precision; the weighting further improves retrieval efficiency. In addition, the various features of database images are automatically indexed in a feature library, enabling rapid image retrieval.

A Fuzzy Retrieval System to Facilitate Associated Learning in Problem Banks (문제 은행에서 연상학습을 지원하는 퍼지 검색 시스템)

  • Choi, Jae-hun;Kim, ji-Suk;Cho, Gi-Hwan
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.4
    • /
    • pp.278-288
    • /
    • 2002
  • This paper presents the design and implementation of a fuzzy retrieval system that supports associated learning in problem banks. The system retrieves problems conceptually related to the specific semantics described by a user's query. In particular, it employs a fuzzy thesaurus that represents relationships between domain-dependent vocabularies as fuzzy degrees, and it keeps track of the characteristics of associated learning, which should guarantee high recall and acceptable precision for retrieval effectiveness. That is, since the thesaurus resolves the vocabulary-mismatch problem between query terms and document index terms, the system can effectively support the user's associated learning. Finally, we evaluate whether the fuzzy retrieval system is appropriate for associated learning in terms of its precision and recall rates.

FusionScan: accurate prediction of fusion genes from RNA-Seq data

  • Kim, Pora;Jang, Ye Eun;Lee, Sanghyuk
    • Genomics & Informatics
    • /
    • v.17 no.3
    • /
    • pp.26.1-26.12
    • /
    • 2019
  • Identification of fusion genes is of prominent importance in the cancer research field because of their potential as carcinogenic drivers. RNA sequencing (RNA-Seq) data have been the most useful source for identifying fusion transcripts. Although a number of algorithms have been developed thus far, most programs produce too many false positives, making experimental confirmation almost impossible; we still lack a reliable program that achieves high precision with a reasonable recall rate. Here, we present FusionScan, a highly optimized tool for predicting fusion transcripts from RNA-Seq data. We specifically search for split reads composed of intact exons at the fusion boundaries. Using 269 known fusion cases as the reference, we implemented various mapping and filtering strategies to remove false positives without discarding genuine fusions. In a performance test using three cell line datasets with validated fusion cases (NCI-H660, K562, and MCF-7), FusionScan outperformed other existing programs by a considerable margin, achieving precision and recall rates of 60% and 79%, respectively. Simulation tests also demonstrated that FusionScan recovered most true positives without producing an overwhelming number of false positives, regardless of sequencing depth and read length. The computation time was comparable to that of other leading tools. We also provide several curation aids to help users investigate the details of fusion candidates easily. We believe FusionScan will be a reliable, efficient, and convenient program for detecting fusion transcripts that meets the requirements of the clinical and experimental community. FusionScan is freely available at http://fusionscan.ewha.ac.kr/.

Connected Component-Based and Size-Independent Caption Extraction with Neural Networks (신경망을 이용한 자막 크기에 무관한 연결 객체 기반의 자막 추출)

  • Jung, Je-Hee;Yoon, Tae-Bok;Kim, Dong-Moon;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.7
    • /
    • pp.924-929
    • /
    • 2007
  • Captions appearing in images carry information related to those images, and methods for extracting this text have been developed accordingly. However, most existing methods apply only to captions with a fixed stroke width or height. We propose a method that handles captions of various sizes. Our method is based on connected components: edge pixels are detected and grouped into connected components, whose properties we analyze, and a neural network is built to discriminate components that contain captions from those that do not. Experimental data were collected from broadcast programs such as news, documentaries, and entertainment shows containing captions of various heights. The results are evaluated by two criteria: recall, the ratio of identified captions to all captions in the images, and precision, the ratio of true captions among the objects identified as captions. The experiments show that the proposed method efficiently extracts captions of various sizes.
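The recall and precision criteria defined in this abstract are the standard detection metrics used throughout these papers. A minimal sketch in Python; the counts below are illustrative, not from the paper:

```python
def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
    """Compute precision and recall from detection counts.

    precision = TP / (TP + FP): fraction of detected objects that are real captions.
    recall    = TP / (TP + FN): fraction of real captions that were detected.
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts: 90 captions found correctly, 10 false detections,
# 30 captions missed.
p, r = precision_recall(90, 10, 30)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.90, recall=0.75
```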

Histogram Matching Algorithm for Content-Based Image Retrieval (내용기반 영상검색을 위한 히스토그램 매칭 알고리즘)

  • You, Kang-Soo;Yoo, Gi-Hyoung;Kwak, Hoon-Sung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.1C
    • /
    • pp.45-52
    • /
    • 2008
  • In this paper, we describe the Perceptually Weighted Histogram (PWH) and Gaussian Weighted Histogram Intersection (GWHI) algorithms, which can give good results in image retrieval. However, these histogram methods are sensitive to lighting conditions, which alter an image's histogram; even two pictures with small differences in lighting are not easily matched. We therefore propose the Histogram Matching Algorithm (HMA), which overcomes the problem of images being changed by intensity or color in image retrieval and is insensitive to changes in lighting. In the experimental results, the proposed algorithm achieves up to 32% and 30% higher recall than the PWH and GWHI algorithms, respectively, and up to 38% and 34% higher precision. Our experiments thus show that the proposed algorithm exhibits little variation under changes in lighting.
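Both PWH and GWHI refine the classic histogram-intersection similarity, which compares two color histograms bin by bin. As a point of reference, here is a sketch of the plain, unweighted Swain-Ballard histogram intersection that these weighted variants build on (toy histograms, not the paper's data):

```python
import numpy as np

def histogram_intersection(h1: np.ndarray, h2: np.ndarray) -> float:
    """Plain histogram intersection: sum of bin-wise minima of two
    L1-normalized histograms. Ranges from 0 (disjoint) to 1 (identical)."""
    h1 = h1 / h1.sum()
    h2 = h2 / h2.sum()
    return float(np.minimum(h1, h2).sum())

# Two toy 4-bin color histograms with reversed color distributions.
a = np.array([4.0, 3.0, 2.0, 1.0])
b = np.array([1.0, 2.0, 3.0, 4.0])
print(histogram_intersection(a, a))  # 1.0
print(histogram_intersection(a, b))  # ≈ 0.6
```

The weighted variants in the paper replace the bare minimum with perceptually or Gaussian-weighted contributions per bin, which is what makes them (and HMA) differ in their sensitivity to lighting.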

Feature Extraction for Content-based Image Retrieval and Implementation of an Image Database Retrieval System (내용기반 영상 검색을 위한 특징 추출 및 영상 데이터베이스 검색 시스템 구현)

  • Kim, Jin-Ah;Lee, Seung-Hoon;Woo, Yong-Tae;Jung, Sung-Hwan
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.8
    • /
    • pp.1951-1959
    • /
    • 1998
  • In this paper, we propose an efficient feature extraction method for content-based image retrieval and implement an image retrieval system on an Oracle database. First, we extract color features from input images by a modified version of Stricker's method; these color features and an ART2 neural network are used for a rough classification of the images. Next, we extract texture features using the wavelet transform and perform a detailed classification on the roughly classified images from the previous step. Using the proposed feature extraction methods, we implement a practical image retrieval system with extended SQL statements on the relational database. The proposed system is implemented on the Oracle DBMS; in experiments with 200 sample images, it achieves retrieval rates of 90% recall and 81% precision.


Detection of Artificial Caption using Temporal and Spatial Information in Video (시·공간 정보를 이용한 동영상의 인공 캡션 검출)

  • Joo, SungIl;Weon, SunHee;Choi, HyungIl
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.1 no.2
    • /
    • pp.115-126
    • /
    • 2012
  • Artificial captions appearing in videos carry information related to the videos, and many methods for extracting them have been studied. Most traditional methods detect caption regions using a single frame, but video contains temporal as well as spatial information, so we propose a caption-region detection method that uses both. First, we build an improved Text Appearance Map and detect continuous candidate regions through matching between candidates. Second, we detect disappearing captions by applying a disappearance test to the candidate regions; when captions disappear, the caption regions are determined by a merging process that uses temporal and spatial information. Finally, we decide the final caption regions through artificial neural networks, using edge-direction histograms for verification. The proposed method was tested on many kinds of captions with a variety of sizes, shapes, and positions, and the results were evaluated in terms of recall and precision.

Arrhythmia Classification using GAN-based Over-Sampling Method and Combination Model of CNN-BLSTM (GAN 오버샘플링 기법과 CNN-BLSTM 결합 모델을 이용한 부정맥 분류)

  • Cho, Ik-Sung;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.10
    • /
    • pp.1490-1499
    • /
    • 2022
  • Arrhythmia is a condition in which the heart has an irregular rhythm or abnormal heart rate; early diagnosis and management are very important because it can cause stroke, cardiac arrest, or even death. In this paper, we propose arrhythmia classification using a hybrid CNN-BLSTM model. For this purpose, QRS features are detected from the noise-removed signal through pre-processing, and single-beat segments are extracted; a GAN oversampling technique is applied to solve the data-imbalance problem. CNN layers extract the arrhythmia patterns precisely, and their outputs serve as the input to the BLSTM. The weights are learned through deep learning, and the learned model is evaluated on validation data. To evaluate the performance of the proposed method, classification accuracy, precision, recall, and F1-score are compared using the MIT-BIH arrhythmia database; the achieved scores are 99.30%, 98.70%, 97.50%, and 98.06%, respectively.
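The F1-score reported in abstracts like this one is the harmonic mean of precision and recall. A quick sketch with illustrative inputs; note that when F1 is computed per class and then averaged, the result can differ slightly from plugging aggregate precision and recall into this formula:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; it penalizes an imbalance
    between the two more heavily than an arithmetic mean would."""
    return 2 * precision * recall / (precision + recall)

# Illustrative values, not the paper's results:
print(round(f1_score(0.95, 0.90), 4))  # 0.9243
```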

Data Efficient Image Classification for Retinal Disease Diagnosis (데이터 효율적 이미지 분류를 통한 안질환 진단)

  • Honggu Kang;Huigyu Yang;Moonseong Kim;Hyunseung Choo
    • Journal of Internet Computing and Services
    • /
    • v.25 no.3
    • /
    • pp.19-25
    • /
    • 2024
  • The worldwide trend of population aging is increasing the incidence of major retinal diseases that can lead to blindness, including glaucoma, cataract, and macular degeneration. Ophthalmology has therefore focused on diagnosing these hard-to-prevent diseases in order to reduce the rate of blindness. This study proposes a deep learning approach that accurately diagnoses ocular diseases in fundus photographs using less data than traditional methods. For this, Convolutional Neural Network (CNN) models capable of effective learning with limited data were selected to classify Conventional Fundus Images (CFI) from various ocular disease patients. The chosen CNN models demonstrated strong performance, achieving high accuracy, precision, recall, and F1-score. This approach reduces manual analysis by ophthalmologists, shortens consultation times, and provides consistent diagnostic results, making it an efficient and accurate diagnostic tool in the medical field.