• Title/Summary/Keyword: False Positives

Informatics for protein identification by tandem mass spectrometry; Focused on two most-widely applied algorithms, Mascot and SEQUEST

  • Sohn, Chang-Ho;Jung, Jin-Woo;Kang, Gum-Yong;Kim, Kwang-Pyo
    • Bioinformatics and Biosystems / v.1 no.2 / pp.89-94 / 2006
  • Mass spectrometry (MS) is widely applied for high-throughput proteomics analysis. Large-scale proteome analysis experiments generate massive amounts of data. To search these proteomics data against protein databases, fully automated database search algorithms, such as Mascot and SEQUEST, are routinely employed. At present, it is critical to reduce false positives and false negatives during such analysis. In this review, we focus on automated protein identification using tandem mass spectrometry (MS/MS) spectra and on validation of the protein identifications produced by the two most common automated protein identification algorithms, Mascot and SEQUEST.

Classification of Human Papillomavirus (HPV) Risk Type via Text Mining

  • Park, Seong-Bae;Hwang, Sohyun;Zhang, Byoung-Tak
    • Genomics & Informatics / v.1 no.2 / pp.80-86 / 2003
  • Human Papillomavirus (HPV) infection is known as the main factor for cervical cancer, which is a leading cause of cancer deaths in women worldwide. Because there are more than 100 types of HPV, it is critical to discriminate the HPVs related to cervical cancer from those not related to it. In this paper, the risk type of HPVs is classified using their textual explanation. The important issue in this problem is to distinguish false negatives from false positives. That is, we must find as many high-risk HPVs as possible, though we may miss some low-risk HPVs. For this purpose, AdaCost, a cost-sensitive learner, is adopted to consider different costs between training examples. The experimental results on the HPV sequence database show that the consideration of costs gives higher performance. The improvement in F-score is higher than that in accuracy, which implies that the number of high-risk HPVs found is increased.

Whole genome sequencing based noninvasive prenatal test

  • Cho, Eun-Hae
    • Journal of Genetic Medicine / v.12 no.2 / pp.61-65 / 2015
  • Whole genome sequencing (WGS)-based noninvasive prenatal test (NIPT) is the first of the various NIPT techniques to have been applied in the clinical setting. Several companies, such as Sequenom, BGI, and Illumina, offer WGS-based NIPT, each with different technical and bioinformatic approaches. Sequenom, BGI, and Illumina utilize z-, t-, and L-scores, as well as normalized chromosome values, respectively, for trisomy detection. Their outstanding performance has been demonstrated in clinical studies of more than 100,000 pregnancies. The sensitivity and specificity for detection of trisomies 13, 18, and 21 were above 98%, as reported by all three companies. Unlike other techniques, WGS-based NIPT can detect other trisomies as well as clinically significant segmental duplications/deletions within a chromosome, which could expand the scope of NIPT. Incorrect results could be due to low fetal fraction, fetoplacental mosaicism, confined placental mosaicism, or maternal copy number variation (CNV). Among those, maternal CNV is a significant contributor to false-positive results, and therefore genome-wide scanning plays an important role in preventing false positives. In this article, the bioinformatic techniques and clinical performance of the three major companies are comprehensively reviewed.
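
As a rough illustration of the z-score idea named in the abstract, the sketch below compares one sample's chromosome read fraction with a set of euploid reference samples. The function names, the reference values, and the cutoff of 3 are illustrative assumptions; they are not taken from the article or from any of the companies it reviews.

```python
from statistics import mean, stdev


def chromosome_z_score(sample_fraction, reference_fractions):
    """Z-score of a sample's chromosome read fraction against euploid references."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation
    return (sample_fraction - mu) / sigma


# Hypothetical chr21 read fractions from euploid reference samples (made-up values).
REFERENCE = [0.01300, 0.01302, 0.01298, 0.01301, 0.01299]
Z_CUTOFF = 3.0  # assumed cutoff, for illustration only


def is_flagged(sample_fraction):
    """True if the sample's z-score exceeds the assumed cutoff."""
    return chromosome_z_score(sample_fraction, REFERENCE) > Z_CUTOFF
```

A sample whose fraction sits far above the reference mean is flagged, while one within the reference spread is not. A maternal CNV on the tested chromosome would also raise the fraction, which is one way such a test can produce a false positive.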

Learning-based Improvement of CFAR Algorithm for Increasing Node-level Event Detection Performance in Acoustic Sensor Networks (음향 센서 네트워크에서의 노드 레벨 이벤트 탐지 성능향상을 위한 학습 기반 CFAR 알고리즘 개선)

  • Kim, Youngsoo
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.5 / pp.243-249 / 2020
  • Event detection in wireless sensor networks is a key requirement in many applications. Acoustic sensors are among the most frequently used sensors for event detection in sensor networks, but they are sensitive and difficult to handle because their readings vary greatly with the environment and the target characteristics of the sensor field. In this paper, we propose a learning-based improvement of the CFAR algorithm for increasing node-level event detection performance in acoustic sensor networks, and verify the effectiveness of the designed algorithm by comparing its event detection performance with that of other algorithms. Our experimental results show that, compared with the existing algorithm, the proposed algorithm increases detection accuracy by more than 45.16% and reduces false positives by a factor of 7.97, while slightly increasing false negatives.
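
For context, the sketch below shows a textbook cell-averaging CFAR detector, the kind of baseline such work builds on. It is not the learning-based variant proposed in the paper, whose details the abstract does not give. The function name, parameter names, and default values are assumptions made for illustration.

```python
def ca_cfar(signal, train_per_side=4, guard_per_side=1, scale=3.0):
    """Cell-averaging CFAR: flag cells that exceed scale * local noise estimate.

    The noise estimate is the mean of the training cells on both sides of the
    cell under test, skipping the guard cells directly next to it.
    """
    detections = []
    half = train_per_side + guard_per_side
    for i in range(half, len(signal) - half):
        leading = signal[i - half : i - guard_per_side]
        trailing = signal[i + guard_per_side + 1 : i + half + 1]
        noise = (sum(leading) + sum(trailing)) / (len(leading) + len(trailing))
        if signal[i] > scale * noise:
            detections.append(i)
    return detections


# Flat background with a single strong sample at index 10.
samples = [1.0] * 20
samples[10] = 10.0
hits = ca_cfar(samples)
```

Raising `scale` lowers the false-positive rate at the cost of more missed events, which is the trade-off the abstract reports.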

Automatically Diagnosing Skull Fractures Using an Object Detection Method and Deep Learning Algorithm in Plain Radiography Images

  • Tae Seok, Jeong;Gi Taek, Yee;Kwang Gi, Kim;Young Jae, Kim;Sang Gu, Lee;Woo Kyung, Kim
    • Journal of Korean Neurosurgical Society / v.66 no.1 / pp.53-62 / 2023
  • Objective: Deep learning is a machine learning approach based on artificial neural network training, and an object detection algorithm using deep learning is used as the most powerful tool in image analysis. We analyzed and evaluated the diagnostic performance of a deep learning algorithm in identifying skull fractures in plain radiographic images and investigated its clinical applicability. Methods: A total of 2026 plain radiographic images of the skull (fracture, 991; normal, 1035) were obtained from 741 patients. The RetinaNet architecture was used as the deep learning model. Precision, recall, and average precision were measured to evaluate the deep learning algorithm's diagnostic performance. Results: With ResNet-152, the average precision at intersection over union (IOU) thresholds of 0.1, 0.3, and 0.5 was 0.7240, 0.6698, and 0.3687, respectively. When the IOU and confidence thresholds were both 0.1, the precision was 0.7292 and the recall was 0.7650. When the IOU threshold was 0.1 and the confidence threshold was 0.6, the true and false rates were 82.9% and 17.1%, respectively. There were significant differences in the true/false and false-positive/false-negative ratios between the anterior-posterior, Towne, and both lateral views (p=0.032 and p=0.003). Objects detected as false positives included vascular grooves and suture lines. Among false negatives, detection performance was poor for diastatic fractures, fractures crossing the suture line, and fractures around the vascular grooves and orbit. Conclusion: The object detection algorithm applied with deep learning is expected to be a valuable tool in diagnosing skull fractures.
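
The metrics reported in this abstract follow standard definitions, sketched below. The box format `(x1, y1, x2, y2)` and the function names are assumptions for illustration, and the code is not taken from the study.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A predicted box counts as a true positive when its IOU with a ground-truth fracture box meets the chosen IOU threshold and its confidence meets the confidence threshold. A looser IOU threshold such as 0.1 therefore accepts more detections as correct.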

Calibrating Thresholds to Improve the Detection Accuracy of Putative Transcription Factor Binding Sites

  • Kim, Young-Jin;Ryu, Gil-Mi;Park, Chan;Kim, Kyu-Won;Oh, Berm-Seok;Kim, Young-Youl;Gu, Man-Bok
    • Genomics & Informatics / v.5 no.4 / pp.143-151 / 2007
  • To understand the mechanism of transcriptional regulation, it is essential to detect promoters and regulatory elements. Various methods have been introduced to improve the prediction accuracy of regulatory elements. Since there are few experimentally validated regulatory elements, previous studies have used criteria based solely on the level of scores over background sequences. However, selecting the detection criteria for different prediction methods is not feasible. Here, we studied the calibration of thresholds to improve regulatory element prediction. We predicted regulatory elements using MATCH, which is a powerful tool for transcription factor binding site (TFBS) detection. To increase the prediction accuracy, we used a regulatory potential (RP) score measuring the similarity of patterns in alignments to those in known regulatory regions. Next, we calibrated the thresholds to find relevant scores, increasing the true positives while decreasing possible false positives. By applying various thresholds, we compared predicted regulatory elements with validated regulatory elements from the Open Regulatory Annotation (ORegAnno) database. The regulators predicted with the selected threshold were validated through enrichment analysis of muscle-specific gene sets from the Tissue-Specific Transcripts and Genes (T-STAG) database. We found 14 known muscle-specific regulators at a false discovery rate (FDR) of less than 5% in a single TFBS analysis, as well as known transcription factor combinations in our combinatorial TFBS analysis.
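
The threshold comparison described in this abstract can be pictured with a small sweep like the one below. This is a generic sketch under assumed inputs (a dict of site scores and a set of validated sites); it does not reproduce MATCH, RP scoring, or the authors' calibration procedure.

```python
def sweep_thresholds(scored_sites, validated, thresholds):
    """For each threshold, count predicted sites inside and outside the validated set.

    Sites outside the validated set are only *possible* false positives,
    since they may simply not have been tested experimentally.
    """
    results = []
    for t in thresholds:
        predicted = {site for site, score in scored_sites.items() if score >= t}
        true_pos = len(predicted & validated)
        possible_false_pos = len(predicted - validated)
        results.append((t, true_pos, possible_false_pos))
    return results


# Hypothetical site scores and validated set, for illustration only.
scores = {"site_a": 0.9, "site_b": 0.7, "site_c": 0.5, "site_d": 0.3}
known = {"site_a", "site_c"}
table = sweep_thresholds(scores, known, [0.4, 0.8])
```

Each row of `table` shows how one threshold trades recovered validated sites against unvalidated predictions, which is the basis for choosing a threshold.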

Identification of Caenorhabditis elegans MicroRNA Targets Using a Kernel Method

  • Lee, Wha-Jin;Nam, Jin-Wu;Kim, Sung-Kyu;Zhang, Byoung-Tak
    • Genomics & Informatics / v.3 no.1 / pp.15-23 / 2005
  • Background: MicroRNAs (miRNAs) are a class of noncoding RNAs found in various organisms such as plants and mammals. However, most of the mRNAs regulated by miRNAs are unknown. Furthermore, miRNA targets in genomes cannot be identified by standard sequence comparison, since their complementarity to the target sequence is generally imperfect. In this paper, we propose a kernel-based method for the efficient prediction of miRNA targets. To help distinguish false positives from potentially valid targets, we elucidate the features common to experimentally confirmed targets. Results: The performance of our prediction method was evaluated by five-fold cross-validation. Our method achieved a sensitivity of 0.64 and a specificity of 0.98. Also, the proposed method reduced the number of false positives by half compared with TargetScan. We investigated the effect of feature sets on the classification of miRNA targets. Finally, we predicted miRNA targets for several miRNAs in the Caenorhabditis elegans (C. elegans) 3' untranslated region (3' UTR) database. Conclusions: The targets predicted by the suggested method will help in validating more miRNA targets and ultimately in revealing the role of small RNAs in the regulation of genomes. Our algorithm for miRNA target site detection can be improved with additional experimental knowledge. Also, an increase in the number of confirmed targets is expected to reveal general structural features that can be used to improve their detection.

Analysis of putative promoter sites in Babesia bovis rap-1 and B equi ema-1 intergenic nucleotides (Babesia bovis rap-1 및 B equi ema-1 intergenic 뉴클레오타이드에서 프로모터로 추정되는 위치 분석)

  • 곽동미
    • Korean Journal of Veterinary Service / v.27 no.1 / pp.95-101 / 2004
  • Babesia bovis rap-1 and B equi ema-1 intergenic (IG) nucleotides were analyzed and compared to identify putative promoter sites using computer programs. The aim of this research was to determine whether IG nucleotides of Babesia genes that are predicted to be involved in erythrocyte invasion have functions regulating gene transcription and translation, which could be applied to functional gene knockout. The four IG sequences used were BbIG5 (B bovis rap-1 5' IG), BbIG3 (B bovis rap-1 3' IG), BeIG5 (B equi ema-1 5' IG), and BeIG3 (B equi ema-1 3' IG). BbIG5 contained a putative promoter at nucleotides 197-246 with a predicted TATA-box and a transcription start site. BbIG3 had a putative promoter at nucleotides 270-320 with two predicted TATA-boxes and a transcription start site. BeIG3 had a putative promoter at nucleotides 155-205 with a predicted TATA-box and a transcription start site. The putative promoter sites in these three sequences were identified with a score cutoff of 0.8, which corresponds to detection of about 40% of recognized promoters with 0.1-0.4% false positives. In contrast, BeIG5 had a putative promoter at nucleotides 163-213 with a score cutoff of 0.8, but neither a TATA-box nor a transcription start site was recognized. However, BeIG5 had a putative promoter at nucleotides 388-438 with a predicted TATA-box and a transcription start site when the score cutoff was decreased to 0.18, which corresponds to detection of about 70% of recognized promoters with 2.2-5.3% false positives. These sequences with putative promoters can be tested to determine whether they have functions regulating gene transcription and translation.

Deep Learning Based Sign Detection and Recognition for the Blind (시각장애인을 위한 딥러닝 기반 표지판 검출 및 인식)

  • Jeon, Taejae;Lee, Sangyoun
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.2 / pp.115-122 / 2017
  • This paper proposes a deep learning-based sign detection and recognition system for the blind. The proposed system is composed of a sign detection stage and a sign recognition stage. In the sign detection stage, aggregated channel features are extracted and an AdaBoost classifier is applied to detect regions of interest of the sign. In the sign recognition stage, a convolutional neural network is applied to recognize the regions of interest of the sign. The AdaBoost classifier is designed to decrease the number of undetected signs, and the deep learning algorithm is used to increase recognition accuracy, which in turn removes false positives that occur in the sign detection stage. In our experiments, the proposed method efficiently decreases the number of false positives compared with other methods.

An Automatic Data Collection System for Human Pose using Edge Devices and Camera-Based Sensor Fusion (엣지 디바이스와 카메라 센서 퓨전을 활용한 사람 자세 데이터 자동 수집 시스템)

  • Young-Geun Kim;Seung-Hyeon Kim;Jung-Kon Kim;Won-Jung Kim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.1 / pp.189-196 / 2024
  • Frequent false-positive alarms from the Intelligent Selective Control System have raised significant concerns. These persistent issues have led to declines in operational efficiency and market credibility among agents. Developing a new model or replacing the existing one to mitigate false-positive alarms entails substantial opportunity costs; hence, improving the quality of the training dataset is the pragmatic option. However, smaller organizations face challenges because of inadequate capabilities in dataset collection and refinement. This paper proposes an automatic human pose data collection system centered around a human pose estimation model, utilizing camera-based sensor fusion techniques and edge devices. The system facilitates the direct collection and real-time processing of field data at the network periphery, distributing the computational load that is typically centralized. Additionally, by directly labeling field data, it aids in constructing new training datasets.