• Title/Summary/Keyword: False Positive data

Search results: 241

An Analysis on the Error Probability of A Bloom Filter (블룸필터의 오류 확률에 대한 분석)

  • Kim, SungYong;Kim, JiHong
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.5 / pp.809-815 / 2014
  • As telecommunication techniques improve and data volumes keep growing, building and processing databases has become a major issue. The Bloom filter, used to look up whether a particular element belongs to a given set, is a very useful structure because of its space efficiency. In this paper, we introduce the error probabilities of the Bloom filter. In particular, we derive revised false positive rates of the Bloom filter using an experimental method. Finally, we analyze and compare the original false positive probability of the Bloom filter used until now with the false decision probability proposed in this paper.
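
As a rough illustration of the comparison this abstract describes, the sketch below measures a Bloom filter's false positive rate empirically and sets it against the classical approximation $(1 - e^{-kn/m})^k$. It is not the revised probability derived in the paper, which the abstract does not state. The class `BloomFilter`, the SHA-256 double-hashing scheme, and the parameters `m`, `n`, `k` are all assumptions made for this example.

```python
import hashlib
import math


def theoretical_fpr(m: int, n: int, k: int) -> float:
    """Classical approximation (1 - e^{-kn/m})^k for m bits, n items, k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


class BloomFilter:
    """Minimal Bloom filter (hypothetical helper, not the paper's implementation)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, chosen for readability

    def _positions(self, item: str) -> list:
        # Assumption: derive k indices by double hashing one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


def empirical_fpr(m: int, n: int, k: int, trials: int = 20000) -> float:
    """Insert n items, then probe with items that were never inserted."""
    bf = BloomFilter(m, k)
    for i in range(n):
        bf.add(f"member-{i}")
    false_hits = sum(f"probe-{i}" in bf for i in range(trials))
    return false_hits / trials


if __name__ == "__main__":
    m, n, k = 10_000, 1_000, 7
    print("theoretical:", theoretical_fpr(m, n, k))
    print("empirical:  ", empirical_fpr(m, n, k))
```

Every inserted item is always reported as present, so only the false positive side of the error needs to be measured this way.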

Statistical Analysis of Count Rate Data for On-line Seawater Radioactivity Monitoring

  • Lee, Dong-Myung;Cong, Binh Do;Lee, Jun-Ho;Yeo, In-Young;Kim, Cheol-Su
    • Journal of Radiation Protection and Research / v.44 no.2 / pp.64-71 / 2019
  • Background: It is very difficult to distinguish between a radioactive contamination source and background radiation from natural radionuclides in the marine environment by means of an on-line monitoring system. The objective of this study was to investigate a statistical process for triggering on abnormal levels of count rate data measured by our on-line seawater radioactivity monitoring. Materials and Methods: Count rate data sets in time series were collected from 9 monitoring posts. All of the count rate data were measured every 15 minutes from the region of interest (ROI) for $^{137}\mathrm{Cs}$ ($E_{\gamma} = 661.6$ keV) on the gamma-ray energy spectrum. The Shewhart ($3\sigma$), CUSUM, and Bayesian S-R control chart methods were evaluated, and a comparative analysis of determination methods for count rate data was carried out in terms of the false positive incidence rate. All statistical algorithms were developed by the authors using R. Results and Discussion: The $3\sigma$, CUSUM, and S-R analyses resulted in average false positive incidence rates of $0.164 \pm 0.047\%$, $0.064 \pm 0.0367\%$, and $0.030 \pm 0.018\%$, respectively. The S-R method has a lower value than the $3\sigma$ and CUSUM methods because the Bayesian S-R method uses the information to evaluate a posterior distribution, even though the CUSUM control chart accumulates information from recent data points. When net count rate and gross count rate, measured in time series over a full year at one monitoring post, were compared using $3\sigma$ control charts, the two methods resulted in false positive incidence rates of 0.142% and 0.219%, respectively. Conclusion: Bayesian S-R and CUSUM control charts are better suited than the $3\sigma$ control chart for on-line seawater radioactivity monitoring with count rate data in time series. However, a continuously increasing trend is required to differentiate between a false positive and actual radioactive contamination. For the determination of count rate, the net count method is better than the gross count method because of the relatively small variation in the data points.
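
To make the two simpler control charts named in this abstract concrete, here is a minimal sketch of a Shewhart $3\sigma$ rule and a one-sided (upper) CUSUM rule. The authors' code was written in R and is not reproduced in the abstract, so the function names, the reference value `k = 0.5`, the decision interval `h = 5.0` (both in units of sigma), and the reset-after-alarm behaviour are assumptions for illustration. The Bayesian S-R chart is not sketched here.

```python
def shewhart_alarms(counts, mean, sigma, n_sigma=3.0):
    """Indices whose count rate exceeds mean + n_sigma * sigma."""
    limit = mean + n_sigma * sigma
    return [i for i, x in enumerate(counts) if x > limit]


def cusum_alarms(counts, mean, sigma, k=0.5, h=5.0):
    """One-sided upper CUSUM on standardized values.

    S_t = max(0, S_{t-1} + z_t - k); an alarm is raised when S_t > h.
    The statistic restarts at zero after each alarm (an assumption).
    """
    alarms = []
    s = 0.0
    for i, x in enumerate(counts):
        z = (x - mean) / sigma
        s = max(0.0, s + z - k)
        if s > h:
            alarms.append(i)
            s = 0.0
    return alarms


def false_positive_incidence(alarms, n_points):
    """Fraction of points flagged in a series known to be background only."""
    return len(alarms) / n_points


if __name__ == "__main__":
    # Hypothetical series: background at 100, then a sustained 1.5-sigma shift.
    series = [100.0] * 5 + [115.0] * 10
    print("Shewhart:", shewhart_alarms(series, mean=100.0, sigma=10.0))
    print("CUSUM:   ", cusum_alarms(series, mean=100.0, sigma=10.0))
```

In this hypothetical series the shift never crosses the $3\sigma$ limit, while the CUSUM statistic accumulates the small excess until it alarms, which is the kind of sustained trend the conclusion refers to.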

Evaluation of Usefulness for Diagnosis of Lung Cancer on Integrated PET-MRI Using Decision Matrix (판정행렬을 기반한 일체형 PET-MRI의 폐암 진단 유용성 평가)

  • Kim, Jung-Soo;Yang, Hyun-Jin;Kim, Yoo-Mi;Kwon, Hyeong-Jin;Park, Chanrok
    • Journal of radiological science and technology / v.44 no.6 / pp.635-643 / 2021
  • Empirical research on the diagnosis of lung cancer is insufficient, which limits objective judgment of clinical feasibility and utility based on diagnostic accuracy. This study therefore retrospectively analyzed the lung cancer diagnostic performance of PET-MRI (Positron Emission Tomography-Magnetic Resonance Imaging) using a decision matrix. A total of 165 patients who received both a hematological CEA (Carcinoembryonic Antigen) test and hybrid PET-MRI (18F-FDG, 5.18 MBq/kg / Body TIM coil. VIVE-Dixon) were selected. With the CEA result (positive: >4 ㎍/ℓ, negative: <2.5 ㎍/ℓ) set as the gold-standard data, lung cancer was identified on the PET-MRI images and the SUVmax (positive: >4, negative: <1.5) was measured; the correlation and significance of the diagnostic performance of PET-MRI relative to CEA were then evaluated through statistical verification (t-test, P>0.05). As a result, PET-MRI showed a sensitivity of 96.29%, a specificity of 95.23%, a false negative rate of 3.70%, a false positive rate of 4.76%, and an accuracy of 95.75%. The false negative rate was 1.06 percentage points lower than the false positive rate. PET-MRI, which showed significant diagnostic accuracy through high sensitivity and specificity and low false negative and false positive rates for lung cancer, can acquire fusion images specialized for soft tissue by combining radiopharmaceuticals with various sequences, so its clinical value and usefulness are regarded as potentially sufficient.
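
The decision-matrix figures quoted in this abstract follow from four cell counts. The sketch below shows the arithmetic. The abstract reports only percentages and the total of 165 patients, so the counts used here (TP = 78, FN = 3, TN = 80, FP = 4) are hypothetical values chosen because they sum to 165 and reproduce the reported rates to within rounding; they are not taken from the paper.

```python
def decision_matrix_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard rates derived from a 2x2 decision (confusion) matrix."""
    positives = tp + fn  # reference-positive cases
    negatives = tn + fp  # reference-negative cases
    return {
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        "false_negative_rate": fn / positives,
        "false_positive_rate": fp / negatives,
        "accuracy": (tp + tn) / (positives + negatives),
    }


if __name__ == "__main__":
    # Hypothetical counts, back-computed from the reported percentages.
    for name, value in decision_matrix_metrics(tp=78, fn=3, tn=80, fp=4).items():
        print(f"{name}: {100 * value:.2f}%")
```

Sensitivity and false negative rate share a denominator and sum to one, as do specificity and false positive rate, which is why the abstract's four rates pair up as 96.29% with 3.70% and 95.23% with 4.76%.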

An Improved Bayesian Spam Mail Filter based on Chi-square Statistics (카이제곱 통계량을 이용한 개선된 베이지안 스팸메일 필터)

  • Kim Jin-Sang;Choe Sang-Yeol
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2005.04a / pp.403-414 / 2005
  • Most currently used spam filters are based on a Bayesian classification technique, which suffers from serious problems such as limited precision and recall and false positive errors. This paper addresses these problems using a modified Bayesian classifier based on chi-square statistics. The resulting spam filter is more accurate and flexible than traditional Bayesian spam filters, and it can be personalized by supplying some parameters when the filter is learned from training data.


The Design and Implementation of Anomaly Traffic Analysis System using Data Mining

  • Lee, Se-Yul;Cho, Sang-Yeop;Kim, Yong-Soo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.4 / pp.316-321 / 2008
  • Advanced computer network technology enables computers to be connected in an open network environment. Despite the growing number of security threats to networks, most intrusion detection identifies security attacks mainly by detecting misuse using a set of rules based on past hacking patterns. This pattern matching has a high rate of false positives and cannot detect new hacking patterns, which makes it vulnerable to previously unidentified attack patterns and variations in attack and increases false negatives. Intrusion detection and analysis technologies are thus required. This paper investigates the asymmetric costs of false errors to enhance the performance of detection systems. The proposed method uses a network model to consider the cost ratio of false errors. By comparing false positive errors with false negative errors, this scheme achieved better performance from the viewpoint of both security and system performance objectives. The results of our empirical experiment show that the network model provides high accuracy in detection. In addition, the simulation results show that the effectiveness of anomaly traffic detection is enhanced by considering the costs of false errors.

A scoring method for evaluating the reliability of protein-protein interaction data (단백질 상호작용 데이터의 신뢰도 검증 기법)

  • 홍진선;한경숙
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.292-294 / 2004
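  • Advances in protein interaction detection methods are producing large amounts of data, and because of the sheer volume of this interaction data, useful knowledge can be obtained by processing it with statistical methods. Predicted interaction data has two problems: first, because it is produced in bulk, it contains many false positives; second, it is difficult to measure the reliability of a predicted interaction other than by verifying it experimentally. This study proposes a method that uses a score assignment system to reduce false positives in predicted human protein interaction data and that verifies the reliability of the interaction data by assigning a score to each interaction.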


Investigation of False Positive Rates Newborn Screening using Tandem Mass Spectrometry (TMS) Technology in Single Center (단일기관에서 이중 질량 분석법(tandem mass spectrometry technology)을 이용한 선천성 대사이상 검사의 위양성율에 대한 연구)

  • Kim, Hyunsoo;Shin, Son Moon;Ko, Sun Young;Lee, Yeon Kyung;Park, Sung Won
    • Journal of The Korean Society of Inherited Metabolic disease / v.16 no.1 / pp.18-23 / 2016
  • Objective: Newborn screening leads to improved treatment and disease outcomes, but the impact of false-positive newborn screening results may include parental stress and anxiety, perception of the child as unhealthy, parent-child relationship dysfunction, and increased infant hospitalizations. The purpose of this study was to investigate the false positive rates and the causative factors of false positive results in tandem mass spectrometry (TMS) at a single center. Methods: Records were reviewed for all 18,872 subjects who were born in Cheil General Hospital from January 1st, 2012 to December 31st, 2014. Of these, 17,292 neonates (91.62%) underwent tandem mass screening, mostly on the 2nd to 5th day of life. Newborns whose first results were abnormal were retested by the same method at 7-14 days. If the results were abnormal again, further evaluation was performed. TMS analysis included data for the 43 disorders screened for using TMS, broken down into three categories: fatty acid oxidation disorders, organic acidurias, and aminoacidopathies. The impact of several factors on increased false positive rates was analyzed using a multivariate analysis: time from birth to sample collection, birth weight, birth height, BMI, gender, gestational age, and delivery type. Results: Of the subjects, 8,942 (51.7%) were male and 8,350 (48.3%) female; the mean gestational age was $38.6 \pm 1.7$ weeks, the average birth weight $3{,}155.6 \pm 502.4$ g, the average birth height $49.1 \pm 2.9$ cm, and the average BMI $13.0 \pm 3.8$ kg/m$^2$. There were 9,713 vaginal deliveries (56.2%) and 7,579 caesarean sections (43.8%). The average date of the inspection was $2.8 \pm 1.1$ days. 224 cases were identified as TMS positive. All of them were false positives (222/17,292, 1.30%) except 2 cases (1 male with benign phenylketonuria and 1 female with short chain acyl-CoA dehydrogenase deficiency). The false positive rates were 0.61% in fatty acid oxidation disorders, 0.25% in organic acidurias, and 0.45% in aminoacidopathies. In our study, the later the date of inspection, the higher the false positive rate, because almost all cases with a late test date were under treatment in the neonatal intensive care unit, so their test date was affected by their medical condition. The false positive rate was higher in extremely immature newborns ($\leq 27$ weeks) than in newborns of gestational age >27 weeks [OR=6.957 (CI=1.273-38.008), p<0.025] and in extremely low birth weight newborns (<1,000 g) than in newborns of birth weight $\geq 1{,}000$ g [OR=5.616 (CI=1.134-27.820), p<0.035]. Conclusion: The false positive rate of TMS was 1.30% in Cheil General Hospital. Lower gestational age and birth weight were associated with increased false positive rates. Better understanding of factors that influence the reporting of screening tests, and the ability to modify these important factors, may improve the screening process and reduce the need for retesting.


Comparison of Deep Learning-based CNN Models for Crack Detection (콘크리트 균열 탐지를 위한 딥 러닝 기반 CNN 모델 비교)

  • Seol, Dong-Hyeon;Oh, Ji-Hoon;Kim, Hong-Jin
    • Journal of the Architectural Institute of Korea Structure & Construction / v.36 no.3 / pp.113-120 / 2020
  • The purpose of this study is to compare deep learning-based Convolutional Neural Network (CNN) models for concrete crack detection. The models compared are AlexNet, GoogLeNet, VGG16, VGG19, ResNet-18, ResNet-50, ResNet-101, and SqueezeNet, which won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). To train, validate, and test these models, we constructed 3000 training data and 12000 validation data with 256×256 pixel resolution consisting of cracked and non-cracked images, and constructed 5 test data with 4160×3120 pixel resolution consisting of concrete images with cracks. To increase the efficiency of training, transfer learning was performed by taking the weights from the pre-trained networks supported by MATLAB. Using the trained network, the validation data were classified into crack and non-crack images, yielding True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) counts, from which 6 performance indicators were calculated: False Negative Rate (FNR), False Positive Rate (FPR), Error Rate, Recall, Precision, and Accuracy. Each test image was scanned twice with a sliding window of 256×256 pixel resolution to classify the cracks, resulting in a crack map. From the comparison of the performance indicators and the crack maps, it was concluded that VGG16 and VGG19 were the most suitable for detecting concrete cracks.

Test Bed Design of Fire Detection System Based on Multi-Sensor Information for Reduction of False Alarms (화재감지 오보 감소를 위한 다중정보기반 시스템의 Test Bed 설계)

  • Lee, Kijun;Kim, Hyeong Gweon;Lee, Bong Woo;Kim, Tae-Ok;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.16 no.6 / pp.107-114 / 2012
  • A fire detection system is used to detect danger and generate alarms in case of fire. Most fire detection systems in use these days often malfunction because of false positive and false negative errors. To improve detection reliability, an integrated fire detection algorithm using multi-sensor information from heat, smoke, and carbon monoxide detectors is suggested, then built and tested in the LabVIEW environment. Simulations using sensor measurement data provided by the National Institute of Standards and Technology (NIST) verify the possibility of reducing false positive and false negative errors.

Intelligent Intrusion Detection Systems Using the Asymmetric costs of Errors in Data Mining (데이터 마이닝의 비대칭 오류비용을 이용한 지능형 침입탐지시스템 개발)

  • Hong, Tae-Ho;Kim, Jin-Wan
    • The Journal of Information Systems / v.15 no.4 / pp.211-224 / 2006
  • This study investigates the application of data mining techniques such as artificial neural networks, rough sets, and induction learning to intrusion detection systems. To maximize the effectiveness of data mining for intrusion detection systems, we introduce asymmetric costs for false positive errors and false negative errors, and we present a method for intrusion detection systems to use these asymmetric costs of errors in data mining. The results of our empirical experiment show that our intrusion detection model provides high accuracy in intrusion detection. In addition, the approach using the asymmetric costs of errors in rough sets and neural networks is effective as the threshold value changes. We found that the threshold plays the most important role in the intrusion detection model for decreasing the costs that result from false negative errors.
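
The role of the threshold under asymmetric error costs can be illustrated with a small sketch. This is not the authors' model: the functions `total_cost` and `best_threshold`, the score and label lists, and the cost values are all hypothetical, and the only idea taken from the abstract is that false positives and false negatives carry different costs and that the decision threshold is chosen with those costs in mind.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total misclassification cost when scores >= threshold are flagged as attacks.

    labels: 1 for an actual intrusion, 0 for normal traffic.
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return cost_fp * fp + cost_fn * fn


def best_threshold(scores, labels, cost_fp, cost_fn, candidates):
    """Candidate threshold with the lowest total cost (first one wins ties)."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


if __name__ == "__main__":
    # Hypothetical detector scores and ground-truth labels.
    scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 1, 0, 0, 1, 1, 1]
    candidates = [0.3, 0.5, 0.65]
    print("equal costs:      ", best_threshold(scores, labels, 1, 1, candidates))
    print("costly false neg.:", best_threshold(scores, labels, 1, 10, candidates))
```

With equal costs the highest candidate threshold is cheapest on this toy data, while making a missed intrusion ten times as costly as a false alarm moves the preferred threshold down to the lowest candidate.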
