• Title/Summary/Keyword: Reference dataset

Standardized Breast Cancer Mortality Rate Compared to the General Female Population of Iran

  • Haghighat, S.;Akbari, M.E.;Ghaffari, S.;Yavari, P.
    • Asian Pacific Journal of Cancer Prevention / v.13 no.11 / pp.5525-5528 / 2012
  • Introduction: Breast cancer is the most common cancer in women. Improvements in early diagnosis modalities have led to longer survival. This study aimed to determine the 5-, 10- and 15-year mortality rates of breast cancer patients compared with the general female population. Materials and Methods: The follow-up data of a cohort of 615 breast cancer patients referred to the Iranian Breast Cancer Research Center (BCRC) from 1986 to 1996 were used as the reference breast cancer dataset. The dataset was divided into 5-year age groups, and the 5-, 10- and 15-year probability of death was estimated for each group. The annual mortality rate of Iranian women was obtained from the Death Registry system. Standardized mortality ratios (SMRs) were calculated as the ratio of the mortality rate in breast cancer patients to that in the general female population. Results: The mean age of breast cancer patients at diagnosis was 45.9 ($\pm 10.5$) years, ranging from 24 to 74. A total of 73, 32 and 2 deaths were recorded at 5, 10 and 15 years after diagnosis, respectively. The SMRs for breast cancer patients at 5, 10 and 15 years after diagnosis were 6.74 (95% CI, 5.5-8.2), 6.55 (95% CI, 5-8.1) and 1.26 (95% CI, 0.65-2.9), respectively. Conclusion: The observed mortality rate of breast cancer patients 15 years after diagnosis was very similar to the expected rate in the general female population. This finding may help clinicians and health policy makers adopt strategies to improve breast cancer survival. Longer follow-up with a larger sample size and a pooled analysis of survival rates from different centres may shed more light on mortality patterns of breast cancer.
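
The SMR calculation described in this abstract can be illustrated with a minimal sketch. It assumes the common observed/expected formulation and Byar's approximation for the Poisson confidence interval; the abstract does not say which interval method the authors used, and the counts below are hypothetical, not taken from the study.

```python
import math

def smr_with_ci(observed, expected, z=1.96):
    """Return (SMR, lower, upper) for observed vs. expected deaths.

    The interval uses Byar's approximation to the exact Poisson limits
    (an assumption; the abstract does not name its method).
    """
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    smr = observed / expected
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (1 - 1 / (9 * observed)
                            - z / (3 * math.sqrt(observed))) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3 / expected
    return smr, lower, upper

# Hypothetical counts for illustration only
smr, lower, upper = smr_with_ci(observed=50, expected=10.0)
print(f"SMR = {smr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

An SMR whose interval includes 1, as reported for the 15-year horizon, means the cohort's mortality cannot be distinguished from that of the reference population.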

FAFS: A Fuzzy Association Feature Selection Method for Network Malicious Traffic Detection

  • Feng, Yongxin;Kang, Yingyun;Zhang, Hao;Zhang, Wenbo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.1 / pp.240-259 / 2020
  • Analyzing network traffic is the basis for dealing with network security issues. Most network security systems depend on feature selection from network traffic data, and a suitable feature selection method can improve the ability to detect malicious traffic. This paper proposes a Fuzzy Association Feature Selection (FAFS) method for network malicious traffic detection. Association rules, which reflect the relationships among different characteristic attributes of network traffic data, are mined by association analysis. The membership values of the association rules are obtained by fuzzy reasoning. The data features with the highest correlation intensity in network datasets are identified by comparing the membership values of the association rules. FAFS reduces the dimension of the data features and improves the detection ability of malicious traffic detection algorithms. To verify the effect of FAFS on malicious traffic feature selection, the method is used to select data features from different datasets. The K-Nearest Neighbor, C4.5 Decision Tree and Naïve Bayes algorithms are then tested on these datasets. Moreover, FAFS is compared with classical feature selection methods. Analysis of the experimental results shows that FAFS can significantly improve the precision and recall of malicious traffic detection, which provides a valuable reference for building network security systems.
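
The association-analysis step mentioned here can be sketched with the two standard rule metrics, support and confidence. This is not the FAFS method itself (the fuzzy membership calculation is not described in the abstract); the record format and item names are hypothetical.

```python
def rule_metrics(records, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    records: list of sets of items (hypothetical discretized traffic attributes).
    """
    n = len(records)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / n if n else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical records: each set holds discretized attributes plus a label
records = [
    {"proto=tcp", "bytes=high", "label=attack"},
    {"proto=tcp", "bytes=high", "label=attack"},
    {"proto=tcp", "bytes=low", "label=normal"},
    {"proto=udp", "bytes=low", "label=normal"},
]
support, confidence = rule_metrics(records, {"bytes=high"}, {"label=attack"})
print(support, confidence)
```

Attributes whose rules toward the class label score highly are natural candidates to keep when reducing the feature dimension.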

Detection and Correction of Noisy Pixels Embedded in NDVI Time Series Based on the Spatio-temporal Continuity (시공간적 연속성을 이용한 오염된 식생지수(GIMMS NDVI) 화소의 탐지 및 보정 기법 개발)

  • Park, Ju-Hee;Cho, A-Ra;Kang, Jeon-Ho;Suh, Myoung-Seok
    • Atmosphere / v.21 no.4 / pp.337-347 / 2011
  • In this paper, we developed a method for detecting and correcting noisy pixels embedded in time series of normalized difference vegetation index (NDVI) data, based on the spatio-temporal continuity of vegetation conditions. To apply the method, the 25-year (1982-2006) GIMMS (Global Inventory Modeling and Mapping Study) NDVI dataset over the Korean peninsula was used. The spatial resolution and temporal frequency of this dataset are $8 \times 8\ \mathrm{km}^2$ and 15 days, respectively. A land cover map of East Asia was also used. Noisy pixels are detected by a temporal continuity check against reference values and dynamic threshold values that depend on season and location. In general, the number of noisy pixels is larger in summer than in other seasons. The detected noisy pixels are corrected iteratively until none remain. First, a noisy pixel is replaced by the weighted arithmetic mean of the two adjacent NDVI values when both are normal. The remaining noisy pixels are then corrected by the distance-weighted average of the NDVI of pixels with the same land cover. After correction, the NDVI values increased by 5% and their variances decreased by 50%. Compared with the other correction method, this method gives better results, especially when noisy pixels occur more than twice consecutively and the temporal change rates of NDVI are very high. This means that the correction method developed in this study is superior in reconstructing the maximum NDVI and the NDVI during the starting and falling seasons.
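
The first correction step can be sketched for a single pixel's time series as follows. This is a simplified illustration: the fixed `threshold` and equal `weights` are assumptions standing in for the season- and location-dependent thresholds and the weighting scheme of the paper, which the abstract does not specify.

```python
def correct_noisy(series, threshold=0.1, weights=(0.5, 0.5)):
    """Flag sudden drops in an NDVI time series and fill them from neighbors.

    A value is flagged when it lies more than `threshold` below both temporal
    neighbors; it is replaced by the weighted mean of the two neighbors only
    when both neighbors are themselves unflagged.
    """
    n = len(series)
    noisy = [False] * n
    for i in range(1, n - 1):
        drop_before = series[i - 1] - series[i] > threshold
        drop_after = series[i + 1] - series[i] > threshold
        noisy[i] = drop_before and drop_after
    corrected = list(series)
    for i in range(1, n - 1):
        if noisy[i] and not noisy[i - 1] and not noisy[i + 1]:
            corrected[i] = weights[0] * series[i - 1] + weights[1] * series[i + 1]
    return corrected, noisy

# Hypothetical 15-day composites with one cloud-contaminated value
ndvi = [0.60, 0.62, 0.30, 0.66, 0.68]
corrected, noisy = correct_noisy(ndvi)
print(corrected, noisy)
```

Pixels left uncorrected by this step (for example, runs of consecutive noisy values) would go to the second stage, which averages over pixels of the same land cover.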

Plants Disease Phenotyping using Quinary Patterns as Texture Descriptor

  • Ahmad, Wakeel;Shah, S.M. Adnan;Irtaza, Aun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3312-3327 / 2020
  • Plant diseases are a significant yield and quality constraint for farmers around the world because of their severe impact on agricultural productivity. Such losses can have a substantial impact on the economy, reducing farmers' income and raising prices for consumers. They may also result in severe food shortages leading to hunger and starvation, especially in less-developed countries where access to disease prevention methods is limited. This research investigates Directional Local Quinary Patterns (DLQP) as a feature descriptor for plant leaf disease detection, with a Support Vector Machine (SVM) as the classifier. This is the first use of DLQP as a feature descriptor for disease detection in horticulture. DLQP captures directional edge information by computing the grey-level difference between the reference pixel and its neighboring pixels, encoded as a quinary value (-2, -1, 0, 1, 2), in the 0°, 45°, 90°, and 135° directions of a selected window of the plant leaf image. To assess the robustness of DLQP as a texture descriptor, we used the research-oriented Plant Village dataset: Tomato (3,900 leaf images) comprising 6 diseased classes, and Potato (1,526 leaf images) and Apple (2,600 leaf images) comprising 3 diseased classes. Accuracies of 95.6%, 96.2% and 97.8% were achieved for these crops, respectively, which are higher than those obtained on the same dataset with other standard feature descriptors such as Local Binary Patterns (LBP) and Local Ternary Patterns (LTP). Further, the effectiveness of the proposed method is demonstrated by comparing it with existing algorithms for plant disease phenotyping.
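
The quinary encoding of grey-level differences can be sketched as below. The two thresholds `t1` and `t2`, the single-pixel neighbor offsets, and the function names are assumptions for illustration; the full DLQP descriptor (pattern construction and histogramming) is not described in the abstract and is not reproduced here.

```python
# Neighbor offsets (row, col) for each direction, with row 0 at the top
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def quinary_code(center, neighbor, t1=2, t2=5):
    """Quantize a grey-level difference into one of -2, -1, 0, 1, 2."""
    d = neighbor - center
    if d >= t2:
        return 2
    if d >= t1:
        return 1
    if d > -t1:
        return 0
    if d > -t2:
        return -1
    return -2

def directional_codes(image, row, col, t1=2, t2=5):
    """Quinary codes of the pixel at (row, col) in the four directions."""
    center = image[row][col]
    return {angle: quinary_code(center, image[row + dr][col + dc], t1, t2)
            for angle, (dr, dc) in DIRECTIONS.items()}

# Hypothetical 3x3 grey-level patch; the reference pixel is the center (20)
patch = [
    [10, 20, 27],
    [12, 20, 23],
    [30, 20, 20],
]
codes = directional_codes(patch, 1, 1)
print(codes)
```

Compared with binary (LBP) or ternary (LTP) coding, the five levels distinguish weak from strong edges in each direction.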

Experimental Analysis of Recent Works on the Overlap Phase of De Novo Sequence Assembly (De novo 시퀀스 어셈블리의 overlap 단계의 최근 연구 실험 분석)

  • Lim, Jihyuk;Kim, Sun;Park, Kunsoo
    • Journal of KIISE / v.45 no.3 / pp.200-210 / 2018
  • Given a set of DNA read sequences, de novo sequence assembly reconstructs a target sequence without a reference sequence. Reconstruction requires an overlap phase, which computes the overlaps between every pair of reads. Since the overlap phase is the most time-consuming part of the whole assembly, the performance of the assembly depends on that of the overlap phase. The overlap phase has been studied extensively in various fields. Among these studies, three state-of-the-art results are Readjoiner, SOF, and the Lim-Park algorithm. Recently, the rapid development of sequencing technology has made it possible to produce large read datasets at low cost, and many platforms for generating DNA read datasets have been developed. Since these platforms produce datasets with different statistical characteristics, a performance evaluation of the overlap phase should consider datasets with these characteristics. In this paper, we compare and analyze the performance of the three algorithms on various large datasets.
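
What the overlap phase computes can be shown with a naive quadratic sketch: for every ordered pair of reads, the longest suffix of one that equals a prefix of the other. This is only a baseline for illustration, not Readjoiner, SOF, or the Lim-Park algorithm, which avoid comparing all pairs directly; the `min_len` cutoff and the reads are hypothetical.

```python
def overlap_length(a, b, min_len=1):
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if none)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def all_overlaps(reads, min_len=3):
    """Map each ordered pair (i, j), i != j, to its overlap length if >= min_len."""
    overlaps = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i == j:
                continue
            k = overlap_length(a, b, min_len)
            if k > 0:
                overlaps[(i, j)] = k
    return overlaps

# Hypothetical short reads
reads = ["ACGTAC", "TACGGA", "GGATTT"]
overlaps = all_overlaps(reads)
print(overlaps)
```

The nested loop over all pairs is what makes this phase the bottleneck on large read sets.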

External Validation of a Gastric Cancer Nomogram Derived from a Large-volume Center Using Dataset from a Medium-volume Center

  • Kim, Pyeong Su;Lee, Kyung-Muk;Han, Dong-Seok;Yoo, Moon-Won;Han, Hye Seung;Yang, Han-Kwang;Bang, Ho Yoon
    • Journal of Gastric Cancer / v.17 no.3 / pp.204-211 / 2017
  • Purpose: Recently, a nomogram predicting overall survival after gastric resection was developed and externally validated in Korea and Japan. However, this gastric cancer nomogram was derived from large-volume centers, and its applicability in smaller centers must be proven. The purpose of this study is to externally validate the gastric cancer nomogram using a dataset from a medium-volume center in Korea. Materials and Methods: We retrospectively analyzed 610 patients who underwent radical gastrectomy for gastric cancer from August 1, 2005 to December 31, 2011. Age, sex, number of metastatic lymph nodes (LNs), number of examined LNs, depth of invasion, and location of the tumor were investigated as variables for validation of the nomogram. Both discrimination and calibration of the nomogram were evaluated. Results: Discrimination was evaluated using Harrell's C-index. The C-index was 0.83, indicating that the discrimination of the gastric cancer nomogram was appropriate. Regarding calibration, the 95% confidence interval of predicted survival appeared to lie on the ideal reference line except in the poorest survival group. However, we observed a tendency for actual survival to be consistently higher than predicted survival in this cohort. Conclusions: Although the discrimination power was good, actual survival was slightly higher than that predicted by the nomogram. This phenomenon might be explained by the longer life span of the recent patient cohort due to advances in adjuvant chemotherapy and improved nutritional status. Future gastric cancer nomograms should account for the lengthening of life span over time.
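
Harrell's C-index, used here to measure discrimination, can be sketched for right-censored data as follows. This is a plain textbook version that counts ties in predicted risk as half-concordant; the data are hypothetical and the study's own computation may differ in its handling of ties.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score died earlier.

    times: follow-up times; events: 1 if death observed, 0 if censored;
    risk_scores: higher means worse predicted survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up data (months) and predicted risks
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 0.83 indicates good discrimination even though calibration was imperfect.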

A Bibliometric Approach for Department-Level Disciplinary Analysis and Science Mapping of Research Output Using Multiple Classification Schemes

  • Gautam, Pitambar
    • Journal of Contemporary Eastern Asia / v.18 no.1 / pp.7-29 / 2019
  • This study describes an approach for comparative bibliometric analysis of scientific publications related to (i) one or several departments of a university, and (ii) broader integrated subject areas using multiple disciplinary schemes. It uses a custom dataset of scientific publications (ca. 15,000 articles and reviews, published during 2009-2013 and recorded in the Web of Science Core Collection) with author affiliations to the research departments of a comprehensive university dedicated to science, technology, engineering, mathematics, and medicine (STEMM). The dataset was first subjected to department-level and discipline-level analyses using the newly available KAKEN-L3 classification (based on the MEXT/JSPS Grants-in-Aid system), hierarchical clustering and correspondence analysis to decipher the major departmental and disciplinary clusters, and visualization of the department-discipline relationships using two-dimensional stacked bar diagrams. The next step involved the creation of subsets covering integrated subject areas and a comparative analysis of departmental contributions to a specific area (medical, health and life science) using several disciplinary schemes: Essential Science Indicators (ESI) 22 research fields, SCOPUS 27 subject areas, OECD Frascati 38 subordinate research fields, and KAKEN-L3 66 subject categories. To illustrate the effective use of science mapping techniques, the same subset for the medical, health and life science area was subjected to network analyses of keyword co-occurrences, bibliographic coupling of the publication sources, and co-citation of sources in the reference lists. The science mapping approach demonstrates ways to extract information on the prolific research themes, the journals most frequently used for publishing research findings, and the knowledge base underlying the research activities covered by the publications concerned.
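
The keyword co-occurrence analysis mentioned here starts from pair counts like the ones sketched below. The record format (one keyword list per publication) and the keywords themselves are hypothetical; the study's actual tooling is not named in the abstract.

```python
from collections import Counter
from itertools import combinations

def keyword_cooccurrence(records):
    """Count how often each unordered keyword pair appears in the same record."""
    counts = Counter()
    for keywords in records:
        # Deduplicate and sort so each pair has one canonical order
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical keyword lists, one per publication
records = [
    ["genomics", "cancer", "imaging"],
    ["cancer", "imaging"],
    ["genomics", "cancer"],
]
cooccurrence = keyword_cooccurrence(records)
print(cooccurrence.most_common())
```

The resulting counts serve as edge weights of a keyword network, which can then be clustered or visualized.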

Dynamic characteristics monitoring of wind turbine blades based on improved YOLOv5 deep learning model

  • W.H. Zhao;W.R. Li;M.H. Yang;N. Hong;Y.F. Du
    • Smart Structures and Systems / v.31 no.5 / pp.469-483 / 2023
  • The dynamic characteristics of wind turbine blades are usually monitored by contact sensors, which suffer from high cost, difficult installation, risk of damage to the structure, and difficult signal transmission. To address these problems, a non-contact method for monitoring the dynamic characteristics of wind turbine blades is proposed, based on computer vision and an improved YOLOv5 (You Only Look Once v5) deep learning model. First, the CSP (Cross Stage Partial) structure of the original YOLOv5l model is improved by introducing the CSP2_2 structure, which reduces the number of residual components to speed up network training. On this basis, the DeepSORT algorithm is incorporated to improve the accuracy of structural displacement monitoring. Second, because sample datasets for deep learning are difficult to collect, the Blender software is used to model the wind turbine structure under varied conditions, illumination, and other environments similar to those in engineering practice. In addition, combined with image expansion technology, a modeling-based dataset augmentation method is proposed. Finally, the feasibility of the proposed algorithm is verified by experiments, followed by an analysis of the influence of YOLOv5 models, lighting conditions and angles on the recognition results. The results show that the improved YOLOv5 model not only performs well compared with many other YOLOv5 models, but also achieves high accuracy in vibration monitoring in different environments. The method can accurately identify the dynamic characteristics of wind turbine blades and can therefore provide a reference for evaluating their condition.

Total Intracranial Volume Measurement for Children by Using an Automatized Program (자동화 프로그램을 이용한 아동의 전체두개강내용적 평가)

  • Lee, Jeonghwan;Kim, Ji-Eun;Im, Sungjin;Ju, Gawon;Kim, Siekyeong;Son, Jung-Woo;Shin, Chul-Jin;Lee, Sang-Ick;Ghim, Hei-Rhee
    • Korean Journal of Biological Psychiatry / v.21 no.3 / pp.81-86 / 2014
  • Objectives: Total intracranial volume (TIV) is a major nuisance variable in neuroimaging research on interindividual differences in brain structure and function. The authors aimed to verify the reliability of the atlas scaling factor (ASF) method for TIV estimation in FreeSurfer by comparing it with manual tracing as the reference method. Methods: The TIVs of 26 normal children and 26 children with attention-deficit hyperactivity disorder (ADHD) were obtained from T1-weighted images using FreeSurfer reconstruction and manual tracing. Manual tracing was performed on every 10th slice of the MRI dataset, starting from the midline of the sagittal plane, by one researcher who was blinded to the clinical data. Another researcher independently performed manual tracing on 20 randomly selected datasets to verify interrater reliability. Results: The interrater reliability was excellent (intraclass correlation coefficient = 0.91, p < 7.1e-07). There were no significant differences in age or gender distribution between the normal and ADHD groups. No significant differences were found between TIVs from the ASF method and manual tracing. A strong correlation was found between the TIVs from the two methods (r = 0.90, p < 2.2e-16). Conclusions: The ASF method for TIV estimation using FreeSurfer showed good agreement with the reference method. The TIV from the ASF method can be used for correction in analyses of structural and functional neuroimaging studies not only with elderly subjects but also with children, including those with ADHD.

The Integrational Operation Method for the Modeling of the Pan Evaporation and the Alfalfa Reference Evapotranspiration (증발접시 증발량과 알팔파 기준증발산량의 모형화를 위한 통합운영방법)

  • Kim, Sungwon;Kim, Hung Soo
    • KSCE Journal of Civil and Environmental Engineering Research / v.28 no.2B / pp.199-213 / 2008
  • The goal of this research is to develop and apply the integrational operation method (IOM) for modeling the monthly pan evaporation (PE) and the alfalfa reference evapotranspiration ($ET_r$). Since long-term lysimeter observations of the alfalfa $ET_r$ are not available in the Republic of Korea, the Penman-Monteith (PM) method is used to estimate the observed alfalfa $ET_r$. The IOM combines a stochastic model with neural network models. The stochastic model is applied to generate the training dataset for the monthly PE and the alfalfa $ET_r$, and the neural network models are applied to estimate the observed test dataset. Among the six training patterns considered, the 1,000/PARMA(1,1)/GRNNM-GA training pattern evaluates the suggested climatic variables very well and also produces reliable data for the monthly PE and the alfalfa $ET_r$. Uncertainty analysis is used to eliminate climatic variables from the input nodes of the 1,000/PARMA(1,1)/GRNNM-GA training pattern. The sensitive and insensitive climatic variables are identified from the uncertainty analysis of the input nodes. Finally, the IOM makes it possible to model the monthly PE and the alfalfa $ET_r$ simultaneously with the least cost and effort.