• Title/Summary/Keyword: datasets

Search Result 2,081, Processing Time 0.023 seconds

Presentation Attacks in Palmprint Recognition Systems

  • Sun, Yue;Wang, Changkun
    • Journal of Multimedia Information System
    • /
    • v.9 no.2
    • /
    • pp.103-112
    • /
    • 2022
  • Background: A presentation attack places the printed image or displayed video at the front of the sensor to deceive the biometric recognition system. Usually, presentation attackers steal a genuine user's biometric image and use it for presentation attack. In recent years, reconstruction attack and adversarial attack can generate high-quality fake images, and have high attack success rates. However, their attack rates degrade remarkably after image shooting. Methods: In order to comprehensively analyze the threat of presentation attack to palmprint recognition system, this paper makes six palmprint presentation attack datasets. The datasets were tested on texture coding-based recognition methods and deep learning-based recognition methods. Results and conclusion: The experimental results show that the presentation attack caused by the leakage of the original image has a high success rate and a great threat; while the success rates of reconstruction attack and adversarial attack decrease significantly.

A Study on the Classification Model of Minhwa Genre Based on Deep Learning (딥러닝 기반 민화 장르 분류 모델 연구)

  • Yoon, Soorim;Lee, Young-Suk
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.10
    • /
    • pp.1524-1534
    • /
    • 2022
  • This study proposes the classification model of Minhwa genre based on object detection of deep learning. To detect unique Korean traditional objects in Minhwa, we construct custom datasets by labeling images using object keywords in Minhwa DB. We train YOLOv5 models with custom datasets, and classify images using predicted object labels result, the output of model training. The algorithm consists of two classification steps: 1) according to the painting technique and 2) genre of Minhwa. Through classifying paintings using this algorithm on the Internet, it is expected that the correct information of Minhwa can be built and provided to users forward.

Integrative Multi-Omics Approaches in Cancer Research: From Biological Networks to Clinical Subtypes

  • Heo, Yong Jin;Hwa, Chanwoong;Lee, Gang-Hee;Park, Jae-Min;An, Joon-Yong
    • Molecules and Cells
    • /
    • v.44 no.7
    • /
    • pp.433-443
    • /
    • 2021
  • Multi-omics approaches are novel frameworks that integrate multiple omics datasets generated from the same patients to better understand the molecular and clinical features of cancers. A wide range of emerging omics and multi-view clustering algorithms now provide unprecedented opportunities to further classify cancers into subtypes, improve the survival prediction and therapeutic outcome of these subtypes, and understand key pathophysiological processes through different molecular layers. In this review, we overview the concept and rationale of multi-omics approaches in cancer research. We also introduce recent advances in the development of multi-omics algorithms and integration methods for multiple-layered datasets from cancer patients. Finally, we summarize the latest findings from large-scale multi-omics studies of various cancers and their implications for patient subtyping and drug development.

A Study of Real-time Semantic Segmentation Performance Improvement in Unstructured Outdoor Environment (비정형 야지환경 주행상황에서의 실시간 의미론적 영상 분할 알고리즘 성능 향상에 관한 연구)

  • Daeyoung, Kim;Seunguk, Ahn;Seung-Woo, Seo
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.25 no.6
    • /
    • pp.606-616
    • /
    • 2022
  • Semantic segmentation in autonomous driving for unstructured environments is challenging due to the presence of uneven terrains, unstructured class boundaries, irregular features and strong textures. Current off-road datasets exhibit difficulties like class imbalance and understanding of varying environmental topography. To overcome these issues, we propose a deep learning framework for semantic segmentation that involves a pooled class semantic segmentation with five classes. The evaluation of the framework is carried out on two off-road driving datasets, RUGD and TAS500. The results show that our proposed method achieves high accuracy and real-time performance.

Density-based Outlier Detection in Multi-dimensional Datasets

  • Wang, Xite;Cao, Zhixin;Zhan, Rongjuan;Bai, Mei;Ma, Qian;Li, Guanyu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.12
    • /
    • pp.3815-3835
    • /
    • 2022
  • Density-based outlier detection is one of the hot issues in data mining. A point is determined as outlier on basis of the density of points near them. The existing density-based detection algorithms have high time complexity, in order to reduce the time complexity, a new outlier detection algorithm DODMD (Density-based Outlier Detection in Multidimensional Datasets) is proposed. Firstly, on the basis of ZH-tree, the concept of micro-cluster is introduced. Each leaf node is regarded as a micro-cluster, and the micro-cluster is calculated to achieve the purpose of batch filtering. In order to obtain n sets of approximate outliers quickly, a greedy method is used to calculate the boundary of LOF and mark the minimum value as LOFmin. Secondly, the outliers can filtered out by LOFmin, the real outliers are calculated, and then the result set is updated to make the boundary closer. Finally, the accuracy and efficiency of DODMD algorithm are verified on real dataset and synthetic dataset respectively.

Performance Improvement of Fuzzy C-Means Clustering Algorithm by Optimized Early Stopping for Inhomogeneous Datasets

  • Chae-Rim Han;Sun-Jin Lee;Il-Gu Lee
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.3
    • /
    • pp.198-207
    • /
    • 2023
  • Responding to changes in artificial intelligence models and the data environment is crucial for increasing data-learning accuracy and inference stability of industrial applications. A learning model that is overfitted to specific training data leads to poor learning performance and a deterioration in flexibility. Therefore, an early stopping technique is used to stop learning at an appropriate time. However, this technique does not consider the homogeneity and independence of the data collected by heterogeneous nodes in a differential network environment, thus resulting in low learning accuracy and degradation of system performance. In this study, the generalization performance of neural networks is maximized, whereas the effect of the homogeneity of datasets is minimized by achieving an accuracy of 99.7%. This corresponds to a decrease in delay time by a factor of 2.33 and improvement in performance by a factor of 2.5 compared with the conventional method.

A surrogate model for the helium production rate in fast reactor MOX fuels

  • D. Pizzocri;M.G. Katsampiris;L. Luzzi;A. Magni;G. Zullo
    • Nuclear Engineering and Technology
    • /
    • v.55 no.8
    • /
    • pp.3071-3079
    • /
    • 2023
  • Helium production in the nuclear fuel matrix during irradiation plays a critical role in the design and performance of Gen-IV reactor fuel, as it represents a life-limiting factor for the operation of fuel pins. In this work, a surrogate model for the helium production rate in fast reactor MOX fuels is developed, targeting its inclusion in engineering tools such as fuel performance codes. This surrogate model is based on synthetic datasets obtained via the SCIANTIX burnup module. Such datasets are generated using Latin hypercube sampling to cover the range of input parameters (e.g., fuel initial composition, fission rate density, and irradiation time) and exploiting the low computation requirement of the burnup module itself. The surrogate model is verified against the SCIANTIX burnup module results for helium production with satisfactory performance.

Saliency-Assisted Collaborative Learning Network for Road Scene Semantic Segmentation

  • Haifeng Sima;Yushuang Xu;Minmin Du;Meng Gao;Jing Wang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.3
    • /
    • pp.861-880
    • /
    • 2023
  • Semantic segmentation of road scene is the key technology of autonomous driving, and the improvement of convolutional neural network architecture promotes the improvement of model segmentation performance. The existing convolutional neural network has the simplification of learning knowledge and the complexity of the model. To address this issue, we proposed a road scene semantic segmentation algorithm based on multi-task collaborative learning. Firstly, a depthwise separable convolution atrous spatial pyramid pooling is proposed to reduce model complexity. Secondly, a collaborative learning framework is proposed involved with saliency detection, and the joint loss function is defined using homoscedastic uncertainty to meet the new learning model. Experiments are conducted on the road and nature scenes datasets. The proposed method achieves 70.94% and 64.90% mIoU on Cityscapes and PASCAL VOC 2012 datasets, respectively. Qualitatively, Compared to methods with excellent performance, the method proposed in this paper has significant advantages in the segmentation of fine targets and boundaries.

Data augmentation technique based on image binarization for constructing large-scale datasets (대형 이미지 데이터셋 구축을 위한 이미지 이진화 기반 데이터 증강 기법)

  • Lee JuHyeok;Kim Mi Hui
    • Journal of IKEEE
    • /
    • v.27 no.1
    • /
    • pp.59-64
    • /
    • 2023
  • Deep learning can solve various computer vision problems, but it requires a large dataset. Data augmentation technique based on image binarization for constructing large-scale datasets is proposed in this paper. By extracting features using image binarization and randomly placing the remaining pixels, new images are generated. The generated images showed similar quality to the original images and demonstrated excellent performance in deep learning models.

Recent deep learning methods for tabular data

  • Yejin Hwang;Jongwoo Song
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.2
    • /
    • pp.215-226
    • /
    • 2023
  • Deep learning has made great strides in the field of unstructured data such as text, images, and audio. However, in the case of tabular data analysis, machine learning algorithms such as ensemble methods are still better than deep learning. To keep up with the performance of machine learning algorithms with good predictive power, several deep learning methods for tabular data have been proposed recently. In this paper, we review the latest deep learning models for tabular data and compare the performances of these models using several datasets. In addition, we also compare the latest boosting methods to these deep learning methods and suggest the guidelines to the users, who analyze tabular datasets. In regression, machine learning methods are better than deep learning methods. But for the classification problems, deep learning methods perform better than the machine learning methods in some cases.