• Title/Summary/Keyword: Self-Supervised Learning

Self-Supervised Rigid Registration for Small Images

  • Ma, Ruoxin;Zhao, Shengjie;Cheng, Samuel
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.1 / pp.180-194 / 2021
  • For small image registration, feature-based approaches are likely to fail because feature detectors cannot find enough feature points in low-resolution images. The classic FFT approach is accurate, but registration can be relatively slow, taking several seconds per image pair. To achieve real-time, high-precision rigid registration for small images, we apply deep neural networks to supervised rigid transformation prediction, directly predicting the transformation parameters. We train deep registration models on rigidly transformed CIFAR-10 and STL-10 images, and evaluate their generalization ability on transformed CIFAR-10 images, STL-10 images, and randomly generated images. Experimental results show that the proposed deep registration models achieve accuracy comparable to the classic FFT approach for small CIFAR-10 images (32×32), and our LSTM registration model takes less than 1 ms to register one pair of images. For moderate-size STL-10 images (96×96), FFT significantly outperforms the deep registration models in accuracy but is also considerably slower. Our results suggest that deep registration models have competitive advantages over conventional approaches, at least for small images.
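
The abstract does not say which FFT method serves as the baseline. As a rough illustration of FFT-based registration in general, the sketch below uses standard phase correlation to recover a pure translation between two small images. It assumes NumPy and circular shifts, and it ignores rotation, which a full rigid registration would also need to estimate. The function name `phase_correlation_shift` is hypothetical.

```python
import numpy as np

def phase_correlation_shift(reference, moved):
    """Estimate the circular (dy, dx) shift that maps `reference` onto `moved`."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)
    # Normalised cross-power spectrum: keep only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    # The inverse transform peaks at the displacement between the two images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(
        int(p) if p <= size // 2 else int(p) - size
        for p, size in zip(peak, correlation.shape)
    )

# Illustrative 32x32 random image (same size as CIFAR-10, not CIFAR-10 data).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
shifted = np.roll(image, shift=(3, -5), axis=(0, 1))
print(phase_correlation_shift(image, shifted))  # (3, -5)
```

A learned registration model would replace this computation with a single forward pass that outputs the transformation parameters directly.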

CutPaste-Based Anomaly Detection Model using Multi Scale Feature Extraction in Time Series Streaming Data

  • Jeon, Byeong-Uk;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2787-2800 / 2022
  • An aging society brings more emergencies among elderly people living alone, as well as a variety of social crimes. To prevent them, techniques for detecting emergencies through voice are being actively researched. This study proposes a CutPaste-based anomaly detection model using multi-scale feature extraction in time series streaming data. In the proposed method, an audio file is first converted into a spectrogram, which makes it possible to use algorithms designed for image data, such as CNNs. Multi-scale feature extraction is then applied: three images drawn from Adaptive Pooling layers with different-sized kernels are merged. By accounting for various types of anomaly, including point, contextual, and collective anomalies, the method addresses the limitations of a conventional anomaly model. Finally, CutPaste-based anomaly detection is conducted. Because the model is trained through self-supervised learning, it can detect diverse emergency situations as anomalies without labeling. The proposed model therefore overcomes the limitation of conventional models, which classify only labeled emergency situations. In evaluation, the proposed model also performs better than a conventional anomaly detection model.
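
The CutPaste step can be pictured with a short sketch. CutPaste-style training generally creates synthetic anomalies by copying a patch of an input to another location, then trains a classifier to tell original inputs from augmented ones, so no anomaly labels are needed. The code below applies that idea to a 2D array standing in for a spectrogram. It assumes NumPy; the function name `cutpaste`, the patch sizes, and the input array are hypothetical and are not taken from the paper.

```python
import numpy as np

def cutpaste(spectrogram, patch_height, patch_width, rng):
    """Return a copy in which one patch is copied to another random location."""
    height, width = spectrogram.shape
    augmented = spectrogram.copy()
    # Pick the source patch and the destination independently at random.
    src_y = rng.integers(0, height - patch_height + 1)
    src_x = rng.integers(0, width - patch_width + 1)
    dst_y = rng.integers(0, height - patch_height + 1)
    dst_x = rng.integers(0, width - patch_width + 1)
    patch = spectrogram[src_y:src_y + patch_height, src_x:src_x + patch_width]
    augmented[dst_y:dst_y + patch_height, dst_x:dst_x + patch_width] = patch
    return augmented

# Placeholder "spectrogram" (frequency bins x time frames), not real audio.
rng = np.random.default_rng(0)
normal = rng.random((64, 64))
synthetic_anomaly = cutpaste(normal, patch_height=8, patch_width=16, rng=rng)
# Self-supervised pretext labels: 0 for the original, 1 for the augmented copy.
training_pairs = [(normal, 0), (synthetic_anomaly, 1)]
```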

3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training

  • Yeon-Seung Choo;Boeun Kim;Hyun-Sik Kim;Yong-Suk Park
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.670-684 / 2024
  • 3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D objects regardless of modalities, such as images, meshes, and point clouds. One of the most prominent methods used for 3DCMR is the Cross-Modal Center Loss Function (CLF), which applies the conventional center loss strategy to 3D cross-modal search and retrieval. Since CLF is based on center loss, the center features in CLF are also susceptible to subtle changes in hyperparameters and external inferences. For instance, performance degradation is observed when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF is unable to adapt to changes in batch size and is vulnerable to data variations that occur during actual inference, because it relies on a simple Euclidean distance between multi-modal features. To address the problems that arise from small batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach demonstrates improved performance on the ModelNet40 dataset compared to the conventional methods.
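
For readers unfamiliar with SimSiam, the sketch below shows its usual symmetric negative cosine similarity objective, which compares the predictor output of one view with the projected features of the other. It is a forward-pass illustration only, assuming NumPy and batches shaped `(batch, dim)`. In real training the `z` inputs are treated as constants (stop-gradient), which plain NumPy cannot express. The abstract does not describe how this loss is combined with the Noisy Center Loss, so that part is not shown.

```python
import numpy as np

def negative_cosine(p, z):
    """Mean negative cosine similarity between row-aligned batches p and z."""
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    # In training, z would be detached here (stop-gradient).
    return float(-(p * z).sum(axis=1).mean())

def simsiam_loss(p1, z1, p2, z2):
    """Symmetric SimSiam objective: each predictor output targets the other view."""
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)

# Placeholder features for two views of the same batch (not real model outputs).
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
p1, p2 = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
loss = simsiam_loss(p1, z1, p2, z2)  # lies in [-1, 1]; -1 means perfect agreement
```

Because the cosine depends only on feature directions, not on feature magnitudes, it behaves differently from a Euclidean (MSE) comparison.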

Comparative Analysis of Self-supervised Deephashing Models for Efficient Image Retrieval System (효율적인 이미지 검색 시스템을 위한 자기 감독 딥해싱 모델의 비교 분석)

  • Kim Soo In;Jeon Young Jin;Lee Sang Bum;Kim Won Gyum
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.519-524 / 2023
  • In hashing-based image retrieval, the hash code of a manipulated image differs from that of the original image, making it difficult to find the same image. This paper proposes and evaluates a self-supervised deephashing model that generates perceptual hash codes from image feature information such as texture, shape, and color. The comparison models are autoencoder-based variational inference models whose encoders are designed with a fully connected layer, a convolutional neural network, or a transformer module. The proposed model is a variational inference model that includes a SimAM module for extracting geometric patterns and positional relationships within images. The SimAM module can learn latent vectors that highlight objects or local regions through an energy function based on the activation values of a neuron and its surrounding neurons. The proposed method is a representation learning model that generates low-dimensional latent vectors from high-dimensional input images, and the latent vectors are binarized into distinguishable hash codes. Experimental results on public datasets such as CIFAR-10, ImageNet, and NUS-WIDE show that the proposed model outperforms the comparison models and performs on par with supervised learning-based deephashing models. The proposed model can be used in application systems that require low-dimensional representations of images, such as image search or copyrighted image identification.
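
The last step of such a pipeline, turning a latent vector into a hash code and comparing codes, can be sketched as follows. The abstract does not state the binarization rule, so thresholding at zero is an assumption made here for illustration, and the latent values are made up. The function names `binarize` and `hamming_distance` are hypothetical.

```python
def binarize(latent, threshold=0.0):
    """Map each latent value to one bit (assumed rule: 1 if above threshold)."""
    return [1 if value > threshold else 0 for value in latent]

def hamming_distance(code_a, code_b):
    """Count the bit positions where two equal-length hash codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))

# Made-up latent vectors for an original image and a slightly manipulated copy.
original_code = binarize([0.3, -1.2, 0.8, -0.1])      # [1, 0, 1, 0]
manipulated_code = binarize([0.4, 0.2, 0.7, -0.3])    # [1, 1, 1, 0]
print(hamming_distance(original_code, manipulated_code))  # 1
```

Retrieval then amounts to ranking stored codes by Hamming distance to the query code. A perceptual hash is useful when manipulated copies of an image stay within a small distance of the original.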

Recent Trends in Self-Supervised Learning-Based Artificial Intelligence Technology (최신 자가 학습 기반의 인공지능 기술 동향)

  • Kim, Seung-Ryong
    • Broadcasting and Media Magazine / v.27 no.2 / pp.19-25 / 2022
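  • This article discusses trends in and future directions for self-supervised learning, one of the most actively researched topics in computer vision in recent years. Self-supervised learning in computer vision has recently been studied intensively using contrastive learning, and a large body of work has addressed the question of how to select good positives and negatives. We review several representative methods along these lines, note their limitations, and discuss the direction that self-supervised learning in computer vision should take.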

Pipeline Structural Damage Detection Using Self-Sensing Technology and PNN-Based Pattern Recognition (자율 감지 및 확률론적 신경망 기반 패턴 인식을 이용한 배관 구조물 손상 진단 기법)

  • Lee, Chang-Gil;Park, Woong-Ki;Park, Seung-Hee
    • Journal of the Korean Society for Nondestructive Testing / v.31 no.4 / pp.351-359 / 2011
  • In a structure, damage can occur at several scales, from micro-cracking to corrosion or loose bolts. This makes damage difficult to identify with a single mode of sensing. Hence, a multi-mode actuated sensing system is proposed based on a self-sensing circuit using a piezoelectric sensor. In the self-sensing-based multi-mode actuated sensing, one mode provides a wide frequency-band structural response from the self-sensed impedance measurement and the other mode provides a specific frequency-induced structural wavelet response from the self-sensed guided wave measurement. In this study, an experiment on a pipeline system is carried out to verify the effectiveness and robustness of the proposed structural health monitoring approach. Different types of structural damage are artificially inflicted on the pipeline system. To classify the multiple types of structural damage, supervised learning-based statistical pattern recognition is implemented by composing a two-dimensional space from the damage indices extracted from the impedance and guided wave features. For more systematic damage classification, several control parameters that determine an optimal decision boundary for the supervised learning-based pattern recognition are optimized. Finally, further research issues for real-world implementation of the proposed approach are discussed.

Damage Detection of CFRP-Laminated Concrete based on a Continuous Self-Sensing Technology (셀프센싱 상시계측 기반 CFRP보강 콘크리트 구조물의 손상검색)

  • Kim, Young-Jin;Park, Seung-Hee;Jin, Kyu-Nam;Lee, Chang-Gil
    • Land and Housing Review / v.2 no.4 / pp.407-413 / 2011
  • This paper reports a novel structural health monitoring (SHM) technique for detecting de-bonding between a concrete beam and CFRP (Carbon Fiber Reinforced Polymer) sheet that is attached to the concrete surface. To achieve this, a multi-scale actuated sensing system with a self-sensing circuit using piezoelectric active sensors is applied to the CFRP laminated concrete beam structure. In this self-sensing based multi-scale actuated sensing, one scale provides a wide frequency-band structural response from the self-sensed impedance measurements and the other scale provides a specific frequency-induced structural wavelet response from the self-sensed guided wave measurement. To quantify the de-bonding levels, the supervised learning-based statistical pattern recognition was implemented by composing a two-dimensional (2D) plane using the damage indices extracted from the impedance and guided wave features.

Wifi Fingerprint Calibration Using Semi-Supervised Self Organizing Map (반지도식 자기조직화지도를 이용한 wifi fingerprint 보정 방법)

  • Thai, Quang Tung;Chung, Ki-Sook;Keum, Changsup
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.2 / pp.536-544 / 2017
  • Wireless RSSI (Received Signal Strength Indication) fingerprinting is one of the most popular methods for indoor positioning, as it provides reasonable accuracy while exploiting existing wireless infrastructure. However, radio map construction (also known as fingerprint calibration) is laborious and time-consuming, because precise physical coordinates and wireless signals have to be measured at multiple locations in the target environment. This paper proposes a method that builds the map from a combination of RSSIs without location information, collected in a crowdsourcing fashion, and a handful of labeled RSSIs, using a semi-supervised self-organizing map learning algorithm. Experiments on simulated data show promising results: the method recovers the full map effectively with only 1% of the RSSI samples from the fingerprint database.

Current Trend and Direction of Deep Learning Method to Railroad Defect Detection and Inspection

  • Han, Seokmin
    • International Journal of Internet, Broadcasting and Communication / v.14 no.3 / pp.149-154 / 2022
  • In recent years, deep learning methods applied to computer vision have achieved strong performance, and many research projects have accordingly applied deep learning to railroad defect detection. In this paper, we review studies that applied computer vision-based deep learning methods to railroad defect detection and inspection, and discuss their current trends and directions. Many of these projects aimed to operate automatically, without human visual inspection, and to work in real time. Methods to speed up computation were therefore also investigated, and reducing the number of learnable parameters was considered important for computational efficiency. In addition to computation speed, some projects discussed the problem of annotation. To alleviate time-consuming annotation, automatic segmentation of railroad defects and self-supervised methods have been suggested.

Intelligent Hybrid Fusion Algorithm with Vision Patterns for Generation of Precise Digital Road Maps in Self-driving Vehicles

  • Jung, Juho;Park, Manbok;Cho, Kuk;Mun, Cheol;Ahn, Junho
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.10 / pp.3955-3971 / 2020
  • Due to the significant increase in the use of autonomous car technology, it is essential to integrate this technology with high-precision digital map data that contain more precise and accurate roadway information than existing conventional map resources, to ensure the safety of self-driving operations. While existing map technologies may help vehicles identify their locations via the Global Positioning System, it is difficult to update these maps to reflect environmental changes on roadways. Roadway vision algorithms can be useful for building autonomous vehicles that can avoid accidents and detect real-time location changes. We incorporate a hybrid architectural design that combines unsupervised classification of vision data with supervised joint fusion classification to achieve a more noise-resistant algorithm. We identify, via a deep learning approach, an intelligent hybrid fusion algorithm for fusing multimodal vision feature data for roadway classification, and characterize its improvement in accuracy over unsupervised identification using image processing and over supervised vision classifiers. We analyzed over 93,000 vision frames collected from a test vehicle on real roadways. The proposed hybrid fusion algorithm was evaluated for the generation of roadway digital maps for autonomous vehicles, achieving a recall of 0.94, a precision of 0.96, and an accuracy of 0.92.