• Title/Summary/Keyword: Image Feature


2D-MELPP: A two dimensional matrix exponential based extension of locality preserving projections for dimensional reduction

  • Xiong, Zixun;Wan, Minghua;Xue, Rui;Yang, Guowei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.9 / pp.2991-3007 / 2022
  • Two-dimensional locality preserving projections (2D-LPP) is an improved algorithm for 2D images that addresses the small sample size (SSS) problem encountered by locality preserving projections (LPP). It can find a low-dimensional manifold mapping that not only preserves local information but also detects the manifold embedded in the original data space. 2D-LPP is simple and elegant. However, comparison experiments between two-dimensional linear discriminant analysis (2D-LDA) and linear discriminant analysis (LDA) indicated that matrix-based methods do not always perform better, even when training samples are limited. Inspired by this, we surmise that 2D-LPP may meet the same limitation as 2D-LDA and propose a novel matrix exponential method, 2D-MELPP, to enhance the performance of 2D-LPP. 2D-MELPP is equivalent to employing distance diffusion mapping to transform the original images into a new space in which the margins between labels are broadened, which is beneficial for solving classification problems. Nonetheless, the computational time complexity of 2D-MELPP is extremely high. In this paper, we replace some of the matrix multiplications with multiple multiplications to save memory cost and provide an efficient way of solving 2D-MELPP. We test it on public databases (a random 3D data set, ORL, the AR face database, and the PolyU Palmprint database) and compare it with other 2D methods such as 2D-LDA and 2D-LPP and with 1D methods such as LPP and exponential locality preserving projections (ELPP), finding that it outperforms the others in recognition accuracy. We also compare different dimensions of the projection vector and record the time cost on the ORL, AR face, and PolyU Palmprint databases. The experimental results show that our algorithm performs better on three independent public databases.
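
One property that makes a matrix exponential attractive under the SSS problem is that, for any square matrix S, det(exp(S)) = exp(trace(S)) > 0, so exp(S) is invertible even when S itself is singular. The sketch below checks this numerically on a hypothetical rank-deficient 2x2 matrix using a truncated Taylor series. The helper names (`mat_mul`, `mat_exp`, `det2`) and the example matrix are illustrative assumptions, and this is not the 2D-MELPP algorithm itself.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Approximate exp(a) with the truncated series sum_k a^k / k!."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term a^k / k!, starts at identity
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical singular matrix (rank 1), standing in for a scatter-like
# matrix that cannot be inverted when training samples are scarce.
S = [[1.0, 2.0], [2.0, 4.0]]
E = mat_exp(S)
trace_S = S[0][0] + S[1][1]

print(det2(S))                     # S is singular
print(det2(E), math.exp(trace_S))  # the two values should agree closely
```

The hand-written series is only adequate for small, well-scaled matrices; in practice a library routine such as `scipy.linalg.expm` would normally be used instead.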

Degradation Quantification Method and Degradation and Creep Life Prediction Method for Nickel-Based Superalloys Based on Bayesian Inference (베이지안 추론 기반 니켈기 초합금의 열화도 정량화 방법과 열화도 및 크리프 수명 예측의 방법)

  • Junsang, Yu;Hayoung, Oh
    • Journal of the Korea Institute of Information and Communication Engineering / v.27 no.1 / pp.15-26 / 2023
  • The purpose of this study is to determine an artificial intelligence-based degradation index from scanning electron microscope images of the microstructure cross-section of specimens obtained by creep testing DA-5161 SX, a nickel-based superalloy used as a material for high-temperature parts. It proposes a new quantification method, a model that predicts degradation based on Bayesian inference without destroying the high-temperature parts of operating equipment, and a creep life prediction model that predicts the Larson-Miller Parameter (LMP). The new degradation indexing method infers a consistent representative value from a small number of images based on the geometrical characteristics of the gamma prime phase in the nickel-based superalloy microstructure, and the prediction method estimates the degradation index and LMP from information on the environmental conditions of the material without destroying high-temperature parts.
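
For readers unfamiliar with the Larson-Miller Parameter, it is commonly written as LMP = T (C + log10 t_r), with T the absolute temperature in kelvin and t_r the rupture time in hours. The sketch below evaluates that textbook form and inverts it. The constant C = 20 is a frequently used default and is an assumption here, as are the function names and the sample temperature and time; the abstract does not state which constant or units the authors used, and this is not their Bayesian prediction model.

```python
import math

def larson_miller_parameter(temp_kelvin, rupture_hours, c=20.0):
    """Textbook form LMP = T * (C + log10(t_r)); C = 20 is an assumed default."""
    if temp_kelvin <= 0 or rupture_hours <= 0:
        raise ValueError("temperature and rupture time must be positive")
    return temp_kelvin * (c + math.log10(rupture_hours))

def rupture_hours_from_lmp(lmp, temp_kelvin, c=20.0):
    """Invert the relation to estimate rupture time at a given temperature."""
    return 10 ** (lmp / temp_kelvin - c)

# Hypothetical condition: 950 degrees C (1223.15 K) and 1000 h to rupture.
lmp = larson_miller_parameter(1223.15, 1000.0)
print(lmp)
print(rupture_hours_from_lmp(lmp, 1223.15))  # recovers roughly 1000 h
```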

Development of Fast Posture Classification System for Table Tennis Robot (탁구 로봇을 위한 빠른 자세 분류 시스템 개발)

  • Jin, Seongho;Kwon, Yongwoo;Kim, Yoonjeong;Park, Miyoung;An, Jaehoon;Kang, Hosun;Choi, Jiwook;Lee, Inho
    • The Journal of Korea Robotics Society / v.17 no.4 / pp.463-476 / 2022
  • In this paper, we propose a table tennis posture classification system that uses a cooperative robot, with the aim of developing a table tennis robot that allows training as in a real game. The ideal table tennis robot would have a high joint driving speed and a high degree of freedom. We therefore use a cooperative robot, which has sufficient degrees of freedom. However, cooperative robots have the disadvantage of slow joint driving speed, a shortcoming that is expected to be overcome through quick recognition. We therefore try to classify the opponent's posture quickly to compensate for the slow joint driving speed. To this end, dynamic postures were learned using image data as input, three classification models were created, and comparative experiments and evaluations were performed on the designated dynamic postures. The comparative experimental data show that the classification model using an MLP (Multi-Layer Perceptron) achieves the highest classification accuracy and the fastest classification speed, which demonstrates the validity of the proposed algorithm.

A Radiomics-based Unread Cervical Imaging Classification Algorithm (자궁경부 영상에서의 라디오믹스 기반 판독 불가 영상 분류 알고리즘 연구)

  • Kim, Go Eun;Kim, Young Jae;Ju, Woong;Nam, Kyehyun;Kim, Soonyung;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research / v.42 no.5 / pp.241-249 / 2021
  • Recently, artificial intelligence-based diagnosis systems for obstetric diseases have been actively studied. Artificial intelligence diagnostic assist systems, which bring efficiency and accuracy benefits to medical diagnosis, may suffer from poor learning accuracy and reliability when inappropriate images are used as the model's input data. For this reason, we propose an algorithm that excludes unread cervical images before learning. A total of 2,000 read cervical images and 257 unread cervical images were used in this study. Experiments were conducted using the statistical method Radiomics to extract feature values from the entire image, classify unread images, and obtain a range of read threshold values. The degree to which brightness, blur, and the cervical region were adequately captured in the image served as classification indicators. We compared classification performance by training a deep learning classification model on read cervical images classified by the proposed algorithm and on unread cervical images, and evaluated the algorithm's classification accuracy for unread cervical images through this comparison. Images processed by the algorithm showed a higher accuracy, 91.6% on average. The proposed algorithm is expected to improve reliability by effectively excluding unread cervical images, ultimately reducing errors in artificial intelligence diagnosis.

Improving Adversarial Robustness via Attention (Attention 기법에 기반한 적대적 공격의 강건성 향상 연구)

  • Jaeuk Kim;Myung Gyo Oh;Leo Hyun Park;Taekyoung Kwon
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.4 / pp.621-631 / 2023
  • Adversarial training improves the robustness of deep neural networks against adversarial examples. However, previous adversarial training methods focus only on the adversarial loss function, ignoring that even a small perturbation of the input layer causes a significant change in the hidden layer features. Consequently, the accuracy of a defended model is reduced in various untrained situations, such as clean samples or other attack techniques. An architectural perspective is therefore necessary to improve feature representation power and solve this problem. In this paper, we apply an attention module that generates an attention map of an input image to a general model and perform PGD adversarial training on the augmented model. In our experiments on the CIFAR-10 dataset, the attention-augmented model showed higher accuracy than the general model regardless of the network structure. In particular, the robust accuracy of our approach was consistently higher for various attacks such as PGD, FGSM, and BIM, as well as for more powerful adversaries. By visualizing the attention map, we further confirmed that the attention module extracts features of the correct class even for adversarial examples.
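
To make the PGD step concrete, the sketch below runs an L-infinity projected gradient descent attack against a tiny logistic-regression model, standing in for a deep network. Each iteration moves the input along the sign of the loss gradient and then projects it back into the eps-ball around the clean input and the valid range [0, 1]. The model weights, input, eps, alpha, and step count are all hypothetical, the random start used in many PGD variants is omitted, and adversarial training would additionally update the model on the resulting examples, which is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_attack(w, b, x0, y, eps=0.1, alpha=0.02, steps=10):
    """L-infinity PGD: signed gradient ascent on the loss, then projection."""
    x = list(x0)
    for _ in range(steps):
        grad = loss_gradient_wrt_input(w, b, x, y)
        stepped = [xi + alpha * sign(g) for xi, g in zip(x, grad)]
        # Project into the eps-ball around x0 and the valid input range [0, 1].
        x = [min(max(s, max(c - eps, 0.0)), min(c + eps, 1.0))
             for s, c in zip(stepped, x0)]
    return x

# Hypothetical model and clean input with true label 1.
w, b = [2.0, -1.0, 0.5], 0.0
x_clean, y_true = [0.6, 0.2, 0.5], 1
x_adv = pgd_attack(w, b, x_clean, y_true)

print(predict(w, b, x_clean))  # confidence in the true class, clean input
print(predict(w, b, x_adv))    # lower confidence after the attack
```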

A loop closing scheme using UWB based indoor positioning technique (UWB 기반 실내 측위 기술을 활용한 루프 클로징 기법)

  • Hyunwoo You;Jungkyun Lee;Somi Nam;Juyeon Lee;Yoonseo Lee;Minsung Kim;Hong Min
    • Smart Media Journal / v.12 no.4 / pp.41-46 / 2023
  • UWB is a technology used for indoor positioning and is characterized by higher accuracy than RSSI-based schemes. Mobile equipment operating on ROS can monitor its surrounding environment using lidar and cameras. When the loop closing technique is applied to determine the starting position during this monitoring, the existing method suffers from low accuracy because the closing operation occurs only when there are feature points in the image. In this paper, to solve this problem, we design a system that increases the accuracy of loop closing by mounting a UWB tag on the mobile device to provide location information. In addition, the accuracy of the UWB-based indoor positioning system was evaluated through experiments, and it was verified that it can be used for loop closing.

Efficient Thread Allocation Method of Convolutional Neural Network based on GPGPU (GPGPU 기반 Convolutional Neural Network의 효율적인 스레드 할당 기법)

  • Kim, Mincheol;Lee, Kwangyeob
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.10 / pp.935-943 / 2017
  • CNN (Convolutional Neural Network), which is used for image classification and speech recognition among neural networks that learn from positive data, has been continuously developed toward higher-performance structures. However, it is difficult to use in an embedded system with limited resources. We therefore use pre-learned weights together with GPGPU (General-Purpose Computing on Graphics Processing Units), which uses the GPU for general-purpose operations, but limitations remain. Since CNN performs simple and iterative operations, the computation speed varies greatly depending on the thread allocation and utilization method in a Single Instruction Multiple Thread (SIMT) based GPGPU. When convolution and pooling operations are performed with threads, some threads are left idle. The proposed method increases the operation speed by using these remaining threads for the following feature map and kernel calculations.

Transfer Learning-Based Vibration Fault Diagnosis for Ball Bearing (전이학습을 이용한 볼베어링의 진동진단)

  • Subin Hong;Youngdae Lee;Chanwoo Moon
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.845-850 / 2023
  • In this paper, we propose a method for diagnosing ball bearing vibration using transfer learning. The STFT (short-time Fourier transform), which analyzes vibration signals in the time-frequency domain, was used as input to a CNN to diagnose failures. To train CNN-based deep artificial neural networks rapidly and improve diagnostic performance, we propose a transfer learning-based deep learning training technique. For transfer learning, the feature extractor and classifier were selectively trained using a VGG-based image classification model. The training data set was the publicly available ball bearing vibration data provided by Case Western Reserve University, and performance was evaluated by comparing the proposed method with an existing CNN model. The experimental results not only show that transfer learning is useful for condition diagnosis with ball bearing vibration data, but also suggest that other industries can use transfer learning to improve condition diagnosis.
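
The STFT step can be sketched as follows: the signal is cut into overlapping frames, each frame is multiplied by a window, and a discrete Fourier transform of each frame gives one column of a time-frequency magnitude map that could then be treated as an image. The frame length, hop size, Hann window, sampling rate, and synthetic test tone below are all assumptions chosen for illustration; the abstract does not give the authors' settings, and the CNN and transfer-learning stages are not shown.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes for bins 0..N/2."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]  # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# Synthetic stand-in for a vibration signal: a 128 Hz tone sampled at 1024 Hz.
# With 64-sample frames the bin spacing is 16 Hz, so the tone sits in bin 8.
sample_rate = 1024
tone = [math.sin(2 * math.pi * 128 * n / sample_rate) for n in range(256)]
spectrogram = stft_magnitude(tone)

peak_bins = [max(range(len(frame)), key=frame.__getitem__)
             for frame in spectrogram]
print(len(spectrogram), len(spectrogram[0]))  # frames x frequency bins
print(peak_bins)
```

The direct DFT is quadratic in the frame length and is used only to keep the sketch dependency-free; an FFT-based routine would be the usual choice for real signals.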

Improving Field Crop Classification Accuracy Using GLCM and SVM with UAV-Acquired Images

  • Seung-Hwan Go;Jong-Hwa Park
    • Korean Journal of Remote Sensing / v.40 no.1 / pp.93-101 / 2024
  • Accurate field crop classification is essential for various agricultural applications, yet existing methods face challenges due to diverse crop types and complex field conditions. This study aimed to address these issues by combining support vector machine (SVM) models with multi-seasonal unmanned aerial vehicle (UAV) images, texture information extracted from Gray Level Co-occurrence Matrix (GLCM), and RGB spectral data. Twelve high-resolution UAV image captures spanned March-October 2021, while field surveys on three dates provided ground truth data. We focused on data from August (-A), September (-S), and October (-O) images and trained four support vector classifier (SVC) models (SVC-A, SVC-S, SVC-O, SVC-AS) using visual bands and eight GLCM features. Farm maps provided by the Ministry of Agriculture, Food and Rural Affairs proved efficient for open-field crop identification and served as a reference for accuracy comparison. Our analysis showcased the significant impact of hyperparameter tuning (C and gamma) on SVM model performance, requiring careful optimization for each scenario. Importantly, we identified models exhibiting distinct high-accuracy zones, with SVC-O trained on October data achieving the highest overall and individual crop classification accuracy. This success likely stems from its ability to capture distinct texture information from mature crops. Incorporating GLCM features proved highly effective for all models, significantly boosting classification accuracy. Among these features, homogeneity, entropy, and correlation consistently demonstrated the most impactful contribution. However, balancing accuracy with computational efficiency and feature selection remains crucial for practical application. Performance analysis revealed that SVC-O achieved exceptional results in overall and individual crop classification, while soybeans and rice were consistently classified well by all models. Challenges were encountered with cabbage due to its early growth stage and low field cover density. The study demonstrates the potential of utilizing farm maps and GLCM features in conjunction with SVM models for accurate field crop classification. Careful parameter tuning and model selection based on specific scenarios are key for optimizing performance in real-world applications.
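
As an illustration of the texture features named above, the sketch below builds a normalized, symmetric GLCM for a single pixel offset and derives homogeneity, entropy, and correlation from it. The 4x4 test image, the horizontal offset, the number of gray levels, and the exact feature definitions (for example homogeneity with a squared difference in the denominator and entropy with the natural logarithm) are assumptions; definitions vary between toolkits, and the abstract does not say which ones the authors used. The SVM classification stage is not shown.

```python
import math

def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def homogeneity(p):
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))

def entropy(p):
    return -sum(v * math.log(v) for row in p for v in row if v > 0)

def correlation(p):
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        return 1.0  # assumed convention for a perfectly uniform texture
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return cov / math.sqrt(var_i * var_j)

# Hypothetical 4x4 patch quantized to 4 gray levels.
patch = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(patch, levels=4)
print(homogeneity(p), entropy(p), correlation(p))
```

In a pipeline like the one described, such values would be computed per window or per field polygon and concatenated with the RGB band values to form the classifier's feature vector.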

A deep and multiscale network for pavement crack detection based on function-specific modules

  • Guolong Wang;Kelvin C.P. Wang;Allen A. Zhang;Guangwei Yang
    • Smart Structures and Systems / v.32 no.3 / pp.135-151 / 2023
  • Using 3D asphalt pavement surface data, a deep and multiscale network named CrackNet-M is proposed in this paper for pixel-level crack detection, with improvements in both accuracy and robustness. CrackNet-M consists of four function-specific architectural modules: a central branch net (CBN), a crack map enhancement (CME) module, three pooling feature pyramids (PFP), and an output layer. The CBN maintains crack boundaries by using no pooling reductions throughout all convolutional layers. The CME applies a pooling layer to enhance potential thin cracks for better continuity, incurring no data loss or attenuation when working jointly with the CBN. The PFP modules implement direct down-sampling and pyramidal up-sampling with multiscale contexts, specifically for the detection of thick cracks and the exclusion of non-crack patterns. Finally, the output layer is optimized with a proposed skip layer supervision technique to further improve network performance. Compared with traditional supervision, the skip layer supervision brings about not only significant performance gains in both accuracy and robustness but also a faster convergence rate. CrackNet-M was trained on a total of 2,500 pixel-wise annotated 3D pavement images and finely scaled with another 200 images, with full consideration of accuracy and efficiency. CrackNet-M can potentially achieve real-time crack detection with a processing speed of 40 ms/image. The experimental results on 500 testing images demonstrate that CrackNet-M can effectively detect both thick and thin cracks on various pavement surfaces with a high level of Precision (94.28%), Recall (93.89%), and F-measure (94.04%). In addition, the proposed CrackNet-M compares favorably to other well-developed networks with respect to the detection of thin cracks as well as the removal of shoulder drop-offs.
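
For reference, the three reported metrics can be computed from a predicted binary crack map and a ground-truth map as sketched below. Strict pixel-by-pixel matching, the function name, and the tiny 3x4 masks are assumptions made for illustration; the abstract does not say whether the authors allow a spatial tolerance around annotated cracks or how scores are aggregated across images.

```python
def crack_detection_scores(predicted, truth):
    """Precision, recall, and F-measure for binary masks given as lists of rows."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # crack pixel correctly detected
            elif p and not t:
                fp += 1      # non-crack pixel flagged as crack
            elif t and not p:
                fn += 1      # crack pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Hypothetical masks: 1 marks a crack pixel, 0 marks background.
truth_mask = [[0, 1, 1, 0],
              [0, 1, 0, 0],
              [0, 1, 0, 0]]
predicted_mask = [[0, 1, 0, 0],
                  [0, 1, 1, 0],
                  [0, 1, 0, 0]]

scores = crack_detection_scores(predicted_mask, truth_mask)
print(scores)  # (0.75, 0.75, 0.75): 3 hits, 1 false alarm, 1 miss
```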