• Title/Summary/Keyword: dataset

Normal data based rotating machine anomaly detection using CNN with self-labeling

  • Bae, Jaewoong;Jung, Wonho;Park, Yong-Hwa
    • Smart Structures and Systems
    • /
    • v.29 no.6
    • /
    • pp.757-766
    • /
    • 2022
  • Training deep learning algorithms requires a sufficient amount of data. However, in most engineering systems, fault data are difficult or sometimes impossible to acquire, while normal data are readily available. This dearth of data is one of the major challenges in developing deep learning models, and fault diagnosis in particular cannot be performed in the absence of fault data. In this context, this paper proposes an anomaly detection methodology for rotating machines that uses only normal data with self-labeling. Since only normal data are used for anomaly detection, a self-labeling method is used to generate a new labeled dataset. The overall procedure includes the following three steps: (1) transformation of normal data into self-labeled data based on a pretext task, (2) training of a convolutional neural network (CNN), and (3) anomaly detection using an anomaly score defined on the softmax output of the trained CNN. The softmax values of abnormal samples behave differently from those of normal samples. To verify the proposed method, four case studies were conducted on the Case Western Reserve University (CWRU) bearing dataset, the IEEE PHM 2012 data challenge dataset, the PHMAP 2021 data challenge dataset, and a laboratory bearing testbed, and the results were compared with those of existing machine learning and deep learning methods. The results showed that the proposed algorithm could detect faults in the bearing testbed and compressor with over 99.7% accuracy. In particular, it was possible to detect not only bearing faults but also structural faults such as unbalance and belt looseness with very high accuracy. Compared with existing GAN-based and autoencoder-based anomaly detection algorithms, the proposed method showed high anomaly detection performance.
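
Step (3) of the procedure above can be illustrated with a minimal sketch. The abstract does not give the exact score definition, so the formula below (one minus the softmax probability of the self-assigned pretext label) and the names `anomaly_score`, `is_anomalous`, and `threshold` are assumptions for illustration only, not the authors' method.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def anomaly_score(logits, pretext_label):
    """Assumed score: 1 minus the softmax probability of the self-assigned label."""
    return 1.0 - softmax(logits)[pretext_label]

def is_anomalous(logits, pretext_label, threshold=0.5):
    """Flag a sample when its score exceeds a (hypothetical) threshold."""
    return anomaly_score(logits, pretext_label) > threshold

# Made-up CNN outputs for a sample whose pretext label is class 0.
normal_logits = [6.0, 0.5, 0.2, 0.1]   # confident in the self-assigned label
faulty_logits = [1.1, 1.0, 0.9, 1.2]   # nearly uniform, so the label is not recovered

print(is_anomalous(normal_logits, 0), is_anomalous(faulty_logits, 0))
```

The idea is that a network trained only on normal data should recover the pretext label confidently for normal inputs, so a flat softmax output is treated as a sign of abnormal behavior.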

Construction of a Spatio-Temporal Dataset for Deep Learning-Based Precipitation Nowcasting

  • Kim, Wonsu;Jang, Dongmin;Park, Sung Won;Yang, MyungSeok
    • Journal of Information Science Theory and Practice
    • /
    • v.10 no.spc
    • /
    • pp.135-142
    • /
    • 2022
  • Recently, with the development of data processing technology and the increase in computational power, methods for solving social problems using Artificial Intelligence (AI) have come into the spotlight, and AI technologies are replacing and supplementing traditional methods in various fields. Meanwhile, in Korea, heavy rain is one of the representative natural hazards, causing enormous economic damage and casualties every year. Accurate prediction of heavy rainfall over the Korean peninsula is very difficult because of its geographical features, located between the Eurasian continent and the Pacific Ocean at mid-latitude, and the influence of the summer monsoon. To deal with such problems, the Korea Meteorological Administration operates various state-of-the-art observation equipment and a newly developed global atmospheric model system. Nevertheless, precipitation nowcasting requires a separate system based on the extrapolation method because of the intrinsic characteristics associated with the operation of numerical weather prediction models. The predictability of existing precipitation nowcasting is reliable in the early stage of forecasting but decreases sharply as forecast lead time increases. AI technologies that handle the spatio-temporal features of data are therefore expected to contribute greatly to overcoming the limitations of existing precipitation nowcasting systems. Thus, in this project, the dataset required to develop, train, and verify deep learning-based precipitation nowcasting models has been constructed in a regularized form. The dataset provides various variables obtained from multiple sources, and these variables share the same spatio-temporal specifications.

Small-Scale Object Detection Label Reassignment Strategy

  • An, Jung-In;Kim, Yoon;Choi, Hyun-Soo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.12
    • /
    • pp.77-84
    • /
    • 2022
  • In this paper, we propose a Label Reassignment Strategy to improve the performance of an object detection algorithm. Our approach involves two stages: an inference stage and an assignment stage. In the inference stage, we perform multi-scale inference with predefined scale sizes on a trained model and re-infer masked images to obtain robust classification results. In the assignment stage, we calculate the IoU between bounding boxes to remove duplicates. We also check box and class occurrence between the detection result and annotation label to re-assign the dominant class type. We trained the YOLOX-L model with the re-annotated dataset to validate our strategy. The model achieved a 3.9% improvement in mAP and 3x better performance on AP_S compared to the model trained with the original dataset. Our results demonstrate that the proposed Label Reassignment Strategy can effectively improve the performance of an object detection model.
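
The IoU-based duplicate removal in the assignment stage can be sketched as follows. This is a generic greedy non-maximum suppression, assumed here for illustration; the abstract does not state the exact rule, and the box format `(x1, y1, x2, y2)`, the `score` field, and the 0.5 threshold are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def drop_duplicates(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box among heavily overlapping ones."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept

detections = [
    {"box": (0, 0, 10, 10), "score": 0.9},
    {"box": (1, 1, 10, 10), "score": 0.8},    # overlaps the first box heavily
    {"box": (20, 20, 30, 30), "score": 0.7},  # separate object
]
print(len(drop_duplicates(detections)))
```

Multi-scale inference tends to produce several boxes for the same object, which is why some overlap-based filtering is needed before class labels are compared against the annotations.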

Boosting the Performance of the Predictive Model on the Imbalanced Dataset Using SVM Based Bagging and Out-of-Distribution Detection (SVM 기반 Bagging과 OoD 탐색을 활용한 제조공정의 불균형 Dataset에 대한 예측모델의 성능향상)

  • Kim, Jong Hoon;Oh, Hayoung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.11
    • /
    • pp.455-464
    • /
    • 2022
  • Datasets from a manufacturing process have two distinctive characteristics: severe class imbalance and a large number of Out-of-Distribution samples. Strategies such as oversampling the minority class and down-sampling the majority class are well known for handling class imbalance. In addition, SMOTE has recently been chosen to address the issue. However, Out-of-Distribution samples have been studied only with neural networks. Out-of-Distribution detection has rarely been applied to predictive models that use conventional machine learning algorithms such as SVM, Random Forest, and KNN. It is known that conventional machine learning algorithms are much better than neural networks in prediction performance, because neural networks are vulnerable to over-fitting and require much larger datasets than conventional machine learning algorithms do. We therefore suggest a new approach that applies Out-of-Distribution detection based on the SVM algorithm. In addition, the bagging technique is adopted to improve the precision of the model.
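
The resampling and bagging ideas mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: each bag draws the same number of rows per class with replacement, which down-samples the majority class and oversamples the minority class, and the per-bag predictions are combined by majority vote. The SVM base learner and the Out-of-Distribution detector are omitted, and `balanced_bootstrap` and `majority_vote` are hypothetical helper names, not the authors' code.

```python
import random
from collections import defaultdict

def balanced_bootstrap(rows, labels, per_class, seed=0):
    """Draw `per_class` rows with replacement from each class to form one bag."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    bag_rows, bag_labels = [], []
    for label in sorted(by_class):
        for _ in range(per_class):
            bag_rows.append(rng.choice(by_class[label]))
            bag_labels.append(label)
    return bag_rows, bag_labels

def majority_vote(predictions):
    """Combine the per-bag predictions for one sample by simple majority."""
    return max(set(predictions), key=predictions.count)

# Made-up imbalanced data: 95 rows of class 0 and 5 rows of class 1.
rows = list(range(100))
labels = [0] * 95 + [1] * 5
bag_rows, bag_labels = balanced_bootstrap(rows, labels, per_class=10)
print(bag_labels.count(0), bag_labels.count(1))
```

In a full pipeline, one classifier would be trained per bag (each with a different `seed`) and `majority_vote` would merge their outputs for every test sample.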

Unsupervised Abstractive Summarization Method that Suitable for Documents with Flows (흐름이 있는 문서에 적합한 비지도학습 추상 요약 방법)

  • Lee, Hoon-suk;An, Soon-hong;Kim, Seung-hoon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.11
    • /
    • pp.501-512
    • /
    • 2021
  • Recently, Transformer techniques based on the encoder-decoder architecture have brought a breakthrough in the NLP area. However, they can be used only in mainstream languages such as English and Chinese, for which datasets with millions of samples are available, and cannot be used in non-mainstream languages for which such datasets have not been established. In addition, mechanical summarization suffers from a deflection problem, in which the summary focuses on the beginning of the document. These methods are therefore not suitable for documents with flows, such as fairy tales and novels. In this paper, we propose a hybrid summarization method that does not require a dataset and mitigates the deflection problem by using a GAN with two adaptive discriminators. We evaluate our model on the CNN/Daily Mail dataset to verify its validity objectively. We also show that the model performs validly in Korean, one of the non-mainstream languages.

Microcode based Controller for Compact CNN Accelerators Aimed at Mobile Devices (모바일 디바이스를 위한 소형 CNN 가속기의 마이크로코드 기반 컨트롤러)

  • Na, Yong-Seok;Son, Hyun-Wook;Kim, Hyung-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.3
    • /
    • pp.355-366
    • /
    • 2022
  • This paper proposes a microcode-based controller for artificial intelligence accelerators that is reconfigurable through a programmable architecture and offers the advantages of low power and ultra-small chip size. So that the target accelerator can support various neural network models, a neural network model is converted into microcode by a microcode compiler and loaded onto the accelerator, where it controls accelerator operations such as the datapath and memory access. While the proposed controller and accelerator can run various CNN models, in this paper we tested them using the YOLOv2-Tiny CNN model. With a system clock of 200 MHz, the controller and accelerator achieved an inference time of 137.9 ms/image for object detection on the VOC 2012 dataset and 99.5 ms/image for mask-wearing detection on a mask detection dataset. When an accelerator equipped with the proposed controller is implemented as a silicon chip, the gate count is 618,388, which corresponds to a 65.5% reduction in chip area compared with an accelerator employing a CPU-based controller (RISC-V).
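
The reported latencies can be converted into throughput and clock cycles with simple arithmetic. The sketch below uses only the figures quoted above (200 MHz, 137.9 ms/image, 99.5 ms/image); the function names are illustrative, and the cycle count assumes the whole latency is spent at the stated system clock.

```python
CLOCK_HZ = 200_000_000  # 200 MHz system clock, as reported

def cycles_per_image(latency_ms, clock_hz=CLOCK_HZ):
    """Clock cycles elapsed during one inference at the given latency."""
    return clock_hz * latency_ms / 1000.0

def images_per_second(latency_ms):
    """Throughput implied by a per-image latency in milliseconds."""
    return 1000.0 / latency_ms

for name, latency_ms in [("VOC 2012", 137.9), ("mask detection", 99.5)]:
    print(name, round(cycles_per_image(latency_ms)), round(images_per_second(latency_ms), 2))
```

Under these assumptions, the VOC 2012 case corresponds to roughly 27.6 million cycles and about 7.3 images per second, and the mask detection case to roughly 19.9 million cycles and about 10 images per second.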

Fault Detection in Diecasting Process Based on Deep-Learning (다단계 딥러닝 기반 다이캐스팅 공정 불량 검출)

  • Jeongsu Lee;Youngsim, Choi
    • Journal of Korea Foundry Society
    • /
    • v.42 no.6
    • /
    • pp.369-376
    • /
    • 2022
  • Die casting is an important process for various industries, but its high defect rate limits the profitability and productivity of related companies. To overcome this, this study developed a die-casting fault detection module based on industrial AI technologies. The module consists of three-stage models chosen according to the characteristics of the dataset. The first-stage model performs fault detection based on unsupervised learning, for datasets without labels. The second-stage model performs one-class classification based on semi-supervised learning, for datasets that have only production success labels. The third-stage model performs fault detection based on supervised learning, for datasets that include a small number of production failure cases. The developed fault detection module exhibited outstanding performance, with roughly 96% accuracy on actual process data.

A Study on Insider Threat Dataset Sharing Using Blockchain (블록체인을 활용한 내부자 유출위협 데이터 공유 연구)

  • Wonseok Yoon;Hangbae Chang
    • Journal of Platform Technology
    • /
    • v.11 no.2
    • /
    • pp.15-25
    • /
    • 2023
  • This study analyzes the limitations of the insider threat datasets used in insider threat detection research and, to overcome them, compares public insider threat data with insider threat data collected by a security solution. Based on this comparison, we design a data format suitable for insider threat detection and implement a system that uses blockchain technology to share insider threat information safely between different institutions and companies. Currently, none of the insider threat datasets available to researchers were collected from actual events. Public datasets are synthetic data randomly created for research, and models trained on them have many limitations in real environments. To address these limitations, this study designs a private blockchain to secure information sharing between institutions of different affiliations and derives a method that increases reliability and maintains information integrity and consistency through agreement and verification among participants. The proposed method is expected to collect data through an outflow threat collector and to gather quality datasets of events that posed an actual threat, rather than synthetic data, through a blockchain-based sharing system, thereby addressing the current problems of outflow threat datasets and contributing to future insider threat detection models.

Comparison of image quality according to activation function during Super Resolution using ESCPN (ESCPN을 이용한 초해상화 시 활성화 함수에 따른 이미지 품질의 비교)

  • Song, Moon-Hyuk;Song, Ju-Myung;Hong, Yeon-Jo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.129-132
    • /
    • 2022
  • Super-resolution is the process of converting a low-quality image into a high-quality image. This study was conducted using ESPCN. In a super-resolution deep neural network, the quality of the output image can differ for the same input data depending on the activation function, which determines the output passed on by each node. The purpose of this study is therefore to find the most suitable activation function for super-resolution by applying the ReLU, ELU, and Swish activation functions and comparing the quality of the output images for the same input images. The CelebA dataset was used. During pre-processing, images were cropped to squares and their quality was then lowered. The degraded image was used as the input and the original image was used for evaluation. As a result, ELU and Swish took longer to train than ReLU, which is the activation function mainly used in machine learning, but showed better performance.
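
For reference, the three activation functions compared above can be written out in a few lines. This sketch uses their standard scalar definitions; the defaults `alpha=1.0` and `beta=1.0` are common conventions assumed here, not values taken from the paper.

```python
import math

def relu(x):
    """ReLU: passes positive inputs through and clips negatives to zero."""
    return max(0.0, x)

def elu(x, alpha=1.0):
    """ELU: identity for positive inputs, smooth saturation toward -alpha otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def swish(x, beta=1.0):
    """Swish: the input scaled by a sigmoid gate, x * sigmoid(beta * x)."""
    return x / (1.0 + math.exp(-beta * x))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, relu(x), round(elu(x), 4), round(swish(x), 4))
```

Unlike ReLU, both ELU and Swish produce nonzero outputs for negative inputs and require an exponential per evaluation, which is consistent with the longer training time reported in the abstract.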

Avocado Classification and Shipping Prediction System based on Transfer Learning Model for Rational Pricing (합리적 가격결정을 위한 전이학습모델기반 아보카도 분류 및 출하 예측 시스템)

  • Seong-Un Yu;Seung-Min Park
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.2
    • /
    • pp.329-335
    • /
    • 2023
  • Avocado, a superfood selected by Time magazine and a late-ripening fruit, shows a large difference between local prices and domestic distribution prices. Automating the sorting of avocados would make it possible to lower prices by reducing labor costs in various fields. In this paper, we aim to create an optimal classification model by building an avocado dataset through web crawling and applying several deep learning-based transfer learning models. Experiments were conducted by applying the transfer learning models to a dataset split from the constructed dataset and fine-tuning the hyperparameters of each model. Given an avocado image, the model classifies the ripeness of the avocado with an accuracy of over 99%. We thus propose a dataset and algorithm that can reduce manpower and increase accuracy in avocado production and distribution.