• Title/Summary/Keyword: Preprocessing method


Transformer Network for Container's BIC-code Recognition (컨테이너 BIC-code 인식을 위한 Transformer Network)

  • Kwon, HeeJoo;Kang, HyunSoo
    • Journal of Korea Society of Industrial Information Systems / v.27 no.1 / pp.19-26 / 2022
  • This paper presents a pre-processing method to facilitate recognition of a container's BIC-code. We propose a network that finds the ROI (Region of Interest) containing the BIC-code and estimates a homography matrix for warping. Following the structure of STN (Spatial Transformer Networks), the proposed network consists of three steps: ROI detection, homography matrix estimation, and warping with the homography estimated in the previous step. It improves the accuracy of BIC-code recognition by estimating the ROI and the matrix with the proposed network and correcting the perspective distortion of the ROI with the estimated matrix. For performance evaluation, five evaluators rated the output images on a 5-point scale and gave an average score of 4.25; on visual inspection, 224 of 312 photos were corrected accurately and contained the ROI.
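
The warping step in the abstract above relies on mapping image coordinates through a 3x3 homography. The following is a minimal sketch of that mapping only, not the authors' network; the function name `apply_homography` and the matrix `H_example` are hypothetical and chosen purely for illustration.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H given as nested lists."""
    x, y = point
    # Homogeneous coordinates: [xh, yh, w] = H @ [x, y, 1]
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Dividing by w is what produces the perspective effect
    return (xh / w, yh / w)


# Hypothetical matrix: identity plus a small perspective term in the last row
H_example = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.001, 0.0, 1.0],
]

warped = apply_homography(H_example, (100.0, 50.0))
```

Warping a whole ROI would apply the same mapping (usually its inverse, with interpolation) to every pixel coordinate; an image library would normally handle that part.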

A Tuberculosis Detection Method Using Attention and Sparse R-CNN

  • Xu, Xuebin;Zhang, Jiada;Cheng, Xiaorui;Lu, Longbin;Zhao, Yuqing;Xu, Zongyu;Gu, Zhuangzhuang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2131-2153 / 2022
  • To detect tuberculosis (TB) areas in chest radiographs accurately, we design a chest X-ray TB area detection algorithm. The algorithm consists of two stages: the chest X-ray TB classification network (CXTCNet) and the chest X-ray TB area detection network (CXTDNet). CXTCNet judges the presence or absence of TB areas in chest X-ray images, thereby excluding the influence of other lung diseases on the detection of TB areas. This reduces false positives in the detection network and improves the accuracy of the detection results. In CXTCNet, we propose a channel attention mechanism (CAM) module and combine it with DenseNet. This module enables the network to learn more spatial and channel feature information from chest X-ray images, thereby improving network performance. CXTDNet is based on a sparse object detection algorithm (Sparse R-CNN). A fixed set of learnable proposal boxes and learnable proposal features is used for classification and localization. The predictions are output directly, without non-maximum suppression post-processing. Furthermore, we use CLAHE in data preprocessing to reduce image noise and improve image quality. Experiments on the TBX11K dataset show that the accuracy of the proposed CXTCNet reaches 99.10%, which is better than most current TB classification algorithms. Finally, the proposed chest X-ray TB detection algorithm achieves an AP of 45.35% and an AP50 of 74.20%. We also establish a chest X-ray TB dataset of 304 images, and experiments on this dataset show that the diagnostic accuracy is comparable to that of radiologists. We hope that the proposed algorithm and dataset will advance the field of TB detection.
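
The AP50 figure quoted above counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below shows only that IoU computation, assuming boxes in `(x1, y1, x2, y2)` form; the function name `iou` is illustrative and is not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """True when a prediction would count as a hit under the AP50 criterion."""
    return iou(pred_box, gt_box) >= 0.5
```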

Construction of a Spatio-Temporal Dataset for Deep Learning-Based Precipitation Nowcasting

  • Kim, Wonsu;Jang, Dongmin;Park, Sung Won;Yang, MyungSeok
    • Journal of Information Science Theory and Practice / v.10 no.spc / pp.135-142 / 2022
  • Recently, with the development of data processing technology and the increase in computational power, methods for solving social problems using Artificial Intelligence (AI) have come into the spotlight, and AI technologies are replacing and supplementing traditional methods in various fields. Meanwhile, in Korea, heavy rain is one of the main causes of natural disasters, bringing enormous economic damage and casualties every year. Accurate prediction of heavy rainfall over the Korean peninsula is very difficult because of its geographical features, located at mid-latitude between the Eurasian continent and the Pacific Ocean, and the influence of the summer monsoon. To deal with such problems, the Korea Meteorological Administration operates various state-of-the-art observation equipment and a newly developed global atmospheric model system. Nevertheless, precipitation nowcasting requires a separate system based on the extrapolation method because of intrinsic characteristics of operating numerical weather prediction models. Existing precipitation nowcasting is reliable in the early stage of a forecast, but its predictability decreases sharply as forecast lead time increases. AI technologies that handle the spatio-temporal features of data are therefore expected to contribute greatly to overcoming the limitations of existing precipitation nowcasting systems. Thus, in this project, the dataset required to develop, train, and verify deep learning-based precipitation nowcasting models has been constructed in a regularized form. The dataset not only provides variables obtained from multiple sources but also aligns them to common spatio-temporal specifications.

Vibration Data Denoising and Performance Comparison Using Denoising Auto Encoder Method (Denoising Auto Encoder 기법을 활용한 진동 데이터 전처리 및 성능비교)

  • Jang, Jun-gyo;Noh, Chun-myoung;Kim, Sung-soo;Lee, Soon-sup;Lee, Jae-chul
    • Journal of the Korean Society of Marine Environment & Safety / v.27 no.7 / pp.1088-1097 / 2021
  • Vibration data from mechanical equipment inevitably contain noise. This noise adversely affects the maintenance of mechanical equipment. Accordingly, the performance of a learning model depends on how effectively the noise is removed from the data. In this study, the noise was removed using the Denoising Auto Encoder (DAE) technique, which does not include a feature extraction process when preprocessing time series data. In addition, its performance was compared with that of the Wavelet Transform, which is widely used for machine signal processing. The performance comparison was conducted by calculating the failure detection rate. For a more accurate comparison, a classification performance evaluation criterion, the F1 score, was calculated. Failure data were detected using the One-Class SVM technique. The comparison revealed that the DAE technique performed better than the Wavelet Transform technique in terms of failure diagnosis and error rate.
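
The F1 score used as the evaluation criterion above is the harmonic mean of precision and recall. A minimal sketch of the computation follows; the label convention (1 for failure, 0 for normal) and the function name `f1_score` are assumptions made for this example, not details from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score for one positive class, from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Assumed convention: 1 = failure, 0 = normal operation
score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```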

Switching Filter based on Noise Estimation in Random Value Impulse Noise Environments (랜덤 임펄스 잡음 환경에서 잡음추정에 기반한 스위칭 필터)

  • Bong-Won, Cheon;Nam-Ho, Kim
    • Journal of the Korea Institute of Information and Communication Engineering / v.27 no.1 / pp.54-61 / 2023
  • With the development of IoT technologies and artificial intelligence, diverse digital imaging equipment is being used at industrial sites. Because image data are easily damaged by noise while being acquired with a camera or sensor, and the damaged image adversely affects subsequent image processing, noise removal is required as preprocessing. In this paper, a switching filter algorithm based on noise estimation is proposed for restoring images damaged by random-valued impulse noise. The proposed algorithm performs noise estimation and error detection according to the similarity of the pixel values in the local mask of the image, and selects and switches the filter depending on the ratio of noise in the local mask. Simulations were conducted to analyze the noise removal performance of the proposed algorithm, and comparisons of magnified images and PSNR showed superior performance compared with existing methods.
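
To illustrate the general idea of a switching filter and the PSNR metric mentioned above, here is a minimal sketch. It is a plain switching median filter, not the noise-estimation algorithm proposed in the paper; the 3x3 window, the `threshold` value, and both function names are assumptions made for this example.

```python
import math
from statistics import median


def switching_median_filter(img, threshold=40):
    """Replace a pixel with its 3x3 median only when it deviates strongly.

    img is a list of rows of grayscale values. Pixels close to the local
    median are treated as noise-free and left untouched (the "switch").
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # 3x3 neighbourhood, clipped at the image border
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            med = median(window)
            if abs(img[y][x] - med) > threshold:
                out[y][x] = med
    return out


def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    n = len(ref) * len(ref[0])
    mse = sum((a - b) ** 2 for ra, rb in zip(ref, test) for a, b in zip(ra, rb)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

A higher PSNR against the clean reference indicates a better restoration, which is how the abstract's comparison with existing methods would typically be read.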

Bioimage Analyses Using Artificial Intelligence and Future Ecological Research and Education Prospects: A Case Study of the Cichlid Fishes from Lake Malawi Using Deep Learning

  • Joo, Deokjin;You, Jungmin;Won, Yong-Jin
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.3 no.2 / pp.67-72 / 2022
  • Ecological research relies on the interpretation of large amounts of visual data obtained from extensive wildlife surveys, but such large-scale image interpretation is costly and time-consuming. Using artificial intelligence (AI) machine learning models, especially convolutional neural networks (CNNs), it is possible to streamline these manual image tasks, protect wildlife, and record and predict behavior. Ecological research using deep-learning-based object recognition serves various purposes, such as detecting and identifying species of wild animals and locating poachers in real time. These advances in the application of AI can enable efficient management of endangered wildlife, animal detection in various environments, and real-time analysis of image information collected by unmanned aerial vehicles. Furthermore, there is a growing need for school education and social use of AI on biodiversity and environmental issues. School education and citizen science related to ecological activities using AI can enhance environmental awareness and strengthen knowledge and problem-solving skills in science and research processes. With these prospects in mind, in this paper we compare the results of our early 2013 study, which automatically identified African cichlid fish species from photographic data, with the results of a reanalysis using a CNN deep learning method. Using the PyTorch and PyTorch Lightning frameworks, we achieve an accuracy of 82.54% and an F1-score of 0.77 with minimal programming and data preprocessing effort. This is a significant improvement over our previous machine learning methods, which required heavy feature engineering and had 78% accuracy.

KorPatELECTRA : A Pre-trained Language Model for Korean Patent Literature to improve performance in the field of natural language processing(Korean Patent ELECTRA)

  • Jang, Ji-Mo;Min, Jae-Ok;Noh, Han-Sung
    • Journal of the Korea Society of Computer and Information / v.27 no.2 / pp.15-23 / 2022
  • In the field of patents, NLP (Natural Language Processing) is a challenging task because of the linguistic specificity of patent literature, so there is an urgent need for research on a language model optimized for Korean patent literature. Recently, in the field of NLP, there have been continuous attempts to build pre-trained language models for specific domains to improve performance in various tasks of related fields. Among them, ELECTRA is a pre-trained language model from Google that, following BERT, uses a new method called RTD (Replaced Token Detection) to increase training efficiency. This paper proposes KorPatELECTRA, pre-trained on a large amount of Korean patent literature data. In addition, pre-training was optimized by preprocessing the training corpus according to the characteristics of patent literature and applying a patent-specific vocabulary and tokenizer. To confirm its performance, KorPatELECTRA was tested on NER (Named Entity Recognition), MRC (Machine Reading Comprehension), and patent classification tasks using actual patent data, and it achieved the best performance in all three tasks compared with general-purpose language models.

Utility Analysis of Federated Learning Techniques through Comparison of Financial Data Performance (금융데이터의 성능 비교를 통한 연합학습 기법의 효용성 분석)

  • Jang, Jinhyeok;An, Yoonsoo;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.405-416 / 2022
  • Current AI technology is improving the quality of life through data-driven machine learning. When machine learning is used, distributed data that are transmitted and collected in one place go through a de-identification process because of the risk of privacy infringement. De-identification damages and omits information, which degrades the performance of machine learning and complicates preprocessing. Accordingly, Google announced federated learning in 2016, a method of learning without the processes of de-identifying data and collecting it on a single server. Using actual financial data, this paper analyzes the effectiveness of federated learning by comparing it with the learning performance of data de-identified with k-anonymity and of synthetic data generated with differential privacy. As a result of the experiment, the accuracy of original data learning was 79% for k=2, 76% for k=5, 52% for k=7, 50% for ε=1, and 82% for ε=0.1, and 86% for federated learning.
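
The k values in the abstract above refer to k-anonymity: every combination of quasi-identifier values must be shared by at least k records. A minimal sketch of measuring k for a table follows; the column names and sample records are invented for illustration and do not come from the paper's financial data.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the largest k for which the records are k-anonymous.

    This is the size of the smallest group of records sharing the same
    quasi-identifier values (0 for an empty table).
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical, already generalized records
sample = [
    {"age_band": "30-39", "region": "Seoul", "balance": 120},
    {"age_band": "30-39", "region": "Seoul", "balance": 80},
    {"age_band": "30-39", "region": "Seoul", "balance": 300},
    {"age_band": "40-49", "region": "Busan", "balance": 50},
    {"age_band": "40-49", "region": "Busan", "balance": 75},
]

k = k_anonymity(sample, ["age_band", "region"])
```

Raising k generally requires coarser generalization of the quasi-identifiers, which is consistent with the information loss the abstract describes.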

An effective automated ontology construction based on the agriculture domain

  • Deepa, Rajendran;Vigneshwari, Srinivasan
    • ETRI Journal / v.44 no.4 / pp.573-587 / 2022
  • The agricultural sector differs from other sectors in that it relies entirely on various natural and climatic factors. Climate change affects land and agriculture in many ways, including reduced annual rainfall, pests, heat waves, changes in sea level, and fluctuations in global ozone and atmospheric CO2. Climate change also affects the environment. Based on these factors, farmers choose their crops to increase productivity in their fields. Many existing agricultural ontologies are either domain-specific or have been created with minimal vocabulary, and no proper evaluation framework has been implemented. A new agricultural ontology focused on subdomains is designed to assist farmers using the Jaccard relative extractor (JRE) and the Naïve Bayes algorithm. The JRE is used to find the similarity between two sentences and words in agricultural documents, and the relationship between two terms is identified via the Naïve Bayes algorithm. In the proposed method, data are preprocessed with natural language processing techniques, and the dimension-reduced tags are subjected to rule-based formal concept analysis and mapping. The subdomain ontologies of weather, pest, and soil are built separately, and the overall agricultural ontology is built around them. The gold standard for the lexical layer is used to evaluate the proposed technique, and its performance is analyzed by comparing it with different state-of-the-art systems. Precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic curve area, and precision-recall curve area are the performance metrics used. The proposed methodology gives a precision score of 94.40%, compared with 83.94% for the decision tree and 86.89% for the K-nearest neighbor algorithm, for agricultural ontology construction.
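
The sentence-similarity step described above builds on the Jaccard index, the size of the intersection of two sets divided by the size of their union. The sketch below shows the plain Jaccard index on whitespace-separated words; it is not the paper's JRE, and the tokenization and the function name `jaccard_similarity` are assumptions made for this example.

```python
def jaccard_similarity(sentence_a, sentence_b):
    """Jaccard index between the lowercase word sets of two sentences."""
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty sentences are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


similarity = jaccard_similarity("rice needs water", "wheat needs water")
```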

Development of Water Velocity Data Preprocessing Method for PAVOs (PAVOs 활용을 위한 유속데이터 전처리 기법 개발)

  • Soyeon Lim;Youngmoo Yu;Sinjae Lee;Yeongil Lee
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.85-85 / 2023
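  • Manual methods such as the wading method and the transverse-line method are used to measure discharge, but constraints such as night and holiday measurements and staff shortages limit their use for measuring high-stage floods. To address this, the Doppler-type ultrasonic velocity meter (Acoustic Doppler Velocity Meter, ADVM) and the Portable Automatic Velocity Observation System (PAVOs), which are free of spatio-temporal constraints, have been proposed. These methods measure velocity in real time with devices installed on bridges, so they are not restricted in time or place and can be useful for flood management. Velocity data measured in real time contain erroneous and missing values. For ADVM, preprocessing methods such as using the stage-discharge relation are available, but preprocessing methods for PAVOs data, which come from a microwave surface velocity meter, have been little studied. In this study, we therefore developed a pre-processing procedure for velocity data measured in real time by PAVOs. PAVOs records 10 velocity values at once every 5 minutes, and the data are non-stationary. As preprocessing, erroneous and missing values were handled and interpolation was applied, after which the actual velocity was determined from the 10 values and denoising was performed. The procedure was applied to velocity data measured at Hongcheon Bridge on the Hongcheon River in Gangwon-do. As a result, a consistent trend could be identified in the rising and falling limbs of the data. The discharge estimated from these data showed a low deviation rate when compared with the discharge calculated from the relation with the measured mean velocity. With preprocessed real-time velocity data, flood discharge could be estimated when the peak stage occurs. In addition, the stage-discharge rating curve, which changes with rainfall or river works, could be developed in real time, which would contribute greatly to effective flood management.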
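
The preprocessing chain described above (handling erroneous and missing values, interpolation, choosing one velocity from the 10 values, denoising) can be sketched roughly as follows. The abstract does not state how each step is done, so everything here is an assumption: the valid range, the median as the representative value, linear interpolation, and a moving average as the denoiser are stand-ins, and all function names are hypothetical.

```python
from statistics import median


def representative(samples, v_min=0.0, v_max=10.0):
    """Pick one velocity from a burst, ignoring missing or out-of-range values.

    The median is an assumed stand-in for the paper's selection rule.
    """
    valid = [v for v in samples if v is not None and v_min <= v <= v_max]
    return median(valid) if valid else None


def interpolate_gaps(series):
    """Fill None entries by linear interpolation between known neighbours."""
    out = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        return out
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]   # leading gap: copy first known value
        elif right is None:
            out[i] = series[left]    # trailing gap: copy last known value
        else:
            t = (i - left) / (right - left)
            out[i] = series[left] + t * (series[right] - series[left])
    return out


def moving_average(series, window=3):
    """Centered moving average; the window shrinks at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def preprocess(bursts):
    """Run the three assumed steps on a list of per-timestep bursts."""
    return moving_average(interpolate_gaps([representative(b) for b in bursts]))
```

Each element of `bursts` would hold the values recorded at one 5-minute timestep; the order of steps here is a simplification of the sequence given in the abstract.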
