• Title/Summary/Keyword: data pre-processing

A Study of Automatic Deep Learning Data Generation by Considering Private Information Protection (개인정보 보호를 고려한 딥러닝 데이터 자동 생성 방안 연구)

  • Sung-Bong Jang
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.435-441 / 2024
  • Before large collected data sets can be used as deep learning training data, sensitive personal information such as resident registration numbers and disease information must be altered or encrypted so that it is not exposed to hackers, and the data must be reconstructed to match the structure of the deep learning model. These tasks are currently performed manually by experts, which costs considerable time and money. To solve these problems, this paper proposes a technique that automatically performs the data processing tasks needed to protect personal information during the deep learning process. In the proposed technique, privacy protection is performed through data generalization, and data reconstruction is performed using circular queues. To verify its validity, the proposed technique was implemented in C. The verification confirmed that data generalization was performed correctly and that the data were properly reconstructed to suit the deep learning model.
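
The two steps named in this abstract, generalization and circular-queue reconstruction, can be illustrated with a minimal sketch. The paper's implementation is in C and its exact rules are not given in the abstract, so everything here is an assumption made for illustration: the record fields, the 10-year age bins, the disease-to-category mapping, and the use of a bounded `deque` as the circular queue that emits fixed-length windows.

```python
from collections import deque

# Hypothetical mapping from a specific diagnosis to a broader category.
DISEASE_CATEGORY = {"influenza": "respiratory", "asthma": "respiratory",
                    "gastritis": "digestive"}

def generalize(record, age_width=10):
    """Return a copy of the record with identifying detail coarsened.

    The direct identifier (rrn) is dropped, the exact age becomes a range,
    and the diagnosis is replaced by its category.
    """
    low = (record["age"] // age_width) * age_width
    return {
        "age": f"{low}-{low + age_width - 1}",
        "disease": DISEASE_CATEGORY.get(record["disease"], "other"),
        "value": record["value"],
    }

def windows(records, size):
    """Yield fixed-length windows of generalized records.

    A deque with maxlen behaves like a circular queue: once full,
    appending a new record discards the oldest one.
    """
    queue = deque(maxlen=size)
    for record in records:
        queue.append(generalize(record))
        if len(queue) == size:
            yield list(queue)

if __name__ == "__main__":
    data = [
        {"rrn": "id-1", "age": 34, "disease": "influenza", "value": 1.0},
        {"rrn": "id-2", "age": 58, "disease": "gastritis", "value": 2.0},
        {"rrn": "id-3", "age": 41, "disease": "eczema", "value": 3.0},
    ]
    for window in windows(data, size=2):
        print(window)
```

The window length stands in for whatever input shape the target model expects; a different reconstruction rule would only change `windows`.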

Convolutional neural network-based data anomaly detection considering class imbalance with limited data

  • Du, Yao;Li, Ling-fang;Hou, Rong-rong;Wang, Xiao-you;Tian, Wei;Xia, Yong
    • Smart Structures and Systems / v.29 no.1 / pp.63-75 / 2022
  • The raw data collected by structural health monitoring (SHM) systems may suffer from multiple patterns of anomalies, which pose a significant barrier to automatic and accurate structural condition assessment. The detection and classification of these anomalies is therefore an essential pre-processing step for SHM systems. However, heterogeneous data patterns, scarce anomalous samples and severe class imbalance make data anomaly detection difficult. To address this, this study proposes a convolutional neural network-based data anomaly detection method. The time- and frequency-domain data are converted to images and used as the input of the neural network for training. ResNet18 is adopted as the feature extractor to avoid training with massive labelled data. In addition, the focal loss function is adopted to mitigate the classification bias induced by class imbalance. The effectiveness of the proposed method is validated using acceleration data collected from a long-span cable-stayed bridge. The proposed approach detects and classifies data anomalies with high accuracy.
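
To show why focal loss helps with class imbalance, here is a minimal sketch of the standard focal loss for a single sample, FL = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The values gamma = 2.0 and alpha = 1.0 are common defaults assumed for illustration; the abstract does not state the settings the authors used.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample.

    probs  : predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 gives plain cross-entropy
    alpha  : class weighting factor (assumed constant here)
    """
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    easy = focal_loss([0.9, 0.1], target=0)  # confidently correct sample
    hard = focal_loss([0.1, 0.9], target=0)  # badly misclassified sample
    print(f"easy={easy:.5f} hard={hard:.5f}")
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified samples, which typically come from the majority class, so training focuses on the scarce, hard examples.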

Design of Mobile Agent-based Software Module For Reducing Load of RFID Middleware (RFID 미들웨어 부하를 줄이기 위한 이동 에이전트 기반 소프트웨어 모듈의 설계)

  • Ahn, Yong-Sun;Ahn, Jin-Ho
    • Journal of Internet Computing and Services / v.10 no.3 / pp.95-101 / 2009
  • With the rapid development of RFID technology, its technical potential has led to wide use in many industrial fields. In the physical distribution industry in particular, the introduction of RFID has contributed enormously to monitoring product locations and information effectively in real time. In addition, a significant decline in tag prices and the competitiveness of RFID-related technology have made it possible to manage goods at a much finer granularity by attaching a tag to each item rather than to a pallet or container. However, if a very large volume of tag data flows continuously into RFID middleware with limited hardware resources, its overall data processing time may become considerably longer. Technologies that handle and further reduce the middleware load are therefore in great demand. In this paper, we propose a mobile agent-based software module that efficiently reduces the middleware load by pre-processing large amounts of tag data while items are in transit. Simulation results show that the proposed software module processes tag data considerably faster than a configuration without it. This increases the tag recognition rate within a given time limit and improves the reliability of RFID middleware.

A Study on the Intelligent 3D Foot Scanning System (인공지능형 삼차원 Foot Scanning 시스템에 관한 연구)

  • Kim, Young-Tak;Park, Ju-Won;Tack, Han-Ho;Lee, Sang-Bae
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.7 / pp.871-877 / 2004
  • This paper presents a method, based on artificial intelligence techniques, for restoring the foot shape acquired by a three-dimensional measurement device to its original shape in an optimized state, in order to produce the shoe-last data needed to manufacture custom-made shoes. The developed system is PC-based and uses an existing three-dimensional measurement method. It acquires shoe-last and foot-shape data through 8 CCDs (charge-coupled devices) mounted on the top, bottom, right and left sides and 4 lasers mounted on both sides and on the upper and lower sides. The acquired data are processed by an image processing algorithm that uses an artificial intelligence technique. The processed data show better noise removal than systems that do not use artificial intelligence techniques, which simplifies post-processing. The paper describes the hardware and software of the system, uses a neural network to determine the threshold value when the input image is binarized in the pre-processing step, and presents the results.

Development of Education Programs for Sports Clubs using Sports Data (운동부를 위한 스포츠 데이터 활용 교육 프로그램 개발)

  • Kim, Semin;Woo, SungHee
    • Journal of Practical Engineering Education / v.13 no.3 / pp.435-442 / 2021
  • In this study, an education program was developed to teach students and athletes on school sports teams the overall knowledge needed to use sports data. Existing research and requirements for using sports data were analyzed, a learning plan was designed, and the education program was developed step by step according to the educational requirements. Because no existing study addresses data science education for school athletics and adult sports officials, the program covers problem definition, data collection, data pre-processing, and data analysis, as well as the additional stages of data visualization and simulation analysis. This study is expected to increase the sports industry's interest in sports data.

Hyperspectral Remote Sensing for Agriculture in Support of GIS Data

  • Zhang, Bing;Zhang, Xia;Liu, Liangyun;Miyazaki, Sanae;Kosaka, Naoko;Ren, Fuhu
    • Proceedings of the KSRS Conference / 2003.11a / pp.1397-1399 / 2003
  • When, where, and what kind of agricultural products will be produced and supplied to the market? This is a commercial requirement and also an academic question for remote sensing technology. Crop physiology analysis and growth monitoring are important elements of precision agriculture management. Remote sensing technology offers more options for studying these dynamic changes by producing images of different spatial, spectral and temporal resolutions. In particular, hyperspectral remote sensing should play a key role in crop growth investigation at national, regional and global scales. Over the past five years, the Chinese Academy of Sciences and Japan NTT-DATA have made great efforts to establish a prototype information service system that dynamically surveys the vegetable planting situation in the Nagano area of Japan, based mainly on remote sensing data. For this purpose, a flexible, light-duty flight system, a practical data processing system and the necessary background information must be sensibly combined. Several further studies are also important, such as quick pre-processing of hyperspectral data, multi-temporal vegetation index analysis, and hyperspectral image classification supported by GIS data. In this paper, several spectral data analysis models and a purpose-designed airborne platform are presented and discussed.

Observation of the Earth's Magnetic field from KOMPSAT-1

  • Hwang, Jong-Sun;Kim, Sung-Yong;Lee, Seon-Ho;Min, Kyung-Duck;Kim, Jeong-Woo;Lee, Su-Jin
    • Proceedings of the KSRS Conference / 2003.11a / pp.1236-1238 / 2003
  • The Earth's total magnetic field was extracted from on-board TAM (Three Axis Magnetometer) observations of the KOMPSAT-1 satellite between June 19th and 21st, 2000. In the pre-processing, the TAM's telemetry data were transformed from ECI (Earth Centered Inertial frame) to ECEF (Earth Centered Earth Fixed frame) and then to spherical coordinates, and the magnetic field induced by the satellite bus itself was removed using an on-orbit magnetometer data correction method. The 2-D wavenumber correlation filtering and quadrant-swapping methods were applied to the pre-processed data to eliminate dynamic components and track-line noise, respectively. The spherical harmonic coefficients were then calculated from the KOMPSAT-1 data. To test the validity of the TAM's geomagnetic field, the magnetic model of the Danish/NASA/French Ørsted satellite and the IGRF2000 model were used for statistical comparison. The correlation coefficient is 0.97 between Ørsted and TAM and 0.96 between IGRF and TAM. It was found that data from on-board magnetometers used for attitude control of Earth-observing satellites can be used to determine the Earth's total magnetic field, and that they can be used efficiently to upgrade global geomagnetic field coefficients such as IGRF by providing new information at various altitudes with better temporal and spatial coverage.
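
The coordinate chain in the pre-processing step (ECI to ECEF, then to spherical coordinates) can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes the ECI-to-ECEF step is a single rotation about the z-axis by a known Earth rotation angle `theta` (in radians) and ignores precession, nutation and polar motion. The function names are hypothetical.

```python
import math

def eci_to_ecef(vec, theta):
    """Rotate an ECI vector into ECEF by angle theta about the z-axis."""
    x, y, z = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * y, -s * x + c * y, z)

def ecef_to_spherical(vec):
    """Convert an ECEF position to (radius, geocentric latitude, longitude).

    Angles are returned in radians.
    """
    x, y, z = vec
    r = math.sqrt(x * x + y * y + z * z)
    return r, math.asin(z / r), math.atan2(y, x)

def total_field(b):
    """Magnitude of a three-axis magnetometer reading."""
    return math.sqrt(sum(component * component for component in b))

if __name__ == "__main__":
    position_ecef = eci_to_ecef((7000.0, 0.0, 0.0), theta=math.pi / 2)
    print(ecef_to_spherical(position_ecef))
    print(total_field((3.0, 4.0, 0.0)))
```

The same rotation applies to the measured field vector as to the position vector. The total field magnitude is frame-independent, so the rotation matters for locating each sample rather than for its scalar value.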

Endpoint Detection Using Both By-product and Etchant Gas in Plasma Etching Process (플라즈마 식각공정 시 By-product와 Etchant gas를 이용한 식각 종료점 검출)

  • Kim, Dong-Il;Park, Young-Kook;Han, Seung-Soo
    • Journal of IKEEE / v.19 no.4 / pp.541-547 / 2015
  • In current semiconductor manufacturing, as the feature size of integrated circuit (IC) devices continues to shrink, detecting the endpoint in the plasma etching process is more difficult than before. For endpoint detection, various kinds of sensors are installed in semiconductor manufacturing equipment, and sensor data are gathered at a predefined sampling rate. Endpoint detection is generally performed using OES data of the by-product. In this study, OES data of both the by-product and the etchant gas are used to improve the reliability of endpoint detection. For OES data pre-processing, a combination of Signal to Noise Ratio (SNR) and Principal Component Analysis (PCA) is used. Polynomial regression and the Expanded Hidden Markov Model (eHMM) technique are applied to the pre-processed OES data to detect the endpoint.

EDNN based prediction of strength and durability properties of HPC using fibres & copper slag

  • Gupta, Mohit;Raj, Ritu;Sahu, Anil Kumar
    • Advances in concrete construction / v.14 no.3 / pp.185-194 / 2022
  • The construction field has been encouraged to use industrial soil waste or secondary materials to produce cement and concrete, since this decreases the consumption of natural resources. At the same time, the strength and durability properties of such cement and concrete must be analyzed to ensure quality. Existing research methods focus on predicting the strength and other properties of High-Performance Concrete (HPC) with optimization and machine learning algorithms. However, these methods suffer from error and accuracy issues. This research therefore uses an Enhanced Deep Neural Network (EDNN) to predict the strength and durability of HPC. In the proposed work, the data are first gathered. The data are then pre-processed by eliminating missing data and normalizing. Next, features are extracted from the pre-processed data. The data are then input to the EDNN algorithm, which predicts the strength and durability properties of the specific input mix designs. The weight values in the EDNN are initialized using the Switched Multi-Objective Jellyfish Optimization (SMOJO) algorithm. The Gaussian radial function is used as the activation function. In the experimental analysis, the performance of the proposed EDNN is compared with that of the existing DNN, CNN, ANN, and SVM methods based on the RMSE, MAE, MAPE, and R2 metrics. According to these metrics, the proposed EDNN performs better. The effectiveness of the proposed EDNN is also examined based on the accuracy, precision, recall, and F-Measure metrics. The fitness of the proposed SMOJO algorithm is likewise compared with that of the existing JO, GWO, PSO, and GA algorithms. The proposed SMOJO algorithm achieves a higher fitness value than the existing algorithms.
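
The pre-processing and evaluation steps in this abstract can be sketched with the standard library alone. The abstract does not say which normalization was used, so min-max scaling is an assumption here, as are the function names and the row layout (one list of numeric features per sample, with `None` marking a missing value). The RMSE, MAE, MAPE and R2 formulas are the standard definitions.

```python
import math

def drop_missing(rows):
    """Remove every row that contains a missing (None) value."""
    return [row for row in rows if all(v is not None for v in row)]

def min_max_normalize(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (percent) and R2 for paired values.

    MAPE assumes no true value is zero; R2 assumes y_true is not constant.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }

if __name__ == "__main__":
    raw = [[1.0, 10.0], [None, 20.0], [3.0, 30.0]]
    print(min_max_normalize(drop_missing(raw)))
    print(regression_metrics([10.0, 20.0], [12.0, 18.0]))
```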

Aviation Safety Mandatory Report Topic Prediction Model using Latent Dirichlet Allocation (LDA) (잠재 디리클레 할당(LDA)을 이용한 항공안전 의무보고 토픽 예측 모형)

  • Jun Hwan Kim;Hyunjin Paek;Sungjin Jeon;Young Jae Choi
    • Journal of the Korean Society for Aviation and Aeronautics / v.31 no.3 / pp.42-49 / 2023
  • In aviation as in other industries, safety data plays a key role in improving the level of safety performance. By analyzing safety data such as aviation safety reports (text data), hazards can be identified and removed before they lead to a tragic accident. However, raw data (natural language data) collected from each site must first be pre-processed before a proactive or predictive safety management system can use it. As air traffic volume increases, the amount of accumulated data is also rising. There are therefore clear limitations to analyzing the data manually. In this paper, a topic prediction model for aviation safety mandatory reports is proposed. The prediction accuracy of the proposed model was verified using actual aviation safety mandatory report data. The model is meaningful in that it not only effectively supports current analysis of aviation safety mandatory reports but can also be applied to the various data produced in the aviation safety field in the future.