• Title/Summary/Keyword: data pre-processing

Classification of basin characteristics related to inundation using clustering (군집분석을 이용한 침수관련 유역특성 분류)

  • Lee, Han Seung;Cho, Jae Woong;Kang, Ho seon;Hwang, Jeong Geun;Moon, Hae Jin
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.96-96 / 2020
  • To establish inundation risk criteria for typhoons and heavy rainfall, research is underway to predict the limit rainfall from basin characteristics and observed limit rainfall using artificial intelligence algorithms. To improve model performance in estimating the limit rainfall, the learning data are pre-processed before use. When 50.0% of the entire data set was removed as outliers during pre-processing, the accuracy was confirmed to exceed 90%. However, the utilization rate of the learning data is then very low, so diverse basin characteristics cannot be considered. Accordingly, to predict the limit rainfall while reflecting diverse watershed characteristics and raising the utilization rate of the learning data, watersheds with similar characteristics were clustered. The clustering algorithms used are K-Means, Agglomerative, DBSCAN and Spectral Clustering. The K-Means, DBSCAN and Agglomerative algorithms clustered the watersheds according to the impervious area ratio, whereas Spectral Clustering produced clusters of various forms depending on its parameters. If the clustering results are applied to the limit rainfall prediction algorithm, diverse watershed characteristics can be considered and, at the same time, the performance of limit rainfall prediction is expected to improve.

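
The abstract above reports that K-Means grouped the watersheds by impervious area ratio. As a rough illustration only, the sketch below runs a plain Lloyd-style k-means on a single feature. The ratios, the choice of `k=3`, and the function name `kmeans_1d` are hypothetical and are not taken from the paper; a real study would more likely use a library such as scikit-learn on several basin features.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's k-means on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centroids over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical impervious area ratios (%) for eight watersheds.
impervious_ratio = [5.0, 8.0, 12.0, 45.0, 50.0, 55.0, 85.0, 90.0]
centroids, labels = kmeans_1d(impervious_ratio, k=3)
print(labels)
```

On this toy input the watersheds fall into low, medium and high imperviousness groups, and a separate limit rainfall model could then be trained per group.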

Refined fixed granularity algorithm on Networks of Workstations (NOW 환경에서 개선된 고정 분할 단위 알고리즘)

  • Gu, Bon-Geun
    • The KIPS Transactions:PartA / v.8A no.2 / pp.117-124 / 2001
  • On a NOW (Network of Workstations), load sharing plays a very important role in improving performance. The known load sharing strategies are fixed-granularity, variable-granularity and adaptive-granularity. The variable-granularity algorithm is sensitive to various parameters, whereas the Send algorithm, which implements the fixed-granularity strategy, is robust to task granularity, and the performance difference between Send and the variable-granularity algorithm is not substantial. In the Send algorithm, however, computing time and communication time do not overlap, so long network latency affects the execution time of the parallel program. In this paper, we propose the preSend algorithm, in which the master node sends data to the slave nodes in advance, without waiting for partial results from the slaves. Because the master has already sent the next data, the slave nodes can process it without idle time. The preSend algorithm can thus overlap computing time and communication time, which reduces both the influence of long network latency and the execution time of the parallel program on the NOW. To compare the execution times of the two algorithms, we use $320{\times}320$ matrix multiplication. The results show that the preSend algorithm has a shorter execution time than the Send algorithm.

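
To make the overlap argument in the abstract above concrete, here is a toy timing model for a single master and a single slave. It is a sketch under simplifying assumptions that are not from the paper: every task has the same communication time `t_comm` and computing time `t_comp`, result transfer is ignored, and preSend is modelled as sending exactly one task ahead. The function names are hypothetical.

```python
def send_time(n_tasks, t_comm, t_comp):
    """No overlap: each task is sent, then computed, strictly in sequence."""
    return n_tasks * (t_comm + t_comp)


def presend_time(n_tasks, t_comm, t_comp):
    """One-task-ahead overlap: the next task is sent while the current one is computed."""
    start = t_comm  # the first task must arrive before any computing can begin
    for _ in range(n_tasks - 1):
        next_arrival = start + t_comm  # next data is sent as soon as computing starts
        finish = start + t_comp        # current task finishes computing
        start = max(next_arrival, finish)
    return start + t_comp


# Hypothetical costs: 10 tasks, 2 time units to send, 5 to compute.
print(send_time(10, 2.0, 5.0), presend_time(10, 2.0, 5.0))
```

In this model the preSend total is `t_comm + (n_tasks - 1) * max(t_comm, t_comp) + t_comp`, so only the first transfer is fully exposed when computing dominates.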

Food Detection by Fine-Tuning Pre-trained Convolutional Neural Network Using Noisy Labels

  • Alshomrani, Shroog;Aljoudi, Lina;Aljabri, Banan;Al-Shareef, Sarah
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.182-190 / 2021
  • Deep learning is an advanced technology for large-scale data analysis, with numerous promising applications such as image processing and object detection. It has become customary to use transfer learning and fine-tune a pre-trained CNN model for most image recognition tasks. People who take photos and tag them provide a valuable source of data. However, these tags and labels may be noisy, since the people who annotate the images may not be experts. This paper explores the impact of noisy labels on fine-tuning pre-trained CNN models. The effect is measured on a food recognition task using Food101 as a benchmark. Four pre-trained CNN models are included in this study: InceptionV3, VGG19, MobileNetV2 and DenseNet121. Symmetric label noise is added at different ratios. In all cases, models based on DenseNet121 outperformed the other models. When noisy labels were introduced into the data, the performance of all models degraded almost linearly with the amount of added noise.
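
The abstract above does not give the exact noise procedure, so the following is only a sketch of one common definition of symmetric label noise: a fixed fraction of samples is chosen at random and each chosen label is replaced by a different class drawn uniformly. The function name, the seed handling, and the choice to exclude the true class are assumptions.

```python
import random


def add_symmetric_noise(labels, num_classes, noise_ratio, seed=0):
    """Return a copy of labels with a fraction noise_ratio flipped to other classes."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_ratio * len(labels))
    for idx in rng.sample(range(len(labels)), n_flip):
        # Assumption: the replacement is uniform over all classes except the true one.
        others = [c for c in range(num_classes) if c != labels[idx]]
        noisy[idx] = rng.choice(others)
    return noisy


# Food101 has 101 classes; the label list itself is synthetic.
labels = [i % 101 for i in range(1010)]
noisy = add_symmetric_noise(labels, num_classes=101, noise_ratio=0.2)
print(sum(a != b for a, b in zip(labels, noisy)))
```

Sweeping `noise_ratio` over several values and fine-tuning on each noisy copy is one way to obtain an accuracy-versus-noise curve of the kind the abstract describes.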

Design of Moving Object Query Processing Based on UDF (UDF 기반 이동객체 질의 처리 설계 및 구현)

  • Yoo, Kihyun;Yang, Pyoung Woo;Nam, Kwang Woo
    • KIPS Transactions on Software and Data Engineering / v.6 no.2 / pp.85-90 / 2017
  • With recent developments in mobile computing environments, a wide variety of mobile devices have come into use. Mobile devices equipped with GPS have become especially widespread, and various application services that use location information have emerged. In this paper, we propose a system model for storing and managing the trajectories of moving objects, where a trajectory is the set of location records of a moving object acquired over continuous time. We also propose two methods for quickly querying large moving-object data sets: a UDF (User-Defined Function) based trajectory index method and a Pre-Materialized table method. We then compare and evaluate the performance of each method through experiments. The experimental results show that the Pre-Materialized table method is about 1.2 times faster than the UDF-based trajectory index method in execution time.
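
The abstract above does not describe the schema or the DBMS, so the sketch below only illustrates the general trade-off between evaluating a user-defined function at query time and querying a table whose derived values were computed in advance. It uses Python's built-in `sqlite3` module. The table names, columns, sample points, and the `dist_from_origin` function are all hypothetical, and this does not reproduce the paper's trajectory index.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (obj_id INTEGER, t INTEGER, x REAL, y REAL)")
rows = [(1, 0, 0.0, 0.0), (1, 1, 3.0, 4.0), (2, 0, 10.0, 10.0), (2, 1, 6.0, 8.0)]
conn.executemany("INSERT INTO points VALUES (?, ?, ?, ?)", rows)


def dist_from_origin(x, y):
    """Euclidean distance of a point from (0, 0)."""
    return math.hypot(x, y)


# Approach A: register a UDF and evaluate it for every row at query time.
conn.create_function("dist_from_origin", 2, dist_from_origin)
udf_result = conn.execute(
    "SELECT obj_id, t FROM points WHERE dist_from_origin(x, y) <= 5.0 "
    "ORDER BY obj_id, t"
).fetchall()

# Approach B: compute the derived value once, store it, and index it.
conn.execute(
    "CREATE TABLE points_mat AS "
    "SELECT obj_id, t, dist_from_origin(x, y) AS d FROM points"
)
conn.execute("CREATE INDEX idx_points_mat_d ON points_mat (d)")
mat_result = conn.execute(
    "SELECT obj_id, t FROM points_mat WHERE d <= 5.0 ORDER BY obj_id, t"
).fetchall()

print(udf_result, mat_result)
```

Both queries return the same rows. The pre-computed table pays its cost at load time and needs refreshing when new points arrive, while the UDF pays on every query.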

An Efficient Pre-computing Method for Processing Continuous Skyline Queries in Road Networks (도로망에서 연속적인 스카이라인 절의처리를 위한 효율적인 전처리기법)

  • Jang, Su-Min;Yoo, Jae-Soo
    • Journal of KIISE:Databases / v.36 no.4 / pp.314-320 / 2009
  • Skyline queries have recently received considerable attention in search services. The skyline contains the interesting objects that are not dominated by any other object on all dimensions. Many related works have processed skylines on static data or on moving objects in Euclidean space. In contrast, this paper assumes that the query point of a skyline query moves continuously in a road network. We propose a new method that efficiently processes continuous skyline queries in road networks using pre-computed shortest range data of objects. Our experiments show that the proposed method is about 100 times faster than previous methods in terms of query processing time.
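
The dominance relation behind the skyline in the abstract above can be sketched with a naive quadratic scan. This illustrates the definition only and is not the paper's pre-computing method. The example objects, described by network distance and price, are hypothetical, and smaller values are assumed to be better in every dimension.

```python
def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objects as (network distance in km, price).
objects = [(1.0, 90), (2.0, 60), (3.0, 70), (4.0, 40), (5.0, 50)]
print(skyline(objects))  # [(1.0, 90), (2.0, 60), (4.0, 40)]
```

In a road network the distance component changes as the query point moves, which is why a continuous query would otherwise have to recompute the skyline repeatedly.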

A Study on Heavy Rainfall Guidance Realized with the Aid of Neuro-Fuzzy and SVR Algorithm Using AWS Data (AWS자료 기반 SVR과 뉴로-퍼지 알고리즘 구현 호우주의보 가이던스 연구)

  • Kim, Hyun-Myung;Oh, Sung-Kwun;Kim, Yong-Hyuk;Lee, Yong-Hee
    • The Transactions of The Korean Institute of Electrical Engineers / v.63 no.4 / pp.526-533 / 2014
  • In this study, we introduce a design methodology for developing guidance for issuing heavy rainfall warnings using both RBFNN (radial basis function neural network) and SVR (support vector regression) models, and then carry out a comparative study of the two pattern classifiers. Each classifier is designed as an architecture realized with the aid of optimization and pre-processing algorithms. Because the predictive performance of existing heavy rainfall forecast systems is commonly affected by the diverse processing techniques applied to meteorological data, under-sampling is used to pre-process the input data. In addition, data discretization and feature extraction are used for SVR, and FCM clustering and PSO are used for the RBFNNs. Observed AWS (Automatic Weather Station) data supplied by the KMA (Korea Meteorological Administration) are used for training and testing the proposed classifiers. The proposed classifiers provide the information needed to issue a heavy rain warning 1 to 3 hours in advance, using the selected meteorological data and the precipitation accumulated over 1 to 12 hours from the AWS data. To evaluate the performance of each classifier, the ETS (Equitable Threat Score) is used as the standard verification measure of predictive ability. The comparative study of the two classifiers shows that the neuro-fuzzy method is effective in improving performance and yields stable predictive results for the guidance on issuing heavy rainfall warnings.
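
The ETS named in the abstract above is computed from a 2x2 contingency table of forecasts against observations. The sketch below uses the standard definition, in which hits expected by chance are subtracted from the threat score. The counts in the example are made up for illustration and are not results from the paper.

```python
def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """ETS = (H - H_rand) / (H + M + FA - H_rand), H_rand = (H + M)(H + FA) / N."""
    total = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    # Return 0.0 when the score is undefined (no events forecast or observed).
    return (hits - hits_random) / denominator if denominator else 0.0


# Hypothetical verification counts for a heavy-rain warning classifier.
ets = equitable_threat_score(hits=30, misses=10, false_alarms=20, correct_negatives=140)
print(ets)
```

A perfect forecast scores 1 and a forecast with no skill beyond chance scores 0, so the measure does not reward a classifier for simply predicting the majority "no warning" class.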

Pre-processing Scheme for Indoor Precision Tracking Based on Beacon (비콘 기반 실내 정밀 트래킹을 위한 전처리 기법)

  • Hwang, Yu Min;Jung, Jun Hee;Shim, Issac;Kim, Tae Woo;Kim, Jin Young
    • Journal of Satellite, Information and Communications / v.11 no.4 / pp.58-62 / 2016
  • In this paper, we propose a pre-processing scheme for improving indoor positioning accuracy in impulsive noise channel environments. Impulsive noise can be generated by multi-path fading caused by complicated indoor structures or by interference, and it increases the demodulation error probability. The proposed pre-processing scheme is performed before the triangulation method that calculates the user's position, and it provides the triangulation method with reliable input data demodulated from the received signal. To this end, we studied and proposed an adaptive threshold function, based on wavelet denoising, for mitigating the impulsive noise. Computer simulations of the proposed scheme confirmed that Bit Error Rate and Signal-to-Noise Ratio performance is improved compared to conventional schemes.
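
The abstract above does not specify the adaptive threshold function, so the sketch below shows only the conventional baseline that wavelet denoising usually starts from: a one-level Haar transform with a fixed soft threshold applied to the detail coefficients. The function names, the sample signal, and the threshold value are assumptions, and this is not the paper's scheme.

```python
import math


def haar_forward(signal):
    """One-level orthonormal Haar transform (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr; small coefficients become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def denoise(signal, thr):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, thr))


# Hypothetical received samples with small fluctuations around two levels.
print(denoise([4.0, 4.2, 6.0, 5.8], thr=1.0))
```

An adaptive scheme would replace the fixed `thr` with a value chosen from the signal itself.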

PC-based Processing of Shallow Marine Multi-channel Seismic Data (PC기반의 천해저 다중채널 탄성파 자료의 전산처리)

  • 공영세;김국주
    • 한국해양학회지 / v.30 no.2 / pp.116-124 / 1995
  • Shallow marine seismic data have been acquired and processed with a newly developed multi-channel (6-channel), PC-based digital recording and processing system. The digital processing system includes pre-processing, swell-compensation filtering, frequency filtering, gain correction, deconvolution, stacking, migration, and plotting. The quality of the processed sections is greatly enhanced in terms of signal-to-noise ratio and vertical/horizontal resolution. The multi-channel digital recording, acquisition and processing system proved to be an economical, efficient and easy-to-use tool for shallow marine seismic surveys.


Research on Data Tuning Methods to Improve the Anomaly Detection Performance of Industrial Control Systems (산업제어시스템의 이상 탐지 성능 개선을 위한 데이터 보정 방안 연구)

  • JUN, SANGSO;Lee, Kyung-ho
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.4 / pp.691-708 / 2022
  • As machine learning and deep learning technologies have become common, they have begun to be applied to research on anomaly detection in industrial control systems. In Korea, the HAI dataset was developed and published to stimulate artificial intelligence research on anomaly detection in industrial control systems, and an AI contest on detecting industrial control system security threats is being held. Most anomaly detection studies have aimed to create a learning model with improved performance through ensemble model methods, applied either by modifying an existing deep learning algorithm or by combining it with other algorithms. In this study, rather than changing the learning algorithm or the data pre-processing, we sought to improve anomaly detection performance with a post-processing method that detects abnormal data and corrects the labeling results. As a result, it was confirmed that performance improved by about 10% or more compared with the anomaly detection performance of the existing model.

Segmentation of underwater images using morphology for deep learning (딥러닝을 위한 모폴로지를 이용한 수중 영상의 세그먼테이션)

  • Ji-Eun Lee;Chul-Won Lee;Seok-Joon Park;Jea-Beom Shin;Hyun-Gi Jung
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.370-376 / 2023
  • In underwater images, the shape of the target is hard to distinguish because of underwater noise and low resolution. In addition, underwater images used as input to deep learning require pre-processing, and segmentation must be performed first. Even after pre-processing, the target may remain unclear, and the performance of detection and identification by deep learning may not be high. Therefore, the target needs to be distinguished and clarified. In this study, we confirm the importance of target shadows in underwater images, perform object detection and target area acquisition using shadows, and generate data that contain only the shapes of targets and shadows, without the underwater background. We present the process of converting the shadow image into a 3-mode image in which the target is white, the shadow is black, and the background is gray. This provides a clearly pre-processed, easily discriminated image as input to deep learning. In addition, when image processing code based on the Open Source Computer Vision (OpenCV) library was used, the processing speed was suitable for real-time processing.
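
The 3-mode encoding described in the abstract above can be sketched as a simple mapping from two binary masks to three gray levels. The sketch assumes the target and shadow masks have already been obtained by some earlier segmentation step. It uses plain nested lists rather than OpenCV so that it is self-contained, and the gray value 128 and the function name are assumptions.

```python
WHITE, GRAY, BLACK = 255, 128, 0  # target, background, shadow (128 is an assumed mid-gray)


def to_three_mode(target_mask, shadow_mask):
    """Combine two binary masks into an image with target=white, shadow=black, rest=gray."""
    image = []
    for target_row, shadow_row in zip(target_mask, shadow_mask):
        row = []
        for is_target, is_shadow in zip(target_row, shadow_row):
            if is_target:
                row.append(WHITE)
            elif is_shadow:
                row.append(BLACK)
            else:
                row.append(GRAY)
        image.append(row)
    return image


# Hypothetical 3x4 masks: a small target with its shadow to the right.
target = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
shadow = [[0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 0, 0]]
three_mode = to_three_mode(target, shadow)
print(three_mode)
```

Where a pixel is marked in both masks, this sketch gives priority to the target. The abstract does not say how such overlaps are resolved.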