• Title/Abstract/Keywords: Preprocessed data

Search results: 188 items (processing time: 0.029 s)

Activity recognition of stroke-affected people using wearable sensor

  • Anusha David;Rajavel Ramadoss;Amutha Ramachandran;Shoba Sivapatham
    • ETRI Journal
    • /
    • Vol. 45, No. 6
    • /
    • pp.1079-1089
    • /
    • 2023
  • Stroke is one of the leading causes of long-term disability worldwide, placing huge burdens on individuals and society. Further, automatic human activity recognition is a challenging task that is vital to the future of healthcare and physical therapy. Using a baseline long short-term memory recurrent neural network, this study provides a novel dataset of stretching, upward stretching, flinging motions, hand-to-mouth movements, swiping gestures, and pouring motions for improved model training and testing of stroke-affected patients. A MATLAB application is used to output textual and audible prediction results. A wearable sensor with a triaxial accelerometer is used to collect preprocessed real-time data. The model is trained with features extracted from the actual patient to recognize new actions, and the recognition accuracy provided by multiple datasets is compared based on the same baseline model. When training and testing using the new dataset, the baseline model shows recognition accuracy that is 11% higher than the Activity Daily Living dataset, 22% higher than the Activity Recognition Single Chest-Mounted Accelerometer dataset, and 10% higher than another real-world dataset.

Data Processing of AutoML-based Classification Models for Improving Performance in Unbalanced Classes

  • 이동준;강지수;정경용
    • Journal of Convergence for Information Technology
    • /
    • Vol. 11, No. 6
    • /
    • pp.49-54
    • /
    • 2021
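  • With recent advances in smart healthcare technology, interest in common everyday diseases is growing. Accordingly, a growing number of studies use predictive models to analyze or predict diseases from healthcare data. However, healthcare data are imbalanced between positive and negative samples. This occurs because people without a given disease far outnumber patients with it, which makes data collection difficult. Because data imbalance affects the performance of models used for disease prediction and detection, it needs to be removed. Therefore, this study resolves the data imbalance through oversampling and missing-value imputation. The performance of several models is evaluated using AutoML, and the top three models are combined into an ensemble.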
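
The two preprocessing steps named in this abstract can be sketched as follows. The abstract does not say which imputation or oversampling method the authors used, so this is a minimal illustration that assumes column-mean imputation and random oversampling with replacement; the function names `impute_mean` and `random_oversample` are hypothetical, and the AutoML and ensemble stages are not reproduced.

```python
import random
from collections import Counter


def impute_mean(rows):
    """Replace None in each column with the mean of that column's observed values.

    Assumes every column has at least one observed (non-None) value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = [list(r) for r in rows], list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(list(rng.choice(pool)))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: one positive sample, three negative samples.
rows = [[1.0, None], [3.0, 4.0], [5.0, 8.0], [7.0, 6.0]]
labels = [1, 0, 0, 0]
filled = impute_mean(rows)
balanced_rows, balanced_labels = random_oversample(filled, labels)
print(Counter(balanced_labels))
```

Imputing before oversampling keeps the duplicated rows free of missing values; in practice both steps would be fitted on the training split only.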

APPLICATION OF SIR-C DATA FOR EXPLORATION OF MINERALIZED ZONES (HWANGGANG-RI, KOREA)

  • Jiang, Wei W.;Park, S.W.;Park, Jeong-Ho;Lee, Cahng-Won;Kim, Duk-Jin;So, Byung-Han;So, C. S.;Moon, Wooil M.
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 1999 Proceedings of International Symposium on Remote Sensing
    • /
    • pp.158-164
    • /
    • 1999
  • This paper investigates and evaluates NASA's Shuttle Imaging Radar-C (SIR-C) multiple-frequency SAR data for differential microwave backscattering effects from the surface geological materials overlying skarn-type mineralization. Although an integrated approach to mineral exploration is more cost-effective and is widely used, many technical and scientific issues remain to be investigated. In this study, we reprocessed several sets of previously surveyed exploration data and experimented with fuzzy logic digital fusion of the preprocessed data with respect to chosen exploration targets. Among the numerous fuzzy logic operators currently available for a data-driven integrated exploration strategy, we used varying combinations of the fuzzy MIN, fuzzy MAX, and fuzzy SUM operators along with the Gamma operator to fuse the exploration data, including the contact metamorphic zone information. The final exploration target tested was a skarn-type W-Mo-F mineralization in the study area. The fuzzy-logic-derived mineral potential anomaly almost exactly matched the differential backscattering anomalies in the C-band and L-band SIR-C data when the two were overlaid. Although this high degree of correlation between the two data sets is remarkable, the differential backscattering anomaly over the skarn-type W-Mo-F mineralization in the study area requires further investigation.
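
The fuzzy operators named in this abstract can be illustrated per map cell as below. The abstract gives no formulas, so this sketch assumes the definitions commonly used in fuzzy-logic mineral potential mapping: "fuzzy SUM" is read as the fuzzy algebraic sum, and the Gamma operator combines the algebraic sum and algebraic product. The membership values are hypothetical.

```python
def fuzzy_min(memberships):
    """Fuzzy AND: the most pessimistic evidence layer decides."""
    return min(memberships)


def fuzzy_max(memberships):
    """Fuzzy OR: the most optimistic evidence layer decides."""
    return max(memberships)


def fuzzy_algebraic_product(memberships):
    """Product of memberships; never larger than the smallest input."""
    result = 1.0
    for mu in memberships:
        result *= mu
    return result


def fuzzy_algebraic_sum(memberships):
    """1 - prod(1 - mu_i); never smaller than the largest input."""
    complement = 1.0
    for mu in memberships:
        complement *= 1.0 - mu
    return 1.0 - complement


def fuzzy_gamma(memberships, gamma):
    """Gamma operator: (algebraic sum)^gamma * (algebraic product)^(1 - gamma), 0 <= gamma <= 1."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return (fuzzy_algebraic_sum(memberships) ** gamma
            * fuzzy_algebraic_product(memberships) ** (1.0 - gamma))


# Hypothetical memberships of one cell in three evidence layers.
cell = [0.8, 0.6, 0.7]
print(fuzzy_min(cell), fuzzy_max(cell), fuzzy_gamma(cell, 0.9))
```

With `gamma = 1` the Gamma operator reduces to the algebraic sum and with `gamma = 0` to the algebraic product, so intermediate values trade off the "increasive" and "decreasive" tendencies of the two.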

A Machine Learning Model for Predicting Silica Concentrations through Time Series Analysis of Mining Data

  • 이승훈;윤연아;정진형;심현수;장태우;김용수
    • Journal of Korean Society for Quality Management
    • /
    • Vol. 48, No. 3
    • /
    • pp.511-520
    • /
    • 2020
  • Purpose: The purpose of this study was to devise an accurate machine learning model for predicting silica concentrations following the addition of impurities, through time series analysis of mining data. Methods: The mining data were preprocessed and subjected to time series analysis using the machine learning model. Through correlation analysis, valid variables were selected and meaningless variables were excluded. To reflect changes over time, dependent variables at baseline were treated as independent variables at later time points. The relationship between the independent variables and the dependent variable n time points later was examined by Pearson correlation analysis. Results: The correlation ($R^2$) was strongest after 3 hours, so the value 3 hours later was adopted as the dependent variable. According to the root mean square error (RMSE), the proposed method was superior to the other machine learning methods, and the XGBoost algorithm showed the best predictive performance. Conclusion: This study is important given the current lack of machine learning studies pertaining to the domestic mining industry. In addition, applying time series analysis to mining data is expected to yield further improvement. Before a predictive model is established with the proposed method, predictions should be made using data with time series characteristics. This approach should also improve prediction accuracy in other domains.
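
The lag selection described in the Methods can be sketched as below: each candidate lag pairs a feature at time t with the target at time t + lag, and the lag with the strongest Pearson correlation is kept. This is an illustration only; the helper names and the toy series are hypothetical, and the sampling interval of the real mining data is not stated in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(feature, target, lag):
    """Correlate feature[t] with target[t + lag] for a positive integer lag."""
    return pearson(feature[:-lag], target[lag:])


def best_lag(feature, target, max_lag):
    """Return the lag in 1..max_lag with the largest absolute correlation."""
    return max(range(1, max_lag + 1),
               key=lambda k: abs(lagged_correlation(feature, target, k)))


# Hypothetical series: the target repeats the feature two steps later.
feature = [1, 3, 2, 5, 4, 6, 8, 7]
target = [0, 0] + feature[:-2]
print(best_lag(feature, target, max_lag=3))
```

Squaring the selected correlation gives an $R^2$ value comparable to the one the abstract reports for the 3-hour horizon.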

Prediction of ship power based on variation in deep feed-forward neural network

  • Lee, June-Beom;Roh, Myung-Il;Kim, Ki-Su
    • International Journal of Naval Architecture and Ocean Engineering
    • /
    • Vol. 13, No. 1
    • /
    • pp.641-649
    • /
    • 2021
  • Fuel oil consumption (FOC) must be minimized to determine the economic route of a ship; hence, the ship power must be predicted prior to route planning. For this purpose, a numerical method using test results of a model has been widely used. However, predicting ship power using this method is challenging owing to the uncertainty of the model test. An onboard test should be conducted to solve this problem; however, it requires considerable resources and time. Therefore, in this study, a deep feed-forward neural network (DFN) is used to predict ship power using deep learning methods that involve data pattern recognition. To use data in the DFN, the input data and a label (output of prediction) should be configured. In this study, the input data are configured using ocean environmental data (wave height, wave period, wave direction, wind speed, wind direction, and sea surface temperature) and the ship's operational data (draft, speed, and heading). The ship power is selected as the label. In addition, various treatments have been used to improve the prediction accuracy. First, ocean environmental data related to wind and waves are preprocessed using values relative to the ship's velocity. Second, the structure of the DFN is changed based on the characteristics of the input data. Third, the prediction accuracy is analyzed using a combination comprising five hyperparameters (number of hidden layers, number of hidden nodes, learning rate, dropout, and gradient optimizer). Finally, k-means clustering is performed to analyze the effect of the sea state and ship operational status by categorizing it into several models. The performances of various prediction models are compared and analyzed using the DFN in this study.
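
The first treatment, expressing wind and wave data relative to the ship's velocity, might look like the sketch below. The abstract does not give the formulas, so this is one plausible reading rather than the authors' method: it assumes compass angles in degrees, the meteorological convention that wind direction is the direction the wind comes from, and wind and ship speeds in the same unit. The function names are hypothetical.

```python
import math


def relative_angle_deg(direction_deg, heading_deg):
    """Direction measured from the ship's heading, folded into [-180, 180)."""
    return (direction_deg - heading_deg + 180.0) % 360.0 - 180.0


def relative_wind(wind_speed, wind_from_deg, ship_speed, heading_deg):
    """Apparent wind seen from the moving ship.

    Returns (speed, angle), where angle is the direction the apparent wind
    comes from, relative to the bow (0 means a head wind).
    """
    # True wind velocity vector (north, east), pointing where the wind blows to.
    toward = math.radians(wind_from_deg + 180.0)
    wind_n, wind_e = wind_speed * math.cos(toward), wind_speed * math.sin(toward)
    # Ship velocity vector (north, east).
    heading = math.radians(heading_deg)
    ship_n, ship_e = ship_speed * math.cos(heading), ship_speed * math.sin(heading)
    # Wind velocity in the ship's frame of reference.
    rel_n, rel_e = wind_n - ship_n, wind_e - ship_e
    speed = math.hypot(rel_n, rel_e)
    from_deg = math.degrees(math.atan2(-rel_e, -rel_n)) % 360.0
    return speed, relative_angle_deg(from_deg, heading_deg)


# Hypothetical case: ship heading north at 5 m/s into a 10 m/s northerly wind.
print(relative_wind(10.0, 0.0, 5.0, 0.0))
```

For wave direction, which has no speed to subtract, `relative_angle_deg` alone converts an absolute direction into an encounter angle relative to the heading.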

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems
    • /
    • Vol. 20, No. 3
    • /
    • pp.375-390
    • /
    • 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service for user experience; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training large-scale data or overly complex deep-learning network models. A lightweight deep-learning model was proposed to address these challenges. First, normalization on the user side was used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) was used to supplement the data samples, which effectively alleviated the imbalance caused by under-represented sample types while occupying only a small amount of edge-computing resources. Finally, a trained lightweight deep-learning network model was deployed on the edge side, and the preprocessed and expanded local data were used to fine-tune the trained model. This ensures that the data of each edge node are more consistent with the local characteristics, effectively improving the system's detection ability. In the designed lightweight deep-learning network model, two sets of convolutional and pooling layers of a convolutional neural network (CNN) were used to extract spatial features. A bidirectional long short-term memory network (BiLSTM) was used to capture time-sequence features, and the weights of traffic features were adjusted through an attention mechanism, improving the model's ability to identify abnormal traffic features. The proposed model was evaluated experimentally on the NSL-KDD, UNSW-NB15, and CIC-IDS2018 datasets. The accuracies of the proposed model on the three datasets were as high as 0.974, 0.925, and 0.953, respectively, showing superior accuracy to other comparative models. The proposed lightweight deep-learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.

Water level forecasting for extended lead times using preprocessed data with variational mode decomposition: A case study in Bangladesh

  • Shabbir Ahmed Osmani;Roya Narimani;Hoyoung Cha;Changhyun Jun;Md Asaduzzaman Sayef
    • Korea Water Resources Association: Conference Proceedings
    • /
    • Korea Water Resources Association, 2023 Conference
    • /
    • pp.179-179
    • /
    • 2023
  • This study proposes a new approach to water level forecasting for extended lead times that preprocesses the original data with variational mode decomposition (VMD). Two machine learning algorithms, light gradient boosting machine (LGBM) and random forest (RF), were used to forecast water levels at extended lead times (i.e., 5, 10, 15, 20, 25, 30, 40, and 50 days). First, the original data at two water level stations (i.e., SW173 and SW269 in Bangladesh) and their VMD-decomposed data were prepared with antecedent lag times to build datasets for the different lead times. Mean absolute error (MAE), root mean squared error (RMSE), and mean squared error (MSE) were used to evaluate the performance of the machine learning models. The results show that errors were lower when the decomposed datasets were used to predict water levels than when the original data were used alone. LGBM also produced lower MAE, RMSE, and MSE values than RF, indicating better performance. For instance, at the SW173 station with a 30-day lead time, LGBM achieved MAE values of 0.511 and 1.566 on the decomposed and original data, respectively, compared with 0.719 and 1.644 for RF. Model performance decreased as lead time increased. In summary, preprocessing the original data and applying machine learning models to the decomposed data showed promising results for water level forecasting at longer lead times. The approach is expected to help water management authorities take precautionary measures based on forecasted water levels, which is crucial for sustainable water resource utilization.
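
The dataset layout and the three error measures named in this abstract can be sketched as follows. The abstract does not specify how many antecedent lags were used, so `n_lags` and the toy series are hypothetical, and the VMD step and the LGBM and RF models are not reproduced here.

```python
import math


def make_supervised(series, n_lags, lead):
    """Pair the n_lags most recent values with the value `lead` steps ahead."""
    samples = []
    for t in range(n_lags - 1, len(series) - lead):
        samples.append((series[t - n_lags + 1 : t + 1], series[t + lead]))
    return samples


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the water level."""
    return math.sqrt(mse(observed, predicted))


# Hypothetical daily water levels: two antecedent lags, two-day lead time.
levels = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
samples = make_supervised(levels, n_lags=2, lead=2)
print(samples[0])
print(mae([1, 2, 3], [1, 2, 5]), rmse([1, 2, 3], [1, 2, 5]))
```

Longer lead times leave fewer usable samples and a weaker link between inputs and target, which is consistent with the reported drop in performance as lead time grows.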

Virtual Metrology for predicting $SiO_2$ Etch Rate Using Optical Emission Spectroscopy Data

  • Kim, Boom-Soo;Kang, Tae-Yoon;Chun, Sang-Hyun;Son, Seung-Nam;Hong, Sang-Jeen
    • Korean Vacuum Society: Conference Proceedings
    • /
    • Korean Vacuum Society, 2009 38th Winter Conference Abstracts
    • /
    • pp.464-464
    • /
    • 2010
  • To maintain high stability and production yield of production equipment in a semiconductor fab, on-line monitoring of wafers is required, and semiconductor manufacturers are therefore investigating a software-based process control scheme known as virtual metrology (VM). As semiconductor technology develops, the cost of fabrication tools and facilities has reached its budget limit, and reducing metrology cost can help contain semiconductor manufacturing cost. By virtue of prediction, VM enables wafer-level control (or even site-level control), reduces within-lot variability, and increases process capability, $C_{pk}$. In this research, we applied VM to $SiO_2$ etch rate using optical emission spectroscopy (OES) data acquired in situ while the process parameters were simultaneously correlated. To build a process model of the $SiO_2$ via, we first performed a series of etch runs according to a statistical design of experiments (DOE). OES data were automatically logged together with etch rate, and the OES spectra that correlated with $SiO_2$ etch rate were selected. Once the features of the OES data were selected, the preprocessed OES spectra were used for in-situ sensor-based VM modeling. An ICP-RIE system operating at 13.56 MHz, manufactured by Plasmart, Ltd., was employed in this experiment, with a single optical fiber attached for in-situ OES data acquisition. Before statistical feature selection was applied, empirical feature selection of the OES data was performed first to avoid statistically misleading results caused by random noise or by large variation in responses that are insignificantly correlated with the process itself. The accuracy of the proposed VM still needs to be improved before it can replace existing metrology, but VM can undoubtedly support the engineering "go or no-go" decision in the consecutive processing step.

A Melon Fruit Grading Machine Using a Miniature VIS/NIR Spectrometer: 1. Calibration Models for the Prediction of Soluble Solids Content and Firmness

  • Suh, Sang-Ryong;Lee, Kyeong-Hwan;Yu, Seung-Hwa;Shin, Hwa-Sun;Choi, Young-Soo;Yoo, Soo-Nam
    • Journal of Biosystems Engineering
    • /
    • Vol. 37, No. 3
    • /
    • pp.166-176
    • /
    • 2012
  • Purpose: This study was conducted to investigate the potential of the interactance mode of NIR spectroscopy for estimating the soluble solids content (SSC) and firmness of muskmelons. Methods: Melon samples were taken from local greenhouses in three different harvesting seasons (experiments 1, 2, and 3). The fruit attributes were measured at the 6 points on the equator of each sample where the spectral data were collected. Prediction models were developed using the original spectral data and the spectral data sets preprocessed by 20 methods, and the performance of the models was compared. Results: In the prediction of SSC, the highest cross-validation coefficient of determination ($R_{cv}^2$) was 0.755 (standard error of prediction, SEP = $0.89\,^{\circ}\mathrm{Brix}$), obtained with range normalization as preprocessing in experiment 1. The highest coefficient of determination in the robustness tests, $R_{rt}^2$ = 0.650 (SEP = $1.03\,^{\circ}\mathrm{Brix}$), was found when the best model of experiment 3 was evaluated with the data set of experiment 2. The best $R_{cv}^2$ for the prediction of firmness was 0.715 (SEP = 3.63 N), obtained when no preprocessing was applied in experiment 1. The highest $R_{rt}^2$ was 0.404 (SEP = 5.30 N), obtained when the best model of experiment 3 was applied to the data set of experiment 1. Conclusions: The test results indicate that the interactance mode of VIS/NIR spectroscopy has great potential for measuring the SSC and firmness of thick-skinned muskmelons.

Geovisualization of Coastal Ocean Model Data Using Web Services and Smartphone Apps

  • 김형우;구본호;우승범;이호상;이양원
    • Spatial Information Research
    • /
    • Vol. 22, No. 2
    • /
    • pp.63-71
    • /
    • 2014
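  • The marine leisure sports industry has recently emerged as a blue ocean. Because marine leisure sports are affected by various environmental conditions such as tidal currents, water temperature, and salinity, model forecast data are needed in addition to observational data. In this study, we implemented a visualization system that delivers forecast data produced by the Finite Volume Coastal Ocean Model (FVCOM) through the web and smartphones. To this end, we preprocessed the FVCOM data with interpolation and sampling to build raster images of tide level, water temperature, and salinity and a vector database of tidal currents (speed and direction), and we built a web service that provides a REST (Representational State Transfer) based API (Application Programming Interface) using the Spring Framework. In addition, by deploying the database data to desktops and heterogeneous smartphones, we realized a cross-platform visualization environment.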