• Title/Summary/Keyword: long-term memory

Search Result 782, Processing Time 0.026 seconds

KOSPI index prediction using topic modeling and LSTM

  • Jin-Hyeon Joo;Geun-Duk Park
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.7
    • /
    • pp.73-80
    • /
    • 2024
  • In this paper, we proposes a method to improve the accuracy of predicting the Korea Composite Stock Price Index (KOSPI) by combining topic modeling and Long Short-Term Memory (LSTM) neural networks. In this paper, we use the Latent Dirichlet Allocation (LDA) technique to extract ten major topics related to interest rate increases and decreases from financial news data. The extracted topics, along with historical KOSPI index data, are input into an LSTM model to predict the KOSPI index. The proposed model has the characteristic of predicting the KOSPI index by combining the time series prediction method by inputting the historical KOSPI index into the LSTM model and the topic modeling method by inputting news data. To verify the performance of the proposed model, this paper designs four models (LSTM_K model, LSTM_KNS model, LDA_K model, LDA_KNS model) based on the types of input data for the LSTM and presents the predictive performance of each model. The comparison of prediction performance results shows that the LSTM model (LDA_K model), which uses financial news topic data and historical KOSPI index data as inputs, recorded the lowest RMSE (Root Mean Square Error), demonstrating the best predictive performance.

A Method for Generating Malware Countermeasure Samples Based on Pixel Attention Mechanism

  • Xiangyu Ma;Yuntao Zhao;Yongxin Feng;Yutao Hu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.2
    • /
    • pp.456-477
    • /
    • 2024
  • With information technology's rapid development, the Internet faces serious security problems. Studies have shown that malware has become a primary means of attacking the Internet. Therefore, adversarial samples have become a vital breakthrough point for studying malware. By studying adversarial samples, we can gain insights into the behavior and characteristics of malware, evaluate the performance of existing detectors in the face of deceptive samples, and help to discover vulnerabilities and improve detection methods for better performance. However, existing adversarial sample generation methods still need help regarding escape effectiveness and mobility. For instance, researchers have attempted to incorporate perturbation methods like Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and others into adversarial samples to obfuscate detectors. However, these methods are only effective in specific environments and yield limited evasion effectiveness. To solve the above problems, this paper proposes a malware adversarial sample generation method (PixGAN) based on the pixel attention mechanism, which aims to improve adversarial samples' escape effect and mobility. The method transforms malware into grey-scale images and introduces the pixel attention mechanism in the Deep Convolution Generative Adversarial Networks (DCGAN) model to weigh the critical pixels in the grey-scale map, which improves the modeling ability of the generator and discriminator, thus enhancing the escape effect and mobility of the adversarial samples. The escape rate (ASR) is used as an evaluation index of the quality of the adversarial samples. The experimental results show that the adversarial samples generated by PixGAN achieve escape rates of 97%, 94%, 35%, 39%, and 43% on the Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Convolutional Neural Network and Recurrent Neural Network (CNN_RNN), and Convolutional Neural Network and Long Short Term Memory (CNN_LSTM) algorithmic detectors, respectively.

Comparison of regression model and LSTM-RNN model in predicting deterioration of prestressed concrete box girder bridges

  • Gao Jing;Lin Ruiying;Zhang Yao
    • Structural Engineering and Mechanics
    • /
    • v.91 no.1
    • /
    • pp.39-47
    • /
    • 2024
  • Bridge deterioration shows the change of bridge condition during its operation, and predicting bridge deterioration is important for implementing predictive protection and planning future maintenance. However, in practical application, the raw inspection data of bridges are not continuous, which has a greater impact on the accuracy of the prediction results. Therefore, two kinds of bridge deterioration models are established in this paper: one is based on the traditional regression theory, combined with the distribution fitting theory to preprocess the data, which solves the problem of irregular distribution and incomplete quantity of raw data. Secondly, based on the theory of Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN), the network is trained using the raw inspection data, which can realize the prediction of the future deterioration of bridges through the historical data. And the inspection data of 60 prestressed concrete box girder bridges in Xiamen, China are used as an example for validation and comparative analysis, and the results show that both deterioration models can predict the deterioration of prestressed concrete box girder bridges. The regression model shows that the bridge deteriorates gradually, while the LSTM-RNN model shows that the bridge keeps great condition during the first 5 years and degrades rapidly from 5 years to 15 years. Based on the current inspection database, the LSTM-RNN model performs better than the regression model because it has smaller prediction error. With the continuous improvement of the database, the results of this study can be extended to other bridge types or other degradation factors can be introduced to improve the accuracy and usefulness of the deterioration model.

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems
    • /
    • v.20 no.3
    • /
    • pp.375-390
    • /
    • 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service for user experience; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training large-scale data or overly complex deep-learning network models. A lightweight deep-learning model was proposed to address these challenges. First, normalization on the user side was used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) was used to supplement the data samples, which effectively alleviates the imbalance issue of a few types of samples while occupying a small amount of edge-computing resources. Finally, a trained lightweight deep learning network model is deployed on the edge side, and the preprocessed and expanded local data are used to fine-tune the trained model. This ensures that the data of each edge node are more consistent with the local characteristics, effectively improving the system's detection ability. In the designed lightweight deep learning network model, two sets of convolutional pooling layers of convolutional neural networks (CNN) were used to extract spatial features. The bidirectional long short-term memory network (BiLSTM) was used to collect time sequence features, and the weight of traffic features was adjusted through the attention mechanism, improving the model's ability to identify abnormal traffic features. The proposed model was experimentally demonstrated using the NSL-KDD, UNSW-NB15, and CIC-ISD2018 datasets. The accuracies of the proposed model on the three datasets were as high as 0.974, 0.925, and 0.953, respectively, showing superior accuracy to other comparative models. The proposed lightweight deep learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones are getting widely used, human activity recognition (HAR) tasks for recognizing personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have gotten less attention so far. And previous research for recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. Whereas physical sensors including accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status based on deep learning model by only using multimodal physical sensor data, such as an accelerometer, magnetic field and gyroscope, was proposed. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying with an acquaintance at a close distance and the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method which consists of time synchronization of multimodal data from different physical sensors, data normalization and sequence data generation was introduced. We applied the nearest interpolation to synchronize the time of collected data from different sensors. Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. Then, the sequence data became the input for CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by softmax classifier. The loss function of the model was cross entropy function and the weights of the model were randomly initialized on a normal distribution with an average of 0 and a standard deviation of 0.1. The model was trained using adaptive moment estimation (ADAM) optimization algorithm and the mini batch size was set to 128. We applied dropout to input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decreased exponentially by 0.99 at the end of each epoch training. An Android smartphone application was developed and released to collect data. We collected smartphone data for a total of 18 subjects. Using the data, the model classified accompanying and conversation by 98.74% and 98.83% accuracy each. Both the F1 score and accuracy of the model were higher than the F1 score and accuracy of the majority vote classifier, support vector machine, and deep recurrent neural network. In the future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning method that enables transfer of trained models tailored to the training data to the evaluation data that follows a different distribution. It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.

Prediction of Urban Flood Extent by LSTM Model and Logistic Regression (LSTM 모형과 로지스틱 회귀를 통한 도시 침수 범위의 예측)

  • Kim, Hyun Il;Han, Kun Yeun;Lee, Jae Yeong
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.40 no.3
    • /
    • pp.273-283
    • /
    • 2020
  • Because of climate change, the occurrence of localized and heavy rainfall is increasing. It is important to predict floods in urban areas that have suffered inundation in the past. For flood prediction, not only numerical analysis models but also machine learning-based models can be applied. The LSTM (Long Short-Term Memory) neural network used in this study is appropriate for sequence data, but it demands a lot of data. However, rainfall that causes flooding does not appear every year in a single urban basin, meaning it is difficult to collect enough data for deep learning. Therefore, in addition to the rainfall observed in the study area, the observed rainfall in another urban basin was applied in the predictive model. The LSTM neural network was used for predicting the total overflow, and the result of the SWMM (Storm Water Management Model) was applied as target data. The prediction of the inundation map was performed by using logistic regression; the independent variable was the total overflow and the dependent variable was the presence or absence of flooding in each grid. The dependent variable of logistic regression was collected through the simulation results of a two-dimensional flood model. The input data of the two-dimensional flood model were the overflow at each manhole calculated by the SWMM. According to the LSTM neural network parameters, the prediction results of total overflow were compared. Four predictive models were used in this study depending on the parameter of the LSTM. The average RMSE (Root Mean Square Error) for verification and testing was 1.4279 ㎥/s, 1.0079 ㎥/s for the four LSTM models. The minimum RMSE of the verification and testing was calculated as 1.1655 ㎥/s and 0.8797 ㎥/s. It was confirmed that the total overflow can be predicted similarly to the SWMM simulation results. The prediction of inundation extent was performed by linking the logistic regression with the results of the LSTM neural network, and the maximum area fitness was 97.33 % when more than 0.5 m depth was considered. The methodology presented in this study would be helpful in improving urban flood response based on deep learning methodology.

A study on the Linkage of Volatility in Stock Markets under Global Financial Crisis (글로벌 금융위기하에서 주식시장 변동성의 연관성에 대한 연구)

  • Lee, Kyung-Hee;Kim, Kyung-Soo
    • Management & Information Systems Review
    • /
    • v.33 no.1
    • /
    • pp.139-155
    • /
    • 2014
  • This study is to examine the linkage of volatility between changes in the stock market of India and other countries through the integration of the world economy. The results were as follows: First, autocorrelation or serial correlation did not exist in the classic RS model, but long-term memory was present in the modified RS model. Second, unit root did not exist in the unit root test for all periods, and the series were a stable explanatory power and a long-term memory with the normal conditions in the ARFIMA model. Third, in the multivariate asymmetric BEKK and VAR model before the financial crisis, it showed that there was a strong influence of the own market of Taiwan and UK in the conditional mean equation, and a strong spillover effect from Japan to India, from Taiwan to China(Korea, US), from US(Japan) to UK in one direction. In the conditional variance equation, GARCH showed a strong spillover effect that indicated the same direction as the result of ARCH coefficient of the market itself. Asymmetric effects in three home markets and between markets existed. Fourth, after the financial crisis, in the conditional mean equation, only the domestic market in Taiwan showed strong influences, and strong spillover effects existed from India to US, from Taiwan to Japan, from Korea to Germany in one direction. In the conditional variance equation, strong spillover effects were the same as the result of the pre-crisis and asymmetric effect in the domestic market in UK was present, and one-way asymmetric effect existed in Germany from Taiwan. Therefore, the results of this study presented the linkage between the volatilities of the stock market of India and other countries through the integration of the world economy, observing and confirming the asymmetric reactions and return(volatility) spillover effects between the stock market of India and other countries.

  • PDF

Comparison of Models for Stock Price Prediction Based on Keyword Search Volume According to the Social Acceptance of Artificial Intelligence (인공지능의 사회적 수용도에 따른 키워드 검색량 기반 주가예측모형 비교연구)

  • Cho, Yujung;Sohn, Kwonsang;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.103-128
    • /
    • 2021
  • Recently, investors' interest and the influence of stock-related information dissemination are being considered as significant factors that explain stock returns and volume. Besides, companies that develop, distribute, or utilize innovative new technologies such as artificial intelligence have a problem that it is difficult to accurately predict a company's future stock returns and volatility due to macro-environment and market uncertainty. Market uncertainty is recognized as an obstacle to the activation and spread of artificial intelligence technology, so research is needed to mitigate this. Hence, the purpose of this study is to propose a machine learning model that predicts the volatility of a company's stock price by using the internet search volume of artificial intelligence-related technology keywords as a measure of the interest of investors. To this end, for predicting the stock market, we using the VAR(Vector Auto Regression) and deep neural network LSTM (Long Short-Term Memory). And the stock price prediction performance using keyword search volume is compared according to the technology's social acceptance stage. In addition, we also conduct the analysis of sub-technology of artificial intelligence technology to examine the change in the search volume of detailed technology keywords according to the technology acceptance stage and the effect of interest in specific technology on the stock market forecast. To this end, in this study, the words artificial intelligence, deep learning, machine learning were selected as keywords. Next, we investigated how many keywords each week appeared in online documents for five years from January 1, 2015, to December 31, 2019. The stock price and transaction volume data of KOSDAQ listed companies were also collected and used for analysis. As a result, we found that the keyword search volume for artificial intelligence technology increased as the social acceptance of artificial intelligence technology increased. In particular, starting from AlphaGo Shock, the keyword search volume for artificial intelligence itself and detailed technologies such as machine learning and deep learning appeared to increase. Also, the keyword search volume for artificial intelligence technology increases as the social acceptance stage progresses. It showed high accuracy, and it was confirmed that the acceptance stages showing the best prediction performance were different for each keyword. As a result of stock price prediction based on keyword search volume for each social acceptance stage of artificial intelligence technologies classified in this study, the awareness stage's prediction accuracy was found to be the highest. The prediction accuracy was different according to the keywords used in the stock price prediction model for each social acceptance stage. Therefore, when constructing a stock price prediction model using technology keywords, it is necessary to consider social acceptance of the technology and sub-technology classification. The results of this study provide the following implications. First, to predict the return on investment for companies based on innovative technology, it is most important to capture the recognition stage in which public interest rapidly increases in social acceptance of the technology. Second, the change in keyword search volume and the accuracy of the prediction model varies according to the social acceptance of technology should be considered in developing a Decision Support System for investment such as the big data-based Robo-advisor recently introduced by the financial sector.

LSTM Based Prediction of Ocean Mixed Layer Temperature Using Meteorological Data (기상 데이터를 활용한 LSTM 기반의 해양 혼합층 수온 예측)

  • Ko, Kwan-Seob;Kim, Young-Won;Byeon, Seong-Hyeon;Lee, Soo-Jin
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.3
    • /
    • pp.603-614
    • /
    • 2021
  • Recently, the surface temperature in the seas around Korea has been continuously rising. This temperature rise causes changes in fishery resources and affects leisure activities such as fishing. In particular, high temperatures lead to the occurrence of red tides, causing severe damage to ocean industries such as aquaculture. Meanwhile, changes in sea temperature are closely related to military operation to detect submarines. This is because the degree of diffraction, refraction, or reflection of sound waves used to detect submarines varies depending on the ocean mixed layer. Currently, research on the prediction of changes in sea water temperature is being actively conducted. However, existing research is focused on predicting only the surface temperature of the ocean, so it is difficult to identify fishery resources according to depth and apply them to military operations such as submarine detection. Therefore, in this study, we predicted the temperature of the ocean mixed layer at a depth of 38m by using temperature data for each water depth in the upper mixed layer and meteorological data such as temperature, atmospheric pressure, and sunlight that are related to the surface temperature. The data used are meteorological data and sea temperature data by water depth observed from 2016 to 2020 at the IEODO Ocean Research Station. In order to increase the accuracy and efficiency of prediction, LSTM (Long Short-Term Memory), which is known to be suitable for time series data among deep learning techniques, was used. As a result of the experiment, in the daily prediction, the RMSE (Root Mean Square Error) of the model using temperature, atmospheric pressure, and sunlight data together was 0.473. On the other hand, the RMSE of the model using only the surface temperature was 0.631. These results confirm that the model using meteorological data together shows better performance in predicting the temperature of the upper ocean mixed layer.

Application of deep learning method for decision making support of dam release operation (댐 방류 의사결정지원을 위한 딥러닝 기법의 적용성 평가)

  • Jung, Sungho;Le, Xuan Hien;Kim, Yeonsu;Choi, Hyungu;Lee, Giha
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.spc1
    • /
    • pp.1095-1105
    • /
    • 2021
  • The advancement of dam operation is further required due to the upcoming rainy season, typhoons, or torrential rains. Besides, physical models based on specific rules may sometimes have limitations in controlling the release discharge of dam due to inherent uncertainty and complex factors. This study aims to forecast the water level of the nearest station to the dam multi-timestep-ahead and evaluate the availability when it makes a decision for a release discharge of dam based on LSTM (Long Short-Term Memory) of deep learning. The LSTM model was trained and tested on eight data sets with a 1-hour temporal resolution, including primary data used in the dam operation and downstream water level station data about 13 years (2009~2021). The trained model forecasted the water level time series divided by the six lead times: 1, 3, 6, 9, 12, 18-hours, and compared and analyzed with the observed data. As a result, the prediction results of the 1-hour ahead exhibited the best performance for all cases with an average accuracy of MAE of 0.01m, RMSE of 0.015 m, and NSE of 0.99, respectively. In addition, as the lead time increases, the predictive performance of the model tends to decrease slightly. The model may similarly estimate and reliably predicts the temporal pattern of the observed water level. Thus, it is judged that the LSTM model could produce predictive data by extracting the characteristics of complex hydrological non-linear data and can be used to determine the amount of release discharge from the dam when simulating the operation of the dam.