• Title/Summary/Keyword: Long Short-Term Memory (LSTM) Network

Search Result 279, Processing Time 0.029 seconds

Prediction of Music Generation on Time Series Using Bi-LSTM Model (Bi-LSTM 모델을 이용한 음악 생성 시계열 예측)

  • Kwangjin, Kim;Chilwoo, Lee
    • Smart Media Journal
    • /
    • v.11 no.10
    • /
    • pp.65-75
    • /
    • 2022
  • Deep learning is used as a creative tool that could overcome the limitations of existing analysis models and generate various types of results such as text, image, and music. In this paper, we propose a method necessary to preprocess audio data using the Niko's MIDI Pack sound source file as a data set and to generate music using Bi-LSTM. Based on the generated root note, the hidden layers are composed of multi-layers to create a new note suitable for the musical composition, and an attention mechanism is applied to the output gate of the decoder to apply the weight of the factors that affect the data input from the encoder. Setting variables such as loss function and optimization method are applied as parameters for improving the LSTM model. The proposed model is a multi-channel Bi-LSTM with attention that applies notes pitch generated from separating treble clef and bass clef, length of notes, rests, length of rests, and chords to improve the efficiency and prediction of MIDI deep learning process. The results of the learning generate a sound that matches the development of music scale distinct from noise, and we are aiming to contribute to generating a harmonistic stable music.

Forecasting Baltic Dry Index by Implementing Time-Series Decomposition and Data Augmentation Techniques (시계열 분해 및 데이터 증강 기법 활용 건화물운임지수 예측)

  • Han, Min Soo;Yu, Song Jin
    • Journal of Korean Society for Quality Management
    • /
    • v.50 no.4
    • /
    • pp.701-716
    • /
    • 2022
  • Purpose: This study aims to predict the dry cargo transportation market economy. The subject of this study is the BDI (Baltic Dry Index) time-series, an index representing the dry cargo transport market. Methods: In order to increase the accuracy of the BDI time-series, we have pre-processed the original time-series via time-series decomposition and data augmentation techniques and have used them for ANN learning. The ANN algorithms used are Multi-Layer Perceptron (MLP), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) to compare and analyze the case of learning and predicting by applying time-series decomposition and data augmentation techniques. The forecast period aims to make short-term predictions at the time of t+1. The period to be studied is from '22. 01. 07 to '22. 08. 26. Results: Only for the case of the MAPE (Mean Absolute Percentage Error) indicator, all ANN models used in the research has resulted in higher accuracy (1.422% on average) in multivariate prediction. Although it is not a remarkable improvement in prediction accuracy compared to uni-variate prediction results, it can be said that the improvement in ANN prediction performance has been achieved by utilizing time-series decomposition and data augmentation techniques that were significant and targeted throughout this study. Conclusion: Nevertheless, due to the nature of ANN, additional performance improvements can be expected according to the adjustment of the hyper-parameter. Therefore, it is necessary to try various applications of multiple learning algorithms and ANN optimization techniques. Such an approach would help solve problems with a small number of available data, such as the rapidly changing business environment or the current shipping market.

A Data-driven Classifier for Motion Detection of Soldiers on the Battlefield using Recurrent Architectures and Hyperparameter Optimization (순환 아키텍쳐 및 하이퍼파라미터 최적화를 이용한 데이터 기반 군사 동작 판별 알고리즘)

  • Joonho Kim;Geonju Chae;Jaemin Park;Kyeong-Won Park
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.107-119
    • /
    • 2023
  • The technology that recognizes a soldier's motion and movement status has recently attracted large attention as a combination of wearable technology and artificial intelligence, which is expected to upend the paradigm of troop management. The accuracy of state determination should be maintained at a high-end level to make sure of the expected vital functions both in a training situation; an evaluation and solution provision for each individual's motion, and in a combat situation; overall enhancement in managing troops. However, when input data is given as a timer series or sequence, existing feedforward networks would show overt limitations in maximizing classification performance. Since human behavior data (3-axis accelerations and 3-axis angular velocities) handled for military motion recognition requires the process of analyzing its time-dependent characteristics, this study proposes a high-performance data-driven classifier which utilizes the long-short term memory to identify the order dependence of acquired data, learning to classify eight representative military operations (Sitting, Standing, Walking, Running, Ascending, Descending, Low Crawl, and High Crawl). Since the accuracy is highly dependent on a network's learning conditions and variables, manual adjustment may neither be cost-effective nor guarantee optimal results during learning. Therefore, in this study, we optimized hyperparameters using Bayesian optimization for maximized generalization performance. As a result, the final architecture could reduce the error rate by 62.56% compared to the existing network with a similar number of learnable parameters, with the final accuracy of 98.39% for various military operations.

Symbolizing Numbers to Improve Neural Machine Translation (숫자 기호화를 통한 신경기계번역 성능 향상)

  • Kang, Cheongwoong;Ro, Youngheon;Kim, Jisu;Choi, Heeyoul
    • Journal of Digital Contents Society
    • /
    • v.19 no.6
    • /
    • pp.1161-1167
    • /
    • 2018
  • The development of machine learning has enabled machines to perform delicate tasks that only humans could do, and thus many companies have introduced machine learning based translators. Existing translators have good performances but they have problems in number translation. The translators often mistranslate numbers when the input sentence includes a large number. Furthermore, the output sentence structure completely changes even if only one number in the input sentence changes. In this paper, first, we optimized a neural machine translation model architecture that uses bidirectional RNN, LSTM, and the attention mechanism through data cleansing and changing the dictionary size. Then, we implemented a number-processing algorithm specialized in number translation and applied it to the neural machine translation model to solve the problems above. The paper includes the data cleansing method, an optimal dictionary size and the number-processing algorithm, as well as experiment results for translation performance based on the BLEU score.

A Method for Generating Malware Countermeasure Samples Based on Pixel Attention Mechanism

  • Xiangyu Ma;Yuntao Zhao;Yongxin Feng;Yutao Hu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.2
    • /
    • pp.456-477
    • /
    • 2024
  • With information technology's rapid development, the Internet faces serious security problems. Studies have shown that malware has become a primary means of attacking the Internet. Therefore, adversarial samples have become a vital breakthrough point for studying malware. By studying adversarial samples, we can gain insights into the behavior and characteristics of malware, evaluate the performance of existing detectors in the face of deceptive samples, and help to discover vulnerabilities and improve detection methods for better performance. However, existing adversarial sample generation methods still need help regarding escape effectiveness and mobility. For instance, researchers have attempted to incorporate perturbation methods like Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and others into adversarial samples to obfuscate detectors. However, these methods are only effective in specific environments and yield limited evasion effectiveness. To solve the above problems, this paper proposes a malware adversarial sample generation method (PixGAN) based on the pixel attention mechanism, which aims to improve adversarial samples' escape effect and mobility. The method transforms malware into grey-scale images and introduces the pixel attention mechanism in the Deep Convolution Generative Adversarial Networks (DCGAN) model to weigh the critical pixels in the grey-scale map, which improves the modeling ability of the generator and discriminator, thus enhancing the escape effect and mobility of the adversarial samples. The escape rate (ASR) is used as an evaluation index of the quality of the adversarial samples. The experimental results show that the adversarial samples generated by PixGAN achieve escape rates of 97%, 94%, 35%, 39%, and 43% on the Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Convolutional Neural Network and Recurrent Neural Network (CNN_RNN), and Convolutional Neural Network and Long Short Term Memory (CNN_LSTM) algorithmic detectors, respectively.

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones are getting widely used, human activity recognition (HAR) tasks for recognizing personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have gotten less attention so far. And previous research for recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. Whereas physical sensors including accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status based on deep learning model by only using multimodal physical sensor data, such as an accelerometer, magnetic field and gyroscope, was proposed. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying with an acquaintance at a close distance and the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method which consists of time synchronization of multimodal data from different physical sensors, data normalization and sequence data generation was introduced. We applied the nearest interpolation to synchronize the time of collected data from different sensors. Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. Then, the sequence data became the input for CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by softmax classifier. The loss function of the model was cross entropy function and the weights of the model were randomly initialized on a normal distribution with an average of 0 and a standard deviation of 0.1. The model was trained using adaptive moment estimation (ADAM) optimization algorithm and the mini batch size was set to 128. We applied dropout to input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decreased exponentially by 0.99 at the end of each epoch training. An Android smartphone application was developed and released to collect data. We collected smartphone data for a total of 18 subjects. Using the data, the model classified accompanying and conversation by 98.74% and 98.83% accuracy each. Both the F1 score and accuracy of the model were higher than the F1 score and accuracy of the majority vote classifier, support vector machine, and deep recurrent neural network. In the future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning method that enables transfer of trained models tailored to the training data to the evaluation data that follows a different distribution. It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.

Restoration of damaged speech files using deep neural networks (심층 신경망을 활용한 손상된 음성파일 복원 자동화)

  • Heo, Hee-Soo;So, Byung-Min;Yang, IL-Ho;Yoon, Sung-Hyun;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.36 no.2
    • /
    • pp.136-143
    • /
    • 2017
  • In this paper, we propose a method for restoring damaged audio files using deep neural network. It is different from the conventional file carving based restoration. The purpose of our method is to infer lost information which can not be restored by existing techniques such as the file carving. We have devised methods that can automate the tasks which are essential for the restoring but are inappropriate for humans. As a result of this study it has been shown that it is possible to restore the damaged files, which the conventional file carving method could not, by using tasks such as speech or nonspeech decision and speech encoder recognizer using a deep neural network.

A New Vessel Path Prediction Method Based on Anticipation of Acceleration of Vessel (가속도 예측 기반 새로운 선박 이동 경로 예측 방법)

  • Kim, Jonghee;Jung, Chanho;Kang, Dokeun;Lee, Chang Jin
    • Journal of IKEEE
    • /
    • v.24 no.4
    • /
    • pp.1176-1179
    • /
    • 2020
  • Vessel path prediction methods generally predict the latitude and longitude of a future location directly. However, in the case of direct prediction, errors could be large since the possible output range is too broad. In addition, error accumulation could occur since recurrent neural networks-based methods employ previous predicted data to forecast future data. In this paper, we propose a vessel path prediction method that does not directly predict the longitude and latitude. Instead, the proposed method predicts the acceleration of the vessel. Then the acceleration is employed to generate the velocity and direction, and the values decide the longitude and latitude of the future location. In the experiment, we show that the proposed method makes smaller errors than the direct prediction method, while both methods employ the same model.

A Study on Deep Learning Model for Discrimination of Illegal Financial Advertisements on the Internet

  • Kil-Sang Yoo; Jin-Hee Jang;Seong-Ju Kim;Kwang-Yong Gim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.8
    • /
    • pp.21-30
    • /
    • 2023
  • The study proposes a model that utilizes Python-based deep learning text classification techniques to detect the legality of illegal financial advertising posts on the internet. These posts aim to promote unlawful financial activities, including the trading of bank accounts, credit card fraud, cashing out through mobile payments, and the sale of personal credit information. Despite the efforts of financial regulatory authorities, the prevalence of illegal financial activities persists. By applying this proposed model, the intention is to aid in identifying and detecting illicit content in internet-based illegal financial advertisining, thus contributing to the ongoing efforts to combat such activities. The study utilizes convolutional neural networks(CNN) and recurrent neural networks(RNN, LSTM, GRU), which are commonly used text classification techniques. The raw data for the model is based on manually confirmed regulatory judgments. By adjusting the hyperparameters of the Korean natural language processing and deep learning models, the study has achieved an optimized model with the best performance. This research holds significant meaning as it presents a deep learning model for discerning internet illegal financial advertising, which has not been previously explored. Additionally, with an accuracy range of 91.3% to 93.4% in a deep learning model, there is a hopeful anticipation for the practical application of this model in the task of detecting illicit financial advertisements, ultimately contributing to the eradication of such unlawful financial advertisements.

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.1
    • /
    • pp.95-108
    • /
    • 2017
  • Recently, AlphaGo which is Bakuk (Go) artificial intelligence program by Google DeepMind, had a huge victory against Lee Sedol. Many people thought that machines would not be able to win a man in Go games because the number of paths to make a one move is more than the number of atoms in the universe unlike chess, but the result was the opposite to what people predicted. After the match, artificial intelligence technology was focused as a core technology of the fourth industrial revolution and attracted attentions from various application domains. Especially, deep learning technique have been attracted as a core artificial intelligence technology used in the AlphaGo algorithm. The deep learning technique is already being applied to many problems. Especially, it shows good performance in image recognition field. In addition, it shows good performance in high dimensional data area such as voice, image and natural language, which was difficult to get good performance using existing machine learning techniques. However, in contrast, it is difficult to find deep leaning researches on traditional business data and structured data analysis. In this study, we tried to find out whether the deep learning techniques have been studied so far can be used not only for the recognition of high dimensional data but also for the binary classification problem of traditional business data analysis such as customer churn analysis, marketing response prediction, and default prediction. And we compare the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data in the paper is the telemarketing response data of a bank in Portugal. It has input variables such as age, occupation, loan status, and the number of previous telemarketing and has a binary target variable that records whether the customer intends to open an account or not. In this study, to evaluate the possibility of utilization of deep learning algorithms and techniques in binary classification problem, we compared the performance of various models using CNN, LSTM algorithm and dropout, which are widely used algorithms and techniques in deep learning, with that of MLP models which is a traditional artificial neural network model. However, since all the network design alternatives can not be tested due to the nature of the artificial neural network, the experiment was conducted based on restricted settings on the number of hidden layers, the number of neurons in the hidden layer, the number of output data (filters), and the application conditions of the dropout technique. The F1 Score was used to evaluate the performance of models to show how well the models work to classify the interesting class instead of the overall accuracy. The detail methods for applying each deep learning technique in the experiment is as follows. The CNN algorithm is a method that reads adjacent values from a specific value and recognizes the features, but it does not matter how close the distance of each business data field is because each field is usually independent. In this experiment, we set the filter size of the CNN algorithm as the number of fields to learn the whole characteristics of the data at once, and added a hidden layer to make decision based on the additional features. For the model having two LSTM layers, the input direction of the second layer is put in reversed position with first layer in order to reduce the influence from the position of each field. In the case of the dropout technique, we set the neurons to disappear with a probability of 0.5 for each hidden layer. The experimental results show that the predicted model with the highest F1 score was the CNN model using the dropout technique, and the next best model was the MLP model with two hidden layers using the dropout technique. In this study, we were able to get some findings as the experiment had proceeded. First, models using dropout techniques have a slightly more conservative prediction than those without dropout techniques, and it generally shows better performance in classification. Second, CNN models show better classification performance than MLP models. This is interesting because it has shown good performance in binary classification problems which it rarely have been applied to, as well as in the fields where it's effectiveness has been proven. Third, the LSTM algorithm seems to be unsuitable for binary classification problems because the training time is too long compared to the performance improvement. From these results, we can confirm that some of the deep learning algorithms can be applied to solve business binary classification problems.