• Title/Summary/Keyword: SampleRNN

Search Result 6, Processing Time 0.018 seconds

A Deep Learning System for Emotional Cat Sound Classification and Generation (감정별 고양이 소리 분류 및 생성 딥러닝 시스템)

  • Joo Yong Shim;SungKi Lim;Jong-Kook Kim
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.10
    • /
    • pp.492-496
    • /
    • 2024
  • Cats are known to express their emotions through a variety of vocalizations during interactions. These sounds reflect their emotional states, making the understanding and interpretation of these sounds crucial for more effective communication. Recent advancements in artificial intelligence has introduced research related to emotion recognition, particularly focusing on the analysis of voice data using deep learning models. Building on this background, the study aims to develop a deep learning system that classifies and generates cat sounds based on their emotional content. The classification model is trained to accurately categorize cat vocalizations by emotion. The sound generation model, which uses deep learning based models such as SampleRNN, is designed to produce cat sounds that reflect specific emotional states. The study finally proposes an integrated system that takes recorded cat vocalizations, classify them by emotion, and generate cat sounds based on user requirements.

Deep Learning Music genre automatic classification voting system using Softmax (소프트맥스를 이용한 딥러닝 음악장르 자동구분 투표 시스템)

  • Bae, June;Kim, Jangyoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.1
    • /
    • pp.27-32
    • /
    • 2019
  • Research that implements the classification process through Deep Learning algorithm, one of the outstanding human abilities, includes a unimodal model, a multi-modal model, and a multi-modal method using music videos. In this study, the results were better by suggesting a system to analyze each song's spectrum into short samples and vote for the results. Among Deep Learning algorithms, CNN showed superior performance in the category of music genre compared to RNN, and improved performance when CNN and RNN were applied together. The system of voting for each CNN result by Deep Learning a short sample of music showed better results than the previous model and the model with Softmax layer added to the model performed best. The need for the explosive growth of digital media and the automatic classification of music genres in numerous streaming services is increasing. Future research will need to reduce the proportion of undifferentiated songs and develop algorithms for the last category classification of undivided songs.

Multi-objective optimization of tapered tubes for crashworthiness by surrogate methodologies

  • Asgari, Masoud;Babaee, Alireza;Jamshidi, Mohammadamin
    • Steel and Composite Structures
    • /
    • v.27 no.4
    • /
    • pp.427-438
    • /
    • 2018
  • In this paper, the single and multi-objective optimization of thin-walled conical tubes with different types of indentations under axial impact has been investigated using surrogate models called metamodels. The geometry of tapered thin-walled tubes has been studied in order to achieve maximum specific energy absorption (SEA) and minimum peak crushing force (PCF). The height, radius, thickness, tapered angle of the tube, and the radius of indentation have been considered as design variables. Based on the design of experiments (DOE) method, the generated sample points are computed using the explicit finite element code. Different surrogate models including Kriging, Feed Forward Neural Network (FNN), Radial Basis Neural Network (RNN), and Response Surface Modelling (RSM) comprised to evaluate the appropriation of such models. The comparison study between surrogate models and the exploration of indentation shapes have been provided. The obtained results show that the RNN method has the minimum mean squared error (MSE) in training points compared to the other methods. Meanwhile, optimization based on surrogate models with lower values of MSE does not provide optimum results. The RNN method demonstrates a lower crashworthiness performance (with a lower value of 125.7% for SEA and a higher value of 56.8% for PCF) in comparison to RSM with an error order of $10^{-3}$. The SEA values can be increased by 17.6% and PCF values can be decreased by 24.63% by different types of indentation. In a specific geometry, higher SEA and lower PCF require triangular and circular shapes of indentation, respectively.

Application of ML algorithms to predict the effective fracture toughness of several types of concret

  • Ibrahim Albaijan;Hanan Samadi;Arsalan Mahmoodzadeh;Hawkar Hashim Ibrahim;Nejib Ghazouani
    • Computers and Concrete
    • /
    • v.34 no.2
    • /
    • pp.247-265
    • /
    • 2024
  • Measuring the fracture toughness of concrete in laboratory settings is challenging due to various factors, such as complex sample preparation procedures, the requirement for precise instruments, potential sample failure, and the brittleness of the samples. Therefore, there is an urgent need to develop innovative and more effective tools to overcome these limitations. Supervised learning methods offer promising solutions. This study introduces seven machine learning algorithms for predicting concrete's effective fracture toughness (K-eff). The models were trained using 560 datasets obtained from the central straight notched Brazilian disc (CSNBD) test. The concrete samples used in the experiments contained micro silica and powdered stone, which are commonly used additives in the construction industry. The study considered six input parameters that affect concrete's K-eff, including concrete type, sample diameter, sample thickness, crack length, force, and angle of initial crack. All the algorithms demonstrated high accuracy on both the training and testing datasets, with R2 values ranging from 0.9456 to 0.9999 and root mean squared error (RMSE) values ranging from 0.000004 to 0.009287. After evaluating their performance, the gated recurrent unit (GRU) algorithm showed the highest predictive accuracy. The ranking of the applied models, from highest to lowest performance in predicting the K-eff of concrete, was as follows: GRU, LSTM, RNN, SFL, ELM, LSSVM, and GEP. In conclusion, it is recommended to use supervised learning models, specifically GRU, for precise estimation of concrete's K-eff. This approach allows engineers to save significant time and costs associated with the CSNBD test. This research contributes to the field by introducing a reliable tool for accurately predicting the K-eff of concrete, enabling efficient decision-making in various engineering applications.

LSTM-based Deep Learning for Time Series Forecasting: The Case of Corporate Credit Score Prediction (시계열 예측을 위한 LSTM 기반 딥러닝: 기업 신용평점 예측 사례)

  • Lee, Hyun-Sang;Oh, Sehwan
    • The Journal of Information Systems
    • /
    • v.29 no.1
    • /
    • pp.241-265
    • /
    • 2020
  • Purpose Various machine learning techniques are used to implement for predicting corporate credit. However, previous research doesn't utilize time series input features and has a limited prediction timing. Furthermore, in the case of corporate bond credit rating forecast, corporate sample is limited because only large companies are selected for corporate bond credit rating. To address limitations of prior research, this study attempts to implement a predictive model with more sample companies, which can adjust the forecasting point at the present time by using the credit score information and corporate information in time series. Design/methodology/approach To implement this forecasting model, this study uses the sample of 2,191 companies with KIS credit scores for 18 years from 2000 to 2017. For improving the performance of the predictive model, various financial and non-financial features are applied as input variables in a time series through a sliding window technique. In addition, this research also tests various machine learning techniques that were traditionally used to increase the validity of analysis results, and the deep learning technique that is being actively researched of late. Findings RNN-based stateful LSTM model shows good performance in credit rating prediction. By extending the forecasting time point, we find how the performance of the predictive model changes over time and evaluate the feature groups in the short and long terms. In comparison with other studies, the results of 5 classification prediction through label reclassification show good performance relatively. In addition, about 90% accuracy is found in the bad credit forecasts.

A Method for Generating Malware Countermeasure Samples Based on Pixel Attention Mechanism

  • Xiangyu Ma;Yuntao Zhao;Yongxin Feng;Yutao Hu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.2
    • /
    • pp.456-477
    • /
    • 2024
  • With information technology's rapid development, the Internet faces serious security problems. Studies have shown that malware has become a primary means of attacking the Internet. Therefore, adversarial samples have become a vital breakthrough point for studying malware. By studying adversarial samples, we can gain insights into the behavior and characteristics of malware, evaluate the performance of existing detectors in the face of deceptive samples, and help to discover vulnerabilities and improve detection methods for better performance. However, existing adversarial sample generation methods still need help regarding escape effectiveness and mobility. For instance, researchers have attempted to incorporate perturbation methods like Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and others into adversarial samples to obfuscate detectors. However, these methods are only effective in specific environments and yield limited evasion effectiveness. To solve the above problems, this paper proposes a malware adversarial sample generation method (PixGAN) based on the pixel attention mechanism, which aims to improve adversarial samples' escape effect and mobility. The method transforms malware into grey-scale images and introduces the pixel attention mechanism in the Deep Convolution Generative Adversarial Networks (DCGAN) model to weigh the critical pixels in the grey-scale map, which improves the modeling ability of the generator and discriminator, thus enhancing the escape effect and mobility of the adversarial samples. The escape rate (ASR) is used as an evaluation index of the quality of the adversarial samples. The experimental results show that the adversarial samples generated by PixGAN achieve escape rates of 97%, 94%, 35%, 39%, and 43% on the Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Convolutional Neural Network and Recurrent Neural Network (CNN_RNN), and Convolutional Neural Network and Long Short Term Memory (CNN_LSTM) algorithmic detectors, respectively.