• Title/Abstract/Keywords: SampleRNN

6 search results (processing time: 0.017 s)

A Deep Learning System for Emotional Cat Sound Classification and Generation

  • 심주용;임성기;김종국
    • 정보처리학회 논문지 / Vol. 13, No. 10 / pp.492-496 / 2024
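  • Companion animals, and cats in particular, are known to express emotions through a variety of sounds when interacting with humans. A cat's sounds reflect its emotional state, and understanding and interpreting them is an important factor in communicating more smoothly with companion animals. With recent advances in artificial intelligence, research on emotion recognition has been active, and audio data analysis using deep learning models has drawn particular attention. Against this background, this study aims to develop a deep learning system that classifies and generates cat sounds by emotion. The classification model is trained to classify cat sounds accurately by emotion, and the sound generation model is designed to generate cat sounds that express a specific emotion using deep learning techniques such as SampleRNN. Finally, the two trained models are integrated into a proposed system that records cat sounds, reports their classification by emotion, and generates cat sounds on user request.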

Deep Learning Music Genre Automatic Classification Voting System Using Softmax

  • 배준;김장영
    • 한국정보통신학회논문지 / Vol. 23, No. 1 / pp.27-32 / 2019
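  • Research on implementing song classification, one of the remarkable abilities of humans, with deep learning algorithms includes unimodal models that use a single type of data, multimodal models, and multimodal approaches that use music videos. This study proposes a system that splits a song's spectrogram into short samples, analyzes each with a CNN, and votes on the results, and it obtained better results. Among deep learning algorithms, the CNN outperformed the RNN in music genre classification, and performance improved when the CNN and RNN were applied together. The system that splits music samples and votes on the individual CNN results outperformed previous models, and the model with an added Softmax layer performed best. Amid the explosive growth of digital media and numerous streaming services, the need for automatic music genre classification is steadily increasing. Future research will need to lower the proportion of unclassified songs and develop an algorithm for assigning genres to the songs that ultimately remain unclassified.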
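
The segment-voting idea in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact voting rule, so both variants below are assumptions: `hard_vote` takes a majority over per-segment argmax labels, and `soft_vote` averages per-segment softmax probabilities. The logits are made-up stand-ins for per-segment CNN outputs, and all function names are hypothetical.

```python
import math
from collections import Counter


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_vote(segment_logits):
    """Majority vote over the argmax label of each segment."""
    labels = [max(range(len(l)), key=l.__getitem__) for l in segment_logits]
    return Counter(labels).most_common(1)[0][0]


def soft_vote(segment_logits):
    """Average the softmax probabilities of all segments, then take the argmax."""
    probs = [softmax(l) for l in segment_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__), mean


# Made-up logits for three segments of one song over three genres.
segments = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.8, 1.7, 0.2]]
print(hard_vote(segments))     # majority of per-segment labels
print(soft_vote(segments)[0])  # label with the highest mean probability
```

On this toy input the two rules disagree: two of the three segments narrowly prefer genre 0, but the one confident segment pulls the averaged probabilities toward genre 1. This is one plausible reason a softmax-based vote could behave differently from a plain majority vote.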

Multi-objective optimization of tapered tubes for crashworthiness by surrogate methodologies

  • Asgari, Masoud;Babaee, Alireza;Jamshidi, Mohammadamin
    • Steel and Composite Structures / Vol. 27, No. 4 / pp.427-438 / 2018
  • In this paper, the single- and multi-objective optimization of thin-walled conical tubes with different types of indentations under axial impact is investigated using surrogate models, also called metamodels. The geometry of tapered thin-walled tubes is studied in order to achieve maximum specific energy absorption (SEA) and minimum peak crushing force (PCF). The height, radius, thickness, and taper angle of the tube, and the radius of indentation, are considered as design variables. Based on the design of experiments (DOE) method, the generated sample points are computed using an explicit finite element code. Different surrogate models, including Kriging, Feed Forward Neural Network (FNN), Radial Basis Neural Network (RNN), and Response Surface Modelling (RSM), are compared to evaluate the suitability of such models. A comparison study between surrogate models and an exploration of indentation shapes are provided. The obtained results show that the RNN method has the minimum mean squared error (MSE) in training points compared to the other methods. Meanwhile, optimization based on surrogate models with lower values of MSE does not provide optimum results. The RNN method demonstrates a lower crashworthiness performance (with a lower value of 125.7% for SEA and a higher value of 56.8% for PCF) in comparison to RSM with an error order of $10^{-3}$. The SEA values can be increased by 17.6% and PCF values can be decreased by 24.63% by different types of indentation. In a specific geometry, higher SEA and lower PCF require triangular and circular shapes of indentation, respectively.

Application of ML algorithms to predict the effective fracture toughness of several types of concrete

  • Ibrahim Albaijan;Hanan Samadi;Arsalan Mahmoodzadeh;Hawkar Hashim Ibrahim;Nejib Ghazouani
    • Computers and Concrete / Vol. 34, No. 2 / pp.247-265 / 2024
  • Measuring the fracture toughness of concrete in laboratory settings is challenging due to various factors, such as complex sample preparation procedures, the requirement for precise instruments, potential sample failure, and the brittleness of the samples. Therefore, there is an urgent need to develop innovative and more effective tools to overcome these limitations. Supervised learning methods offer promising solutions. This study introduces seven machine learning algorithms for predicting concrete's effective fracture toughness (K-eff). The models were trained using 560 datasets obtained from the central straight notched Brazilian disc (CSNBD) test. The concrete samples used in the experiments contained micro silica and powdered stone, which are commonly used additives in the construction industry. The study considered six input parameters that affect concrete's K-eff: concrete type, sample diameter, sample thickness, crack length, force, and angle of initial crack. All the algorithms demonstrated high accuracy on both the training and testing datasets, with $R^2$ values ranging from 0.9456 to 0.9999 and root mean squared error (RMSE) values ranging from 0.000004 to 0.009287. After evaluating their performance, the gated recurrent unit (GRU) algorithm showed the highest predictive accuracy. The ranking of the applied models, from highest to lowest performance in predicting the K-eff of concrete, was as follows: GRU, LSTM, RNN, SFL, ELM, LSSVM, and GEP. In conclusion, it is recommended to use supervised learning models, specifically GRU, for precise estimation of concrete's K-eff. This approach allows engineers to save significant time and costs associated with the CSNBD test. This research contributes to the field by introducing a reliable tool for accurately predicting the K-eff of concrete, enabling efficient decision-making in various engineering applications.
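
For readers unfamiliar with the two accuracy measures quoted in this abstract, the following is a minimal sketch of how the coefficient of determination and RMSE are commonly computed. The function names and the toy numbers are illustrative assumptions and are unrelated to the paper's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values only; not taken from the CSNBD experiments.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
print(rmse(measured, predicted))
print(r2_score(measured, predicted))
```

A perfect predictor gives an RMSE of 0 and a coefficient of determination of 1, which is why values near 0.9999 and 0.000004 indicate a very close fit.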

LSTM-based Deep Learning for Time Series Forecasting: The Case of Corporate Credit Score Prediction

  • 이현상;오세환
    • 한국정보시스템학회지:정보시스템연구 / Vol. 29, No. 1 / pp.241-265 / 2020
  • Purpose: Various machine learning techniques have been used to predict corporate credit. However, previous research does not utilize time series input features and is limited in its prediction timing. Furthermore, in corporate bond credit rating forecasts, the sample of companies is limited because only large companies receive corporate bond credit ratings. To address these limitations of prior research, this study implements a predictive model with more sample companies that can adjust the forecasting point relative to the present by using credit score information and corporate information as time series. Design/methodology/approach: To implement this forecasting model, this study uses a sample of 2,191 companies with KIS credit scores for the 18 years from 2000 to 2017. To improve the performance of the predictive model, various financial and non-financial features are applied as time series input variables through a sliding window technique. In addition, to increase the validity of the analysis results, this research tests both various machine learning techniques that have traditionally been used and the deep learning technique that has been actively researched of late. Findings: The RNN-based stateful LSTM model shows good performance in credit rating prediction. By extending the forecasting time point, we find how the performance of the predictive model changes over time and evaluate the feature groups in the short and long terms. In comparison with other studies, the results of five-class prediction through label reclassification show relatively good performance. In addition, about 90% accuracy is found in the bad credit forecasts.
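
The sliding window technique mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' implementation: the function name, the `window` and `horizon` parameters, and the toy score series are assumptions. `horizon` stands in for the adjustable forecasting point, i.e. how many steps after the window the target lies.

```python
def sliding_windows(series, window, horizon=1):
    """Split a series into (inputs, targets) pairs.

    Each input holds `window` consecutive values; its target is the value
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        end = start + window
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Toy yearly scores for one hypothetical company.
scores = [1, 2, 3, 4, 5, 6]
x, y = sliding_windows(scores, window=3, horizon=1)
print(x)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(y)  # [4, 5, 6]
```

Increasing `horizon` moves the target further into the future and yields fewer training pairs from the same series, which is the usual trade-off when extending the forecasting time point.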

A Method for Generating Malware Countermeasure Samples Based on Pixel Attention Mechanism

  • Xiangyu Ma;Yuntao Zhao;Yongxin Feng;Yutao Hu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 2 / pp.456-477 / 2024
  • With the rapid development of information technology, the Internet faces serious security problems. Studies have shown that malware has become a primary means of attacking the Internet. Adversarial samples have therefore become a vital breakthrough point for studying malware. By studying adversarial samples, we can gain insights into the behavior and characteristics of malware, evaluate the performance of existing detectors in the face of deceptive samples, and help to discover vulnerabilities and improve detection methods for better performance. However, existing adversarial sample generation methods still fall short in escape effectiveness and mobility. For instance, researchers have attempted to incorporate perturbation methods such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) into adversarial samples to obfuscate detectors, but these methods are only effective in specific environments and yield limited evasion effectiveness. To solve these problems, this paper proposes a malware adversarial sample generation method (PixGAN) based on the pixel attention mechanism, which aims to improve the escape effect and mobility of adversarial samples. The method transforms malware into grey-scale images and introduces the pixel attention mechanism into the Deep Convolution Generative Adversarial Networks (DCGAN) model to weight the critical pixels in the grey-scale map, which improves the modeling ability of the generator and discriminator and thus enhances the escape effect and mobility of the adversarial samples. The escape rate (ASR) is used as an evaluation index of the quality of the adversarial samples. The experimental results show that the adversarial samples generated by PixGAN achieve escape rates of 97%, 94%, 35%, 39%, and 43% on the Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Convolutional Neural Network and Recurrent Neural Network (CNN_RNN), and Convolutional Neural Network and Long Short Term Memory (CNN_LSTM) algorithmic detectors, respectively.