• Title/Summary/Keyword: Multi-Model Training

C-COMA: A Continual Reinforcement Learning Model for Dynamic Multiagent Environments (C-COMA: 동적 다중 에이전트 환경을 위한 지속적인 강화 학습 모델)

  • Jung, Kyueyeol; Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.10 no.4 / pp.143-152 / 2021
  • In many real-world applications, it is important to learn behavioral policies that allow multiple agents to cooperate organically toward common goals. In multi-agent reinforcement learning (MARL), most existing studies have adopted centralized training with decentralized execution (CTDE) as the de facto standard framework. However, CTDE methods struggle to cope with dynamic environments, in which changes not experienced during training may occur constantly in real-life situations. To cope effectively with such dynamic environments, this paper proposes a novel multi-agent reinforcement learning system, C-COMA. C-COMA is a continual learning model that assumes real deployment conditions from the beginning and continuously learns the agents' cooperative behavior policies without separating training time from execution time. We demonstrate the effectiveness and superiority of C-COMA by implementing a dynamic mini-game based on StarCraft II, a representative real-time strategy game, and conducting various experiments in this environment.

A Korean Multi-speaker Text-to-Speech System Using d-vector (d-vector를 이용한 한국어 다화자 TTS 시스템)

  • Kim, Kwang Hyeon; Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.8 no.3 / pp.469-475 / 2022
  • Training a deep learning-based single-speaker TTS model requires a speech DB of tens of hours and a great deal of training time, which makes this approach inefficient in time and cost for building multi-speaker or personalized TTS models. The voice cloning method instead uses a speaker encoder model to build the TTS model of a new speaker. The trained speaker encoder creates a speaker embedding vector representing the timbre of the new speaker from a small amount of that speaker's speech data, which was not used for training. In this paper, we propose a multi-speaker TTS system to which voice cloning is applied. The proposed TTS system consists of a speaker encoder, a synthesizer, and a vocoder. The speaker encoder applies the d-vector technique used in the speaker recognition field. The timbre of the new speaker is expressed by adding the d-vector derived from the trained speaker encoder as an input to the synthesizer. Experimental results from MOS and timbre similarity listening tests show that the proposed TTS system performs excellently.
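
The abstract above does not say how its d-vectors are computed. The following is a minimal sketch of one common formulation, assuming the speaker encoder already yields frame-level embeddings: each embedding is L2-normalized, the normalized embeddings are averaged, and the average is normalized again. The function names and the toy two-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def d_vector(frame_embeddings):
    """Average L2-normalized frame embeddings into one unit-length speaker vector.

    Assumption: the embeddings come from an already trained speaker encoder.
    """
    if not frame_embeddings:
        raise ValueError("need at least one frame embedding")
    normalized = [l2_normalize(f) for f in frame_embeddings]
    dim = len(normalized[0])
    mean = [sum(f[i] for f in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)

def cosine_similarity(a, b):
    """Cosine similarity, a usual way to compare two speaker vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

# Toy embeddings (hypothetical); real encoders emit far more dimensions.
speaker_a = d_vector([[1.0, 0.0], [0.8, 0.6]])
speaker_b = d_vector([[0.0, 1.0], [0.6, 0.8]])
similarity = cosine_similarity(speaker_a, speaker_b)
```

In a system like the one described, a vector of this kind would be passed to the synthesizer as an extra conditioning input; the sketch covers only the embedding step.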

Prediction of multipurpose dam inflow utilizing catchment attributes with LSTM and transformer models (유역정보 기반 Transformer및 LSTM을 활용한 다목적댐 일 단위 유입량 예측)

  • Kim, Hyung Ju; Song, Young Hoon; Chung, Eun Sung
    • Journal of Korea Water Resources Association / v.57 no.7 / pp.437-449 / 2024
  • Rainfall-runoff prediction studies that use deep learning while considering catchment attributes have been gaining attention. In this study, we selected two models: the Transformer model, which is suitable for large-scale data training through the self-attention mechanism, and the LSTM-based multi-state-vector sequence-to-sequence (LSTM-MSV-S2S) model with an encoder-decoder structure. These models were constructed to incorporate catchment attributes and predict the inflow of 10 multi-purpose dam watersheds in South Korea. The experimental design consisted of three training methods: Single-basin Training (ST), Pretraining (PT), and Pretraining-Finetuning (PT-FT). The input data for the models included 10 selected watershed attributes along with meteorological data. The inflow prediction performance was compared across the training methods. The results showed that the Transformer model outperformed the LSTM-MSV-S2S model when using the PT and PT-FT methods, with the PT-FT method yielding the highest performance. The LSTM-MSV-S2S model performed better than the Transformer when using the ST method but worse when using the PT and PT-FT methods. Additionally, the embedding layer activation vectors and raw catchment attributes were used to cluster watersheds and analyze whether the models learned the similarities between them. The Transformer model demonstrated improved performance among watersheds with similar activation vectors, indicating that utilizing information from other pre-trained watersheds enhances prediction performance. This study compared models and training methods to identify those suitable for each multi-purpose dam and highlighted the necessity of constructing deep learning models using the PT and PT-FT methods for domestic watersheds. Furthermore, the results confirmed that the Transformer model outperforms the LSTM-MSV-S2S model when applying the PT and PT-FT methods.

Transformer-based transfer learning and multi-task learning for improving the performance of speech emotion recognition (음성감정인식 성능 향상을 위한 트랜스포머 기반 전이학습 및 다중작업학습)

  • Park, Sunchan; Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.515-522 / 2021
  • It is hard to prepare sufficient training data for speech emotion recognition because emotion labeling is difficult. In this paper, we apply transfer learning to a transformer-based model, using large-scale speech recognition training data, to improve speech emotion recognition performance. In addition, we propose a method that utilizes context information without decoding through multi-task learning with speech recognition. In speech emotion recognition experiments on the IEMOCAP dataset, our model achieves a weighted accuracy of 70.6% and an unweighted accuracy of 71.6%, which shows that the proposed method is effective in improving speech emotion recognition performance.
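
The abstract above reports weighted and unweighted accuracy without defining them. The sketch below follows the convention usual in speech emotion recognition: weighted accuracy is the overall fraction of correct predictions, and unweighted accuracy is the mean of the per-class recalls. Whether the paper uses exactly these definitions is an assumption, and the labels are made-up toy data.

```python
from collections import Counter

def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: every utterance counts equally."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def unweighted_accuracy(y_true, y_pred):
    """Mean per-class recall: every emotion class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Toy, imbalanced example (hypothetical labels).
y_true = ["neutral"] * 4 + ["angry"] * 2
y_pred = ["neutral", "neutral", "neutral", "angry", "angry", "sad"]

wa = weighted_accuracy(y_true, y_pred)    # 4 of 6 utterances correct
ua = unweighted_accuracy(y_true, y_pred)  # mean of 3/4 and 1/2
```

The two metrics diverge when classes are imbalanced, which is why results on emotion datasets are often reported with both.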

The Multi-door Courthouse: Origin, Extension, and Case Studies (멀티도어코트하우스제도: 기원, 확장과 사례분석)

  • Chung, Yongkyun
    • Journal of Arbitration Studies / v.28 no.2 / pp.3-43 / 2018
  • The emergence of the multi-door courthouse can be attributed to several factors. First, the multi-door courthouse was originally initiated by the United States government, which had become increasingly impatient with the pace and cost of protracted litigation clogging the courts. Second, court dockets are overcrowded with lawsuits, making it difficult for judges to handle them in time and causing delays in responding to citizens' complaints. Third, litigation is not suitable for a disputant that has an ongoing relationship with the other party; in such cases, even a short-run win may not deliver all that was hoped for in the long run. Fourth, international organizations such as the World Bank, UNDP, and the Asian Development Bank urge that increased access be provided to women, residents, and the poor in local communities. The generic model of a multi-door courthouse consists of three stages. The first stage includes a center offering intake services, along with an array of dispute resolution services under one roof. At the second stage, the screening unit at the center diagnoses citizen disputes and then refers the disputants to the appropriate door for handling the case. At the third stage, the multi-door courthouse provides diverse dispute resolution programs such as mediation, arbitration, mediation-arbitration (med-arb), litigation, and early neutral evaluation. This study suggests an extended model of the multi-door courthouse comprising five layers: the intake process, the diagnosis and door-selection process, the neutral-selection process, the implementation process of dispute resolution, and the process of training and education. One of the major characteristics of the extended model is the detailed specification of the individual department corresponding to each process within a multi-door courthouse. The intake department takes care of the intake process.
The screening department screens disputes, diagnoses their nature, and determines a suitable door to handle them. The human resources department manages experts by building and maintaining a database of mediators, arbitrators, and judges. The administration bureau manages the implementation of each dispute resolution process. The education and training department draws up long-term plans to procure neutrals and experts who deal with the various kinds of disputes within a multi-door courthouse. For this purpose, it is necessary to establish networks among courts, law schools, and associations of scholars in order to facilitate the long-run supply of ADR neutrals as well as judges. This study also provides six case studies of multi-door courthouses across continents in order to grasp the worldwide picture and the widespread adoption of the multi-door courthouse. The United States, Latin American countries (including Argentina and Brazil), Middle Eastern countries, Southeast Asian countries (such as Malaysia and Myanmar), Australia, and Nigeria were chosen. Three patterns are discernible in the evolution of the multi-door courthouse model. First, the federal courts of the United States, the land and environment court in Australia, and the Lagos multi-door courthouse in Nigeria may maintain the prototype of the multi-door courthouse model. Second, the judicial systems in Latin American countries tend to show heterogeneous patterns in adapting the multi-door courthouse model to their own environments. Some court systems in Latin America, including those of Argentina and Brazil, resemble the generic model of a multi-door courthouse, while other countries show distinctive patterns in their judicial and ADR systems. Third, legal pluralism is prevalent in Middle Eastern and Southeast Asian countries.
For example, Middle Eastern countries such as Saudi Arabia have developed various dispute resolution methods, such as sulh (mediation), tahkim (arbitration), and med-arb, over many centuries, since they have long been organized as tribes or clans rather than as nations. Accordingly, they have no unified code within their territory. Southeast Asian countries such as Myanmar and Malaysia have long preserved a strong tradition of customary law, such as Dhammathat in Burma and Shariah (Islamic law) in Malaysia. On the other hand, a common law system was incorporated into the secular judicial systems of Myanmar and Malaysia during the colonial period. Finally, this article proposes several factors that strengthen or weaken a multi-door courthouse model. The first factor that strengthens a multi-door courthouse model is the maintenance of the flexibility and core values of alternative dispute resolution. We also find that fund raising is important for building and maintaining the multi-door courthouse model, reflecting the fact that there has been competition over the allocation of funds within the judicial system.

The Method for Generating Recommended Candidates through Prediction of Multi-Criteria Ratings Using CNN-BiLSTM

  • Kim, Jinah; Park, Junhee; Shin, Minchan; Lee, Jihoon; Moon, Nammee
    • Journal of Information Processing Systems / v.17 no.4 / pp.707-720 / 2021
  • To improve the accuracy of recommendation systems, multi-criteria recommendation systems have been widely researched. However, it is highly complicated to extract the preferred features of users and items from the data. To this end, subjective indicators, which indicate a user's priorities for personalized recommendations, should be derived. In this study, we propose a method for generating recommendation candidates by predicting multi-criteria ratings from reviews and using them to derive user priorities. Using a deep learning model based on a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM), multi-criteria ratings were predicted from reviews. These ratings were then aggregated in a linear regression model to predict the overall rating. This model not only predicts the overall rating but also uses the trained weights from its layers as the user's priorities. Based on this, a new score matrix for recommendation is derived by calculating the similarity between the user and the item according to the criteria, and an item suitable for the user is proposed. The experiment was conducted on a dataset collected from TripAdvisor. For performance evaluation, the proposed method was compared with a general recommendation system based on singular value decomposition. The experimental results demonstrate the high performance of the proposed method.
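
The aggregation step described above can be illustrated with ordinary least squares, assuming the per-criterion ratings are already available (in the paper they are predicted from reviews by the CNN-BiLSTM model, which is not reproduced here). The fitted coefficients are read as one user's priorities. The function name `fit_priorities`, the three criteria, and the toy ratings are hypothetical.

```python
import numpy as np

def fit_priorities(criteria_ratings, overall_ratings):
    """Fit overall ~ weights . criteria + bias by least squares.

    Returns (weights, bias); larger weights are read as higher priority.
    """
    X = np.asarray(criteria_ratings, dtype=float)
    y = np.asarray(overall_ratings, dtype=float)
    X1 = np.column_stack([X, np.ones(len(X))])  # append an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef[:-1], coef[-1]

# Toy ratings for one user over three hypothetical criteria
# (for example: cleanliness, location, service).
criteria = [[5, 3, 4], [4, 4, 2], [2, 5, 3], [3, 2, 5], [1, 1, 1]]
# Synthetic overall ratings built from known weights, so the fit can be checked.
overall = [0.6 * c[0] + 0.3 * c[1] + 0.1 * c[2] for c in criteria]

weights, bias = fit_priorities(criteria, overall)
top_priority = int(np.argmax(weights))  # index of the criterion that matters most
```

With real data the fit would not be exact, and the weights would need enough reviews per user to be stable; the sketch only shows how regression coefficients can double as a priority vector.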

Training Performance Analysis of Semantic Segmentation Deep Learning Model by Progressive Combining Multi-modal Spatial Information Datasets (다중 공간정보 데이터의 점진적 조합에 의한 의미적 분류 딥러닝 모델 학습 성능 분석)

  • Lee, Dae-Geon; Shin, Young-Ha; Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.2 / pp.91-108 / 2022
  • In most cases, optical images have been used as training data for DL (Deep Learning) models for object detection, recognition, identification, classification, semantic segmentation, and instance segmentation. However, the properties of real-world 3D objects cannot be fully explored with 2D images. One of the major sources of 3D geospatial information is the DSM (Digital Surface Model). In this regard, characteristic information derived from a DSM is effective for analyzing 3D terrain features. In particular, man-made objects such as buildings, which have geometrically unique shapes, can be described by geometric elements obtained from 3D geospatial data. The background and motivation of this paper are drawn from the concept of the intrinsic image, which is involved in high-level visual information processing. This paper aims to extract buildings after classifying terrain features by training a DL model with DSM-derived information including slope, aspect, and SRI (Shaded Relief Image). The experiments were carried out with the CNN-based SegNet model using the DSM and label dataset provided by ISPRS (International Society for Photogrammetry and Remote Sensing). In particular, the experiments focus on combining multi-source information to improve the training performance and synergistic effect of the DL model. The results demonstrate that buildings were effectively classified and extracted by the proposed approach.
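
As an illustration of the DSM-derived inputs named above, the sketch below computes slope and aspect from a small elevation grid with finite differences. It assumes a north-up raster whose row index increases southward and whose cells are square; aspect is taken as the compass bearing of the downslope direction. These conventions vary between GIS packages, the paper's exact procedure is not given in the abstract, and the function name `slope_aspect` is hypothetical.

```python
import numpy as np

def slope_aspect(dsm, cell_size=1.0):
    """Return (slope, aspect) in degrees for a 2D elevation grid.

    Assumptions: rows run north to south, columns west to east, and
    cell_size is the ground distance between neighboring cells.
    """
    dsm = np.asarray(dsm, dtype=float)
    # np.gradient returns derivatives along rows (southward) and columns (eastward).
    dz_drow, dz_dcol = np.gradient(dsm, cell_size)
    slope = np.degrees(np.arctan(np.hypot(dz_dcol, dz_drow)))
    # Downslope vector is (-dz_dcol) east and (+dz_drow) north; convert it to a
    # bearing measured clockwise from north. Flat cells have no meaningful aspect.
    aspect = np.degrees(np.arctan2(-dz_dcol, dz_drow)) % 360.0
    return slope, aspect

# Toy surface that rises one unit per cell toward the east.
dsm = [[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]]
slope, aspect = slope_aspect(dsm)
```

For this surface every cell has a 45 degree slope and faces west (bearing 270). Layers of this kind could be stacked with the DSM as extra input channels for a segmentation network.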

Speech Recognition using MSHMM based on Fuzzy Concept

  • Ann, Tae-Ock
    • The Journal of the Acoustical Society of Korea / v.16 no.2E / pp.55-61 / 1997
  • This paper proposes an MSHMM (Multi-Section Hidden Markov Model) recognition method based on the fuzzy concept for speaker-independent speech recognition. In this method, the training data are divided into several sections, and for each section multi-observation sequences are obtained by assigning probabilities with a fuzzy rule according to the order of shortest distance from the MSVQ codebook. An HMM is then generated for each section from these multi-observation sequences, and at recognition time the word with the highest probability is selected as the recognized word. For comparison, experiments with various conventional recognition methods (DP, MSVQ, DMS, and general HMM) were carried out on the same data. The overall experimental results show that the proposed MSHMM based on the fuzzy concept is superior to the DP method, the MSVQ method, the DMS model, and the general HMM model in recognition rate and computation time, and that its recognition rate of 92.91% does not decrease as the number of speakers increases.

The Use of MSVM and HMM for Sentence Alignment

  • Fattah, Mohamed Abdel
    • Journal of Information Processing Systems / v.8 no.2 / pp.301-314 / 2012
  • In this paper, two new approaches to aligning English-Arabic sentences in bilingual parallel corpora are presented, based on Multi-Class Support Vector Machine (MSVM) and Hidden Markov Model (HMM) classifiers. A feature vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was used to train the MSVM and the HMM, and another set of data was used for testing. The MSVM and HMM outperform the length-based approach. Moreover, these new approaches are valid for any language pair and are quite flexible, since the feature vector may contain fewer, more, or different features than the ones used in the current research, such as a lexical matching feature or Hanzi characters in Japanese-Chinese texts.
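
The abstract names three features but does not define them. The sketch below shows one plausible way to build such a feature vector for a candidate sentence pair. Every definition in it is an assumption made for illustration: ratios of character and punctuation counts, and the share of tokens that appear unchanged on both sides (numbers, names) as a stand-in for a cognate score. The helper names are hypothetical.

```python
import string

def _ratio(a, b):
    """Return min/max of two counts, treating 0/0 as a perfect match."""
    if a == 0 and b == 0:
        return 1.0
    return min(a, b) / max(a, b)

def alignment_features(src, tgt):
    """Return [length, punctuation, cognate] scores in [0, 1] for a sentence pair.

    Assumed definitions, not the paper's. string.punctuation covers ASCII marks
    only, so Arabic punctuation would need to be added for real English-Arabic data.
    """
    src_tokens, tgt_tokens = src.split(), tgt.split()
    length = _ratio(len(src), len(tgt))
    punct = _ratio(sum(ch in string.punctuation for ch in src),
                   sum(ch in string.punctuation for ch in tgt))
    # Tokens shared verbatim after stripping surrounding punctuation.
    shared = ({t.strip(string.punctuation) for t in src_tokens}
              & {t.strip(string.punctuation) for t in tgt_tokens})
    shared.discard("")
    cognate = len(shared) / max(len(src_tokens), len(tgt_tokens), 1)
    return [length, punct, cognate]

# Toy pair sharing a flight number and a time (hypothetical sentences).
features = alignment_features("Flight 714 departs at 10:30.",
                              "Le vol 714 part à 10:30.")
```

A vector like this would be computed for each candidate pair and passed to the classifier, which decides whether the two sentences are aligned.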

ANN-based Evaluation Model of Combat Situation to predict the Progress of Simulated Combat Training

  • Yoon, Soungwoong; Lee, Sang-Hoon
    • Journal of the Korea Society of Computer and Information / v.22 no.7 / pp.31-37 / 2017
  • Many interacting battlefield elements together determine the course of a war. Collecting and analyzing these elements and then predicting the war situation is difficult. Commanders' experience and military power assessments have been widely used to address these problems, and simulated combat training programs have recently supplemented war-game models by recording real-time simulated combat data. Nevertheless, assessing the winning factors of combat remains a challenge. In this paper, we characterize combat elements (ces) by clustering simulated combat data, and then suggest a multi-layered artificial neural network (ANN) model that can capture non-linear, cross-connected effects among ces to assess the mission completion degree (MCD). Our ANN model makes it possible to analyze and predict winning factors. Experimental results show that our ANN model, by networking ces, explains MCDs better than a multiple linear regression model. Moreover, sensitivity analysis of ces can serve as the basis for predicting the combat situation.