• Title/Abstract/Keywords: optimal learning

1,186 search results (processing time: 0.024 s)

A Federated Multi-Task Learning Model Based on Adaptive Distributed Data Latent Correlation Analysis

  • Wu, Shengbin;Wang, Yibai
    • Journal of Information Processing Systems
    • Vol. 17, No. 3
    • pp. 441-452
    • 2021
  • Federated learning provides an efficient integrated model for distributed data, allowing different data to be trained locally. Meanwhile, the goal of multi-task learning is to build models for multiple related tasks simultaneously and to uncover their shared underlying structure. However, traditional federated multi-task learning models not only place strict requirements on the data distribution, but also demand large amounts of computation and converge slowly, which has hindered their adoption in many fields. In our work, we apply a rank constraint to the weight vectors of the multi-task learning model to adaptively adjust the learning of task similarity according to the distribution of the federated nodes' data. The proposed model has a general framework for solving for optimal solutions and can handle various data types. Experiments show that our model achieves the best results on different datasets. Notably, it still obtains stable results on datasets with large distribution differences. In addition, compared with traditional federated multi-task learning models, our algorithm converges to a locally optimal solution within a limited number of training iterations.
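A rank constraint on the stacked task-weight matrix is commonly enforced through a trace-norm (nuclear-norm) penalty, whose proximal step soft-thresholds singular values. The sketch below illustrates that generic step under the assumption that the constraint is realized this way; the paper's exact formulation may differ.

```python
import numpy as np

def trace_norm_prox(W, tau):
    """Proximal operator of tau * ||W||_*: soft-threshold singular values.
    Shrinking small singular values to zero drives the task-weight matrix W
    (one row/column per task) toward low rank, coupling related tasks."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt  # equivalent to U @ diag(s_shrunk) @ Vt
```

A larger `tau` yields a lower-rank coupling; `tau = 0` leaves `W` unchanged.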

Optimal Parameter Extraction Based on Deep Learning for Premature Ventricular Contraction Detection

  • 조익성;권혁숭
    • Journal of the Korea Institute of Information and Communication Engineering
    • Vol. 23, No. 12
    • pp. 1542-1550
    • 2019
  • Previous studies on arrhythmia classification have employed artificial neural networks, fuzzy logic, and machine learning to improve classification accuracy. In particular, deep learning, which overcomes the limit on the number of hidden layers in conventional neural networks, is the most widely used approach to arrhythmia classification with the error back-propagation algorithm. To apply a deep learning model to ECG signals, an appropriate model must be selected and its parameters chosen close to the optimum. This study proposes a deep-learning-based method for extracting optimal parameters for premature ventricular contraction (PVC) beat detection. First, R waves are detected in the noise-filtered ECG signal, and QRS and RR-interval segments are extracted. The weights are then trained by supervised deep learning, and the model is evaluated on validation data. To assess the validity of the proposed method, training and validation accuracy were examined for deep learning models under each parameter setting on the MIT-BIH arrhythmia database. The performance evaluation showed an average R-wave detection rate of 99.77% and an average PVC classification rate of 97.84%.
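The R-wave detection and RR-interval extraction steps described above can be sketched as follows. This is a deliberately simplified detector; the threshold ratio and refractory period are illustrative assumptions, not the paper's tuned preprocessing.

```python
import numpy as np

def detect_r_peaks(ecg, fs, threshold_ratio=0.6, refractory_s=0.2):
    """Toy R-peak detector: local maxima above a fraction of the global
    maximum, separated by a refractory period (no two beats closer than
    refractory_s seconds)."""
    thresh = threshold_ratio * np.max(ecg)
    min_gap = int(refractory_s * fs)  # minimum samples between beats
    peaks = []
    for i in range(1, len(ecg) - 1):
        if ecg[i] >= thresh and ecg[i] > ecg[i - 1] and ecg[i] >= ecg[i + 1]:
            if not peaks or i - peaks[-1] > min_gap:
                peaks.append(i)
    return np.asarray(peaks)

def rr_intervals(peaks, fs):
    """RR intervals in seconds from successive R-peak sample indices."""
    return np.diff(peaks) / fs
```

The QRS and RR-interval segments built from these indices would then feed the supervised network.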

Improved Deep Q-Network Algorithm Using Self-Imitation Learning

  • 선우영민;이원창
    • Journal of the Institute of Korean Electrical and Electronics Engineers
    • Vol. 25, No. 4
    • pp. 644-649
    • 2021
  • Self-Imitation Learning is a simple off-policy actor-critic algorithm that helps an agent find an optimal policy by exploiting its good past experiences. Combined with reinforcement learning algorithms that have an actor-critic structure, it has shown substantial improvements across a variety of environments. However, although Self-Imitation Learning greatly benefits reinforcement learning, its application has been limited to algorithms with an actor-critic architecture. This paper proposes a method for applying Self-Imitation Learning to DQN, a value-based reinforcement learning algorithm, and trains the resulting algorithm in various environments. By comparing the results with those of the original DQN, we show that Self-Imitation Learning can also be applied to DQN and can improve its performance.
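A common way to formulate a self-imitation term for a value-based learner such as DQN is a clipped loss that only pushes Q toward past returns that exceeded the current estimate. The numpy sketch below shows that idea; it is not necessarily the paper's exact loss.

```python
import numpy as np

def sil_value_loss(q_values, actions, returns):
    """Self-imitation term for a value-based learner: each stored transition
    carries a Monte Carlo return R; only returns exceeding the current
    estimate Q(s, a) contribute, via the clipped gap max(R - Q, 0)."""
    q_sa = q_values[np.arange(len(actions)), actions]  # Q(s, a) per sample
    gap = np.maximum(returns - q_sa, 0.0)
    return float(np.mean(gap ** 2))
```

In training, this term would be added to the ordinary TD loss on minibatches drawn from a buffer of good past episodes.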

Actor-Critic Algorithm with Transition Cost Estimation

  • Sergey, Denisov;Lee, Jee-Hyong
    • International Journal of Fuzzy Logic and Intelligent Systems
    • Vol. 16, No. 4
    • pp. 270-275
    • 2016
  • We present an approach for accelerating the actor-critic algorithm for reinforcement learning with a continuous action space. The actor-critic algorithm has already proved robust to infinitely large action spaces in various high-dimensional environments. Despite that success, its main problem remains the speed of convergence to the optimal policy: in a high-dimensional state and action space, searching for the correct action in each state takes an enormously long time. We therefore suggest a search-accelerating function that increases the convergence speed of the algorithm and reaches the optimal policy faster. In our method, we assume that actions may have their own preference distribution that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it is more efficient if actions are taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using an Educational Process Mining dataset with records of students' course learning processes and their grades.
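A state-independent action-preference distribution of the kind the authors assume can be realized as a softmax over preference scores; a minimal sketch (the function name and temperature parameter are illustrative, not from the paper):

```python
import numpy as np

def heuristic_action_probs(preference_scores, temperature=1.0):
    """Softmax over state-independent action preference scores (e.g. counts
    of how often each action looked promising so far), used to bias the
    otherwise-random exploration of early training."""
    logits = np.asarray(preference_scores, dtype=float) / temperature
    logits -= logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```

Sampling early actions from these probabilities, rather than uniformly, is what gives the heuristic acceleration.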

A Learning Algorithm for Optimal Fuzzy Control Rules

  • 정병묵
    • Transactions of the Korean Society of Mechanical Engineers A
    • Vol. 20, No. 2
    • pp. 399-407
    • 1996
  • A fuzzy learning algorithm for obtaining optimal fuzzy rules is presented in this paper. The algorithm introduces a reference model to generate a desired output, and a performance-index function instead of the usual performance-index table. The performance-index function is a cost function based on the error and error-rate between the reference and plant outputs. The cost function is minimized by a gradient method, and the control input is updated accordingly. The control rules that generate the desired response can then be obtained by changing the weight of the error-rate term in the cost function. In a SISO (single-input single-output) plant, the plant model can be expressed and the desired control rules obtained using only the learning delay. Overall, this algorithm yields good control rules with a minimal amount of prior information about the environment.
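The gradient-based minimization of a cost on error and error-rate can be illustrated on a toy plant with a finite-difference gradient. This is a stand-in for the paper's fuzzy-rule update, not its actual algorithm; the plant interface and step sizes are assumptions.

```python
def minimize_cost(plant, ref, u0, w=0.3, eta=0.1, steps=50, h=1e-5):
    """Gradient descent on J(u) = e^2 + w * e_rate^2, where e and e_rate are
    the error and error-rate between the reference-model output `ref` and the
    plant output. `plant` maps input u to (output, output_rate); the gradient
    is approximated by central finite differences."""
    ref_y, ref_rate = ref

    def J(u):
        y, y_rate = plant(u)
        return (ref_y - y) ** 2 + w * (ref_rate - y_rate) ** 2

    u = u0
    for _ in range(steps):
        grad = (J(u + h) - J(u - h)) / (2 * h)
        u -= eta * grad
    return u
```

Raising `w` weights the error-rate portion more heavily, which is how the abstract's rule-shaping knob works.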

A Study on the Application of S-Model Automata for Multiple-Objective Optimal Operation of Power Systems

  • 이용선;이병하
    • The Korean Institute of Electrical Engineers: Conference Proceedings
    • Proceedings of the 1999 KIEE Summer Conference, Part C
    • pp. 1279-1281
    • 1999
  • A learning automaton is an automaton that systematically updates its strategy in response to output results so as to improve performance, and several learning-automata schemes have been proposed. In this paper, S-model learning automata are applied to finding the best compromise between an optimal solution for economic operation and an optimal solution for stable operation of a power system whose loads vary randomly. It is shown that learning automata can be applied satisfactorily to this multi-objective optimization problem, obtaining the best trade-off between the conflicting economy and stability objectives of power systems.

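An S-model learning automaton updates its action probabilities in proportion to a continuous environment response β ∈ [0, 1]. The sketch below shows the linear reward-inaction (L_R-I) scheme, one standard S-model update; the paper's exact scheme may differ.

```python
import numpy as np

def s_model_lri_update(p, chosen, beta, a=0.1):
    """S-model linear reward-inaction step: the environment returns a
    continuous response beta in [0, 1] (0 = best), and the probability of the
    chosen action grows in proportion to (1 - beta), with the others shrinking
    so the distribution stays normalized."""
    p = np.asarray(p, dtype=float).copy()
    gain = a * (1.0 - beta)
    for j in range(len(p)):
        if j == chosen:
            p[j] += gain * (1.0 - p[j])
        else:
            p[j] -= gain * p[j]
    return p
```

With β = 1 (worst response) the probabilities are left unchanged, which is the "inaction" half of the scheme.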

A Branch-and-Bound Algorithm for Finding an Optimal Solution of Transductive Support Vector Machines

  • 박찬규
    • Journal of the Korean Operations Research and Management Science Society
    • Vol. 31, No. 2
    • pp. 69-85
    • 2006
  • The Transductive Support Vector Machine (TSVM) is a semi-supervised learning algorithm that exploits the domain structure of the whole data by considering labeled and unlabeled data together. Although it was proposed several years ago, there has been no efficient algorithm that can handle problems with more than a few hundred training examples. In this paper, we propose an efficient branch-and-bound algorithm that can solve large-scale TSVM problems with thousands of training examples. The proposed algorithm uses two bounding techniques: a min-cut bound and a reduced-SVM bound. The min-cut bound is derived from a capacitated graph whose cuts give a lower bound on the optimal objective value of the dual problem. The reduced-SVM bound is obtained by constructing the SVM problem with only the labeled data. Experimental results show that the accuracy of TSVM can be significantly improved by learning from its optimal solution rather than an approximate one.
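The branch-and-bound search itself can be sketched generically: a best-first search over ±1 labels for the unlabeled points, pruning any partial labeling whose lower bound already exceeds the incumbent. The `lower_bound` and `full_cost` callables below are hypothetical placeholders for the paper's min-cut and reduced-SVM bounds and the TSVM objective.

```python
import heapq

def branch_and_bound(n_unlabeled, lower_bound, full_cost):
    """Best-first branch and bound over +/-1 labels for unlabeled points.
    lower_bound(partial) must never exceed the cost of any completion of
    `partial`; full_cost evaluates a complete labeling. Generic skeleton."""
    best_cost, best_labels = float("inf"), None
    heap = [(lower_bound(()), ())]
    while heap:
        bound, partial = heapq.heappop(heap)
        if bound >= best_cost:
            continue  # prune: no completion can beat the incumbent
        if len(partial) == n_unlabeled:
            cost = full_cost(partial)
            if cost < best_cost:
                best_cost, best_labels = cost, partial
            continue
        for lab in (-1, 1):
            child = partial + (lab,)
            b = lower_bound(child)
            if b < best_cost:
                heapq.heappush(heap, (b, child))
    return best_labels, best_cost
```

The tightness of the bounding functions is what determines how much of the 2^n labeling tree gets pruned.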

Dynamic Action Space Handling Method for Reinforcement Learning Models

  • Woo, Sangchul;Sung, Yunsick
    • Journal of Information Processing Systems
    • Vol. 16, No. 5
    • pp. 1223-1230
    • 2020
  • Recently, extensive studies have applied deep learning to reinforcement learning to solve the state-space problem. If the state-space problem were solved, reinforcement learning would become applicable in many fields; for example, users could learn to dance from a dance-tutorial system by watching and imitating a virtual instructor, who performs the optimal dance to the music through reinforcement learning. In this study, we propose a reinforcement learning method in which the action space is adjusted dynamically. Because actions that are never performed, or are unlikely to be optimal, are not learned and no state space is allocated for them, learning time is shortened and the state space is reduced. In an experiment, the proposed method produces results similar to those of traditional Q-learning even when its state space is reduced to approximately 0.33% of that of Q-learning, thereby reducing the cost and time required for learning. Traditional Q-learning requires 6 million states for 100,000 training iterations; the proposed method requires only 20,000. A higher winning rate can thus be achieved in a shorter time by searching 20,000 states instead of 6 million.
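The dynamic-allocation idea, creating entries only for state-action pairs that are actually visited, can be sketched with a dictionary-backed Q-table. This is a generic illustration, not the paper's implementation.

```python
class LazyQTable:
    """Q-table that stores entries only for (state, action) pairs actually
    updated, so actions that are never performed consume no state space."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.q = {}  # (state, action) -> value, grown lazily
        self.alpha, self.gamma = alpha, gamma

    def value(self, s, a):
        return self.q.get((s, a), 0.0)  # reads allocate nothing

    def best_value(self, s, actions):
        return max((self.value(s, a) for a in actions), default=0.0)

    def update(self, s, a, r, s_next, next_actions):
        """Standard Q-learning update; only the visited pair gets storage."""
        target = r + self.gamma * self.best_value(s_next, next_actions)
        self.q[(s, a)] = self.value(s, a) + self.alpha * (target - self.value(s, a))
```

Memory then scales with the number of visited pairs rather than with |S| × |A|.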

Development of Problem-Based Learning in an English-Mediated College Science Course: Design-Based Research on Four Semesters Instruction

  • LAHAYE, Rob;LEE, Sang-eun
    • Educational Technology International
    • Vol. 19, No. 2
    • pp. 229-254
    • 2018
  • Universities in Korea have been making new attempts to adopt more learner-centered, active learning in English. Problem-based learning (PBL) is one of the best-known constructivist teaching and learning methodologies in higher education. Our research goal was to design and develop optimal PBL practices for a college physics course taught in English, in order to promote learning and course satisfaction. Over four semesters, we tried and adjusted PBL components and tracked the trend of exam scores and group-work achievement each semester. Through this iterative implementation we found that the number of problems and the duration of problem solving are the critical factors influencing the effect of PBL in an English-medium college physics course. The iterative process of designing, applying, and reconstructing PBL for physics classes was meaningful not only because we found an optimal PBL model for a college physics course, but also because it kept us reflecting on the continuous interaction with learners during the course.

Adaptive Fuzzy Neural Control of Unknown Nonlinear Systems Based on Rapid Learning Algorithm

  • Kim, Hye-Ryeong;Kim, Jae-Hun;Kim, Euntai;Park, Mignon
    • Korean Institute of Intelligent Systems: Conference Proceedings
    • Proceedings of the 2003 Korea Fuzzy Logic and Intelligent Systems Society Fall Conference
    • pp. 95-98
    • 2003
  • In this paper, an adaptive fuzzy neural control of unknown nonlinear systems based on a rapid learning algorithm is proposed for optimal parameterization. We combine the advantages of fuzzy control and neural-network techniques to develop an adaptive fuzzy control system that updates the nonlinear parameters of the controller. The Fuzzy Neural Network (FNN), constructed as an equivalent four-layer connectionist network, learns to control a process by updating its membership functions. The free parameters of the adaptive fuzzy neural controller are adjusted on-line according to the control law and the adaptive law so that the plant tracks a given trajectory, with initial values obtained by off-line preprocessing. To improve the convergence of the learning process, we propose a rapid learning algorithm that combines the error back-propagation algorithm with Aitken's Δ² (delta-squared) algorithm. The heart of this approach is to reduce the computational burden of the FNN learning process and to improve convergence speed. Simulation results for a nonlinear plant demonstrate the control effectiveness of the proposed system for optimal parameterization.

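Aitken's Δ² extrapolation, which the authors combine with back-propagation, accelerates a linearly converging sequence by estimating its limit from three successive iterates; a minimal sketch:

```python
def aitken_accelerate(x0, x1, x2):
    """Aitken's delta-squared extrapolation: from three successive iterates
    of a linearly converging sequence, estimate its limit via
    x* ~ x0 - (x1 - x0)^2 / (x2 - 2*x1 + x0)."""
    d1, d2 = x1 - x0, x2 - x1
    denom = d2 - d1  # equals x2 - 2*x1 + x0
    if abs(denom) < 1e-12:
        return x2  # sequence already (near) converged; avoid division by ~0
    return x0 - d1 * d1 / denom
```

For the geometric sequence x_n = 2 + 0.5**n, the iterates 3, 2.5, 2.25 extrapolate to the limit 2 exactly, which is why the technique pays off on the slowly, linearly converging weight sequences back-propagation produces.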