Search | Korea Science

Design of Reinforcement Learning Controller with Self-Organizing Map (자기 조직화 맵을 이용한 강화학습 제어기 설계)

이재강;김일환
- The Transactions of the Korean Institute of Electrical Engineers D / v.53 no.5 / pp.353-360 / 2004
This paper considers reinforcement learning control with a self-organizing map. Reinforcement learning uses the observable states of the target system, together with signals from the interaction between the system and its environment, as input data. Fast neural network training requires reducing the amount of training data. In this paper, we use the self-organizing map to partition the observable states, which reduces the amount of data used to train the neural networks. The controller is designed with the neural dynamic programming method. To evaluate the designed reinforcement learning controller, an inverted pendulum on a cart is simulated. The controller consists of a self-organizing map connected in series with two multi-layer feed-forward neural networks.
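The state-partitioning step described in this abstract can be illustrated with a minimal sketch. It assumes a 1-D self-organizing map with a Gaussian neighbourhood and linearly decaying learning rate; the function names `train_som` and `best_matching_unit`, the map size, and all hyperparameters are hypothetical choices for illustration, not the authors' design.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wj - xj) ** 2 for wj, xj in zip(weights[j], x)),
    )


def train_som(samples, n_nodes, epochs=20, lr0=0.5, seed=0):
    """Train a 1-D SOM on the samples and return its list of weight vectors.

    Assumption: the learning rate and neighbourhood radius decay linearly.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    radius0 = n_nodes / 2.0
    # Initialise every node at a randomly chosen sample.
    weights = [list(rng.choice(samples)) for _ in range(n_nodes)]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            radius = max(radius0 * (1.0 - frac), 0.5)
            win = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured on the 1-D node index.
                h = math.exp(-((j - win) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
            step += 1
    return weights
```

After training, `best_matching_unit(weights, state)` maps a continuous state (for example, the four cart-pendulum variables) to one of `n_nodes` partitions, so downstream networks see a small discrete set of inputs instead of every raw state.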

A Dynamic Programming Neural Network to find the Safety Distance of Industrial Field (산업 현장의 안전거리 계측을 위한 동적 계획 신경회로망)

Kim, Jong-Man;Kim, Won-Sub;Kim, Yeong-Min;Hwang, Jong-Sun;Park, Hyun-Chul
- Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2001.09a / pp.23-27 / 2001
Ensuring safety across various work systems is very important in industrial fields. The proposed neural network technique is a real-time computation method, based on the theory of inter-node diffusion, for finding the safe distance to objects that appear suddenly during work operations. Distance computation based on stereo vision, which works like human eyes, has two main steps. One is finding the corresponding points of the stereo images, and the other is interpolating full image data from the nonlinear image data of the objects. Both require much memory and time. Therefore, the most reliable neural network algorithm is derived for real-time recognition of objects; it is composed of a dynamic programming algorithm based on sequence matching techniques. The real-time reconstruction of nonlinear image information is processed through several simulations. I-D LIPN hardware has been built, and the real-time reconstruction is verified through various experiments.

Propagation Neural Networks for Real-time Recognition of Error Data (에라 정보의 실시간 인식을 위한 전파신경망)

Kim, Jong-Man;Hwang, Jong-Sun;Kim, Young-Min
- Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2001.11b / pp.46-51 / 2001
For fast real-time recognition of nonlinear error data, a new neural network algorithm that recognizes the map in real time is proposed. The proposed neural network technique is a real-time computation method based on inter-node diffusion. In the network, a node corresponds to a state in the quantized input space. Each node is composed of a processing unit and fixed weights from its neighbor nodes as well as its input terminal. The most reliable algorithm derived for real-time recognition of the map is a dynamic programming algorithm based on sequence matching techniques, which processes the data as it arrives and can therefore provide continuously updated estimates of neighbor information. Real-time reconstruction of the nonlinear map information is carried out through several simulation experiments.

Visual Servoing of Robot Manipulators using Pruned Recurrent Neural Networks (저차원화된 리커런트 뉴럴 네트워크를 이용한 비주얼 서보잉)

김대준;이동욱;심귀보
- Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.11a / pp.259-262 / 1997
This paper presents visual servoing of an RV-M2 robot manipulator to track and grasp a moving object, using pruned dynamic recurrent neural networks (DRNN). The object is stationary in the robot work space, and the robot tracks and grasps the object using a CCD camera mounted on the end-effector. To optimize the structure of the DRNN, we decide whether to delete or add a node according to a mutation probability. When a node is to be deleted, the node with the minimum sum of input weights is deleted; when a node is to be added, its weights are connected according to the number of ways in which the added node can reach the other nodes. We pruned the network structure of the DRNN using evolutionary programming (EP), which searches the structure and weights of the DRNN, and evolution strategies (ES), which train the neuron weights. We applied the DRNN to the visual servoing of a robot manipulator to control the position and orientation of the end-effector, and the validity and effectiveness of the proposed control scheme will be verified by computer simulations.

Propagation Neural Networks for Real-time Recognition of Error Data (에라 정보의 실시간 인식을 위한 전파신경망)

김종만;황종선;김영민
- Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2001.11a / pp.46-51 / 2001
For fast real-time recognition of nonlinear error data, a new neural network algorithm that recognizes the map in real time is proposed. The proposed neural network technique is a real-time computation method based on inter-node diffusion. In the network, a node corresponds to a state in the quantized input space. Each node is composed of a processing unit and fixed weights from its neighbor nodes as well as its input terminal. The most reliable algorithm derived for real-time recognition of the map is a dynamic programming algorithm based on sequence matching techniques, which processes the data as it arrives and can therefore provide continuously updated estimates of neighbor information. Real-time reconstruction of the nonlinear map information is carried out through several simulation experiments.

Control of pH Neutralization Process using Simulation Based Dynamic Programming in Simulation and Experiment (ICCAS 2004)

Kim, Dong-Kyu;Lee, Kwang-Soon;Yang, Dae-Ryook
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.620-626 / 2004
For general nonlinear processes, control with a linear model-based method is difficult, so nonlinear control is considered. Among the numerous approaches suggested, the most rigorous is dynamic optimization. Many general engineering problems, such as control, scheduling, and planning, are expressed as functional optimization problems, and most of them can be recast as dynamic programming (DP) problems. However, DP is used in only a few cases because, as the size of the problem grows, it suffers from a computational burden known as the 'curse of dimensionality'. To avoid this problem, the Neuro-Dynamic Programming (NDP) approach was proposed by Bertsekas and Tsitsiklis (1996). Interest in the NDP approach has grown as a way to solve highly nonlinear process control problems; the NDP algorithm has been applied to diverse areas such as retailing, finance, inventory management, and communication networks, and it has been extended to chemical engineering. In the NDP approach, we select the optimal control input policy to minimize the cost, which is calculated as the sum of the current stage cost and the cost of future stages starting from the next state. The cost is a weighted sum of squares of the error and the input movement. When calculating the optimal input policy, using a cost function approximated from simulation data together with Bellman iteration relieves the computational burden and overcomes the curse of dimensionality of DP. How to construct a cost-to-go function with good approximation performance is a very important issue. The neural network is an eager learning method and works as a global approximator of the cost-to-go function. In this algorithm, training the neural network is an important and difficult part, and it has a significant effect on control performance.
To avoid the difficulty of neural network training, a lazy learning method such as the k-nearest neighbor method can be used. This method requires no training but needs more computation time and more data storage. The pH neutralization process has long been taken as a representative benchmark problem of nonlinear chemical process control due to its nonlinearity and time-varying nature. In this study, the NDP algorithm was applied to a pH neutralization process. First, control of the pH neutralization process with the NDP algorithm was performed through simulations with various approximators; both global and local approximators were used for the NDP calculation. Next, NDP was verified on a real system through a pH neutralization experiment. The control results of the NDP algorithm were compared with those of the traditionally used PI controller in both simulations and experiments. The comparison showed that control by the NDP algorithm was faster and better than that of the PI controller. In addition, the NDP algorithm showed good results when applied to cases with disturbances and multiple set point changes.
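The cost structure described in this abstract (current stage cost plus future cost from the next state, with a weighted squared error and input movement) corresponds to the standard Bellman recursion. The notation below is an assumption for illustration: the abstract gives no symbols, so $f$, $\phi$, $Q$, $R$, $e_k$, and $\tilde{J}$ are hypothetical names, and no discount factor is shown because none is mentioned.

```latex
\begin{align}
  J^{*}(x_k) &= \min_{u_k} \Big[ \phi(x_k, u_k) + J^{*}(x_{k+1}) \Big],
  \qquad x_{k+1} = f(x_k, u_k), \\
  \phi(x_k, u_k) &= Q\, e_k^{2} + R\, (\Delta u_k)^{2},
  \qquad \Delta u_k = u_k - u_{k-1}, \\
  u_k^{\mathrm{NDP}} &= \arg\min_{u_k} \Big[ \phi(x_k, u_k) + \tilde{J}\big(f(x_k, u_k)\big) \Big].
\end{align}
```

Here $\tilde{J}$ stands for the approximate cost-to-go fitted to simulation data (a neural network or a k-nearest neighbor query), which replaces the exact $J^{*}$ that DP would have to tabulate over the whole state space.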

Neural Networks for Solving Linear Programming Problems and Linear Systems (선형계획 문제의 해를 구하는 신경회로)

Chang, S.H.;Kang, S.G.;Nam, B.H.;Lee, J.M.
- Proceedings of the KIEE Conference / 1993.07a / pp.221-223 / 1993
The Hopfield model is defined as an adaptive dynamic system. In this paper, we propose a modified neural network capable of solving linear programming problems and sets of linear equations. The model is implemented directly from the given system and solves the problem without calculating matrix inverses. We obtain better stability results by adding a scaling property and by using nonlinearities in the linear programming neural networks.
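The idea of solving a linear system through network dynamics rather than matrix inversion can be sketched with a generic gradient flow. This is not the authors' modified Hopfield network: it is a plain forward-Euler integration of dx/dt = -Aᵀ(Ax - b), and the function name, step size, and iteration count are assumptions chosen for the small example.

```python
def solve_by_gradient_flow(A, b, step=0.05, iterations=2000):
    """Approximate the solution of A x = b without inverting A.

    Integrates dx/dt = -A^T (A x - b) with forward Euler from x = 0.
    Assumption: `step` is small enough for the iteration to be stable
    (below 2 divided by the largest eigenvalue of A^T A).
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(iterations):
        # Residual r = A x - b.
        residual = [
            sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
            for i in range(n_rows)
        ]
        # Gradient of 0.5 * ||A x - b||^2 is A^T r.
        gradient = [
            sum(A[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        x = [x[j] - step * gradient[j] for j in range(n_cols)]
    return x
```

For instance, `solve_by_gradient_flow([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])` settles near the exact solution (0.8, 1.4). Each state variable is updated only from weighted sums of the others, which is what makes this kind of dynamics implementable as a network of simple units.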

Fast Pattern Classification with the Multi-layer Cellular Nonlinear Networks (CNN) (다층 셀룰라 비선형 회로망(CNN)을 이용한 고속 패턴 분류)

오태완;이혜정;손홍락;김형석
- The Transactions of the Korean Institute of Electrical Engineers D / v.52 no.9 / pp.540-546 / 2003
A fast pattern classification algorithm with Cellular Nonlinear Network (CNN)-based dynamic programming is proposed. The Cellular Nonlinear Network is an analog parallel processing architecture, and dynamic programming is an efficient computation algorithm for optimization problems. Combining the merits of these two technologies yields fast pattern classification with optimization. In such CNN-based dynamic programming, if exemplars and test patterns are presented as the goals and the start positions, respectively, the optimal paths from the test patterns to their closest exemplars are found. These paths are used as aggregating keys for the classification. The algorithm is similar to conventional neural network-based methods in its use of exemplar patterns but quite different in its use of the most-likely-path finding of dynamic programming. Pattern classification performs well regardless of the degree of nonlinearity in the class borders.
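The goal-and-start formulation in this abstract can be illustrated in software with a synchronous dynamic programming update on a grid, where every cell repeatedly takes the minimum over its neighbours. This is a sketch under strong assumptions: patterns are taken to be already embedded as grid cells, each move costs 1, and the function name `propagate_labels` is hypothetical. The analog CNN hardware itself is not modelled.

```python
def propagate_labels(height, width, exemplars):
    """Compute cost-to-go and nearest-exemplar label for every grid cell.

    exemplars: dict mapping (row, col) -> class label; these cells are the goals.
    All cells are updated from the previous sweep's values, imitating the
    parallel, neighbour-only updates of a cellular array.
    """
    inf = float("inf")
    cost = [[inf] * width for _ in range(height)]
    label = [[None] * width for _ in range(height)]
    for (r, c), name in exemplars.items():
        cost[r][c] = 0
        label[r][c] = name

    changed = True
    while changed:
        changed = False
        new_cost = [row[:] for row in cost]
        new_label = [row[:] for row in label]
        for r in range(height):
            for c in range(width):
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < height and 0 <= nc < width:
                        candidate = cost[nr][nc] + 1  # unit step cost (assumed)
                        if candidate < new_cost[r][c]:
                            new_cost[r][c] = candidate
                            new_label[r][c] = label[nr][nc]
                            changed = True
        cost, label = new_cost, new_label
    return cost, label
```

A test pattern placed at cell `(r, c)` is then assigned `label[r][c]`, the class of the exemplar reached by the cheapest path, and `cost[r][c]` gives the length of that path.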

Control of pH Neutralization Process using Simulation Based Dynamic Programming (ICCAS 2003)

Kim, Dong-Kyu;Yang, Dae-Ryook
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2003.10a / pp.2617-2622 / 2003
The pH neutralization process has long been taken as a representative benchmark problem of nonlinear chemical process control due to its nonlinearity and time-varying nature. For general nonlinear processes, control with a linear model-based method is difficult, so nonlinear control must be considered. Among the numerous approaches suggested, the most rigorous is dynamic optimization. However, as the size of the problem grows, the dynamic programming approach suffers from the curse of dimensionality. To avoid this problem, the Neuro-Dynamic Programming (NDP) approach was proposed by Bertsekas and Tsitsiklis (1996). The NDP approach uses all the collected data to generate an approximation of the optimal cost-to-go function, which is then used to find the optimal input movement in real-time control. The approximation can be any type of function, such as polynomials or neural networks. In this study, an algorithm using the NDP approach was applied to a pH neutralization process to investigate the feasibility of the NDP algorithm and to deepen the understanding of its basic characteristics. As global approximators, the neural network, which requires training, and the k-nearest neighbor method, which requires querying instead of training, are investigated. The global approximator requires an optimal control strategy. If an optimal control strategy is not available, a suboptimal control strategy can be used, although laborious Bellman iterations are then necessary. For the pH neutralization process, it is rather easy to devise an optimal control strategy. Thus, we used an optimal control strategy and did not perform the Bellman iteration. The effects of constraints on control moves are also studied. In the simulations, the NDP method outperforms conventional PID control.
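How a stored-data approximation of the cost-to-go can be queried to pick an input in real time is sketched below. Everything concrete here is an assumption for illustration: the scalar state, the toy linear model, the quadratic stage cost, the stored data, and the names `knn_cost_to_go` and `greedy_input` are hypothetical and do not come from the paper.

```python
def knn_cost_to_go(data, x, k=3):
    """Average the stored cost-to-go of the k states nearest to x.

    data: list of (state, cost_to_go) pairs; no training is needed,
    only a query over the stored points.
    """
    nearest = sorted(data, key=lambda item: abs(item[0] - x))[:k]
    return sum(j for _, j in nearest) / len(nearest)


def greedy_input(x, candidates, model, stage_cost, data, k=3):
    """One-step lookahead: minimise stage cost plus approximate cost-to-go."""
    return min(
        candidates,
        key=lambda u: stage_cost(x, u) + knn_cost_to_go(data, model(x, u), k),
    )


# Toy setup (all hypothetical): x is the deviation from the set point.
def toy_model(x, u):
    return 0.8 * x + u


def toy_stage_cost(x, u):
    return x ** 2 + 0.1 * u ** 2


# Stored (state, cost-to-go) pairs, here generated from J(x) = 2 x^2.
toy_data = [(s / 10.0, 2.0 * (s / 10.0) ** 2) for s in range(-20, 21)]
```

With this toy data, `greedy_input(1.0, [-1.0, -0.5, 0.0, 0.5, 1.0], toy_model, toy_stage_cost, toy_data)` selects the input that drives the state closest to the set point. Restricting `candidates` to a bounded set is one simple way to express constraints on control moves.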

A Propagation Programming Neural Network for Real-time matching of Stereo Images (스테레오 영상의 실시간 정합을 위한 보간 신경망 설계)

Kim, Jong-Man
- Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2003.05c / pp.194-199 / 2003
The effect of depth error correction for maladjusted stereo cameras with a calibrated pixel distance parameter is presented. The proposed neural network technique is a real-time computation method, based on the theory of inter-node diffusion, for finding the safe distance to objects that appear suddenly during work operations. Distance computation based on stereo vision, which works like human eyes, has two main steps. One is finding the corresponding points of the stereo images, and the other is interpolating full image data from the nonlinear image data of the objects. Both require much memory and time. Therefore, the most reliable neural network algorithm is derived for real-time matching of objects; it is composed of a dynamic programming algorithm based on sequence matching techniques.
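The correspondence step, finding matching points between stereo images with dynamic programming based on sequence matching, can be sketched for a single pair of scanlines. This is the textbook alignment formulation, not the neural network of the paper; the function name, the absolute-difference matching cost, and the fixed occlusion penalty are assumptions.

```python
def match_scanlines(left, right, occlusion=1.0):
    """Align two intensity scanlines by dynamic programming.

    Returns (total_cost, pairs), where pairs lists matched (left_index,
    right_index) positions; unmatched pixels are treated as occluded.
    """
    n, m = len(left), len(right)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]),  # match
                cost[i - 1][j] + occlusion,  # left pixel has no partner
                cost[i][j - 1] + occlusion,  # right pixel has no partner
            )

    # Backtrack from the end to recover which pixels were matched.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occlusion:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs
```

The disparity of a matched pixel is the index difference `left_index - right_index`; distance then follows from the camera geometry, which this sketch does not model.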

Search Result 36, Processing Time 0.031 seconds
