• Title/Summary/Keyword: smoothing trust region method

A HYBRID METHOD FOR NCP WITH $P_0$ FUNCTIONS

  • Zhou, Qian;Ou, Yi-Gui
    • Journal of applied mathematics & informatics, v.29 no.3_4, pp.653-668, 2011
  • This paper presents a new hybrid method for solving nonlinear complementarity problems (NCP) with $P_0$-functions. It can be regarded as a combination of a smoothing trust region method with an ODE-based method and a line search technique. One feature of the proposed method is that only one linear system is solved per iteration to obtain a trial step, which avoids solving a trust region subproblem. Another is that when a trial step is not accepted, the method does not re-solve the linear system; instead it generates a new iterate whose step length is determined by a line search. Under some conditions, the method is proven to be globally and superlinearly convergent. Preliminary numerical results indicate that the proposed method is promising.
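
The iteration pattern described in this abstract (one linear solve per iteration, with a line search along the same direction when the trial step is rejected) can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a smoothed Fischer-Burmeister reformulation of the NCP and a Levenberg-Marquardt-style regularized linear system as a stand-in for the ODE-based step; the function names and the parameters `mu`, `lam`, and `eta` are hypothetical.

```python
import numpy as np

def fb_smooth(a, b, mu):
    # Smoothed Fischer-Burmeister function; for mu = 0 it vanishes
    # exactly when a >= 0, b >= 0 and a * b = 0.
    return a + b - np.sqrt(a * a + b * b + mu * mu)

def solve_ncp(F, JF, x0, mu=1.0, lam=1.0, eta=0.1, tol=1e-8, max_iter=200):
    """Illustrative hybrid iteration for: x >= 0, F(x) >= 0, x^T F(x) = 0."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        b = F(x)
        # Stop on the unsmoothed residual.
        if np.linalg.norm(fb_smooth(x, b, 0.0)) < tol:
            break
        r = np.sqrt(x * x + b * b + mu * mu)
        H = x + b - r
        # Jacobian of the smoothed system with respect to x.
        J = np.diag(1.0 - x / r) + (1.0 - b / r)[:, None] * JF(x)
        g = J.T @ H
        # The only linear solve in this iteration (assumed regularized form).
        d = np.linalg.solve(J.T @ J + lam * np.eye(n), -g)

        def merit(z, mu=mu):
            return 0.5 * np.sum(fb_smooth(z, F(z), mu) ** 2)

        f0 = merit(x)
        # Reduction predicted by the model 0.5 * ||H + J d||^2.
        pred = -(g @ d) - 0.5 * np.sum((J @ d) ** 2)
        rho = (f0 - merit(x + d)) / pred if pred > 0 else -1.0
        if rho >= eta:
            # Trial step accepted: take it and relax the regularization.
            x = x + d
            lam = max(0.5 * lam, 1e-12)
        else:
            # Trial step rejected: reuse d with an Armijo backtracking
            # line search instead of solving a new linear system.
            t = 0.5
            while t > 1e-12 and merit(x + t * d) > f0 + 1e-4 * t * (g @ d):
                t *= 0.5
            x = x + t * d
            lam *= 4.0
        mu = max(0.5 * mu, 1e-12)  # drive the smoothing parameter down
    return x

# Small linear complementarity example with F(x) = M x + q.
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
x_star = solve_ncp(lambda z: M @ z + q, lambda z: M, [1.0, 1.0])
```

Because `M` is positive definite, this example has a unique solution, which can be checked by hand: $x = (0.5, 0)$ with $F(x) = (0, 1.5)$.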

Punching Motion Generation using Reinforcement Learning and Trajectory Search Method (경로 탐색 기법과 강화학습을 사용한 주먹 지르기동작 생성 기법)

  • Park, Hyun-Jun;Choi, WeDong;Jang, Seung-Ho;Hong, Jeong-Mo
    • Journal of Korea Multimedia Society, v.21 no.8, pp.969-981, 2018
  • Recent advances in machine learning approaches such as deep neural networks and reinforcement learning offer significant performance improvements in generating detailed and varied motions in physically simulated virtual environments. These optimization methods are attractive because they require less understanding of the underlying physics or mechanisms, even for subtle high-dimensional control problems. In this paper, we propose an efficient learning method for stochastic policies represented as deep neural networks, so that an agent can generate various energetic motions that adapt to changes in tasks and states without losing interactivity or robustness. This strategy is realized by our novel trajectory search method, which is motivated by the trust region policy optimization method. Our value-based trajectory smoothing technique finds stably learnable trajectories without consulting neural network responses directly. The resulting policy is set as a trust region of the artificial neural network, so that the network can learn the desired motion quickly.
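
For context, the trust region policy optimization method that this abstract cites as its motivation is usually stated as the constrained problem below. This is the standard formulation, not the trajectory search method proposed in the paper; the symbols $\theta$, $\theta_{\text{old}}$, $A$, and $\delta$ are the usual notation and do not appear in the abstract.

$$
\max_{\theta}\; \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}\!\left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}\, A^{\pi_{\theta_{\text{old}}}}(s,a) \right]
\quad \text{subject to} \quad
\mathbb{E}_{s}\!\left[ D_{\mathrm{KL}}\!\left( \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s) \right) \right] \le \delta
$$

Here $A$ is the advantage function under the old policy and $\delta$ is the trust region radius, which bounds how far the updated policy may move from the old one in a single step.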