• Title/Abstract/Keywords: smoothing trust region method

Search results: 2 (processing time: 0.017 s)

A HYBRID METHOD FOR NCP WITH $P_0$ FUNCTIONS

  • Zhou, Qian;Ou, Yi-Gui
    • Journal of applied mathematics & informatics
    • Vol. 29, No. 3-4
    • pp. 653-668
    • 2011
  • This paper presents a new hybrid method for solving nonlinear complementarity problems (NCPs) with $P_0$-functions. It can be regarded as a combination of a smoothing trust region method with an ODE-based method and a line search technique. One feature of the proposed method is that at each iteration a linear system is solved only once to obtain a trial step, thus avoiding the solution of a trust region subproblem. Another is that when a trial step is not accepted, the method does not re-solve the linear system but instead generates a new iterate whose step length is determined by a line search. Under suitable conditions, the method is proven to be globally and superlinearly convergent. Preliminary numerical results indicate that the proposed method is promising.
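
The abstract above describes the hybrid iteration only in outline. The following Python sketch illustrates the general idea of such a scheme, assuming a smoothed Fischer-Burmeister reformulation of the NCP, a finite-difference Jacobian, a Gauss-Newton predicted-reduction test, and an Armijo backtracking line search; these choices, the parameter schedules, and the toy test problem are illustrative assumptions and not the algorithm of the paper itself.

```python
# Hedged sketch only, not the algorithm from the paper above: a generic
# smoothing Newton scheme for an NCP that (i) solves one linear system per
# iteration to get a trial step and (ii) falls back to an Armijo line search
# along that same step when a trust-region-style acceptance test fails,
# instead of re-solving the system.
import numpy as np

def fb_smooth(a, b, mu):
    # smoothed Fischer-Burmeister function: phi_mu(a,b) = a + b - sqrt(a^2 + b^2 + 2 mu^2)
    return a + b - np.sqrt(a * a + b * b + 2.0 * mu * mu)

def H(x, F, mu):
    # smoothed reformulation of the NCP: H_mu(x) = phi_mu(x, F(x)) componentwise
    return fb_smooth(x, F(x), mu)

def jacobian(x, F, mu, eps=1e-7):
    # forward-difference Jacobian of H_mu (an analytic Jacobian would be used in practice)
    n, h0 = x.size, H(x, F, mu)
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (H(x + e, F, mu) - h0) / eps
    return J

def hybrid_smoothing_solve(F, x0, mu=1.0, delta=1.0, tol=1e-8, max_iter=200):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        h = H(x, F, mu)
        merit = 0.5 * (h @ h)
        if np.linalg.norm(h) < tol and mu < tol:
            break
        J = jacobian(x, F, mu)
        d = np.linalg.solve(J, -h)                 # the single linear solve of this iteration
        if np.linalg.norm(d) > delta:              # keep the trial step inside the radius
            d *= delta / np.linalg.norm(d)
        pred = merit - 0.5 * np.linalg.norm(h + J @ d) ** 2   # Gauss-Newton predicted reduction
        h_trial = H(x + d, F, mu)
        ared = merit - 0.5 * (h_trial @ h_trial)               # actual reduction
        if pred > 0.0 and ared >= 0.1 * pred:      # trust-region-style acceptance test
            x = x + d
            delta = min(2.0 * delta, 1e3)
        else:                                      # rejected: Armijo line search, no re-solve
            t, slope = 1.0, h @ (J @ d)            # slope < 0: d is a descent direction for the merit
            while t > 1e-12:
                h_t = H(x + t * d, F, mu)
                if 0.5 * (h_t @ h_t) <= merit + 1e-4 * t * slope:
                    break
                t *= 0.5
            x = x + t * d
            delta = max(0.5 * delta, 1e-6)
        mu *= 0.5                                  # drive the smoothing parameter toward zero
    return x

# toy linear complementarity problem F(x) = M x + q with a positive definite (hence P_0) M
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, -1.0])
x_star = hybrid_smoothing_solve(lambda x: M @ x + q, np.ones(2))
print("x* =", x_star, " F(x*) =", M @ x_star + q)
```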

경로 탐색 기법과 강화학습을 사용한 주먹 지르기동작 생성 기법 (Punching Motion Generation using Reinforcement Learning and Trajectory Search Method)

  • 박현준;최위동;장승호;홍정모
    • 한국멀티미디어학회논문지
    • Vol. 21, No. 8
    • pp. 969-981
    • 2018
  • Recent advances in machine learning approaches such as deep neural networks and reinforcement learning offer significant performance improvements in generating detailed and varied motions in physically simulated virtual environments. These optimization methods are highly attractive because they require less understanding of the underlying physics or mechanisms, even for high-dimensional, subtle control problems. In this paper, we propose an efficient learning method for a stochastic policy represented as a deep neural network, so that an agent can generate various energetic motions adaptively to changes of tasks and states without losing interactivity and robustness. This strategy is realized by our novel trajectory search method, which is motivated by the trust region policy optimization method. Our value-based trajectory smoothing technique finds stably learnable trajectories without consulting the neural network's responses directly. The result is used as a trust region for the artificial neural network policy, so that it can learn the desired motion quickly.
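
The second abstract cites the trust region policy optimization method only as motivation. The sketch below illustrates that generic trust-region idea for a stochastic policy: take an ascent step on a surrogate objective and accept it only if the new policy stays within a KL-divergence ball around the old one. The toy categorical policy, the assumed advantage estimates, the plain-gradient direction (TRPO itself uses a natural-gradient step), and the KL threshold are all illustrative assumptions, not the paper's value-based trajectory-smoothing technique.

```python
# Hedged, generic illustration of the KL "trust region" idea from trust region
# policy optimization (TRPO), cited as motivation in the abstract above.
# This is NOT the authors' value-based trajectory-smoothing method.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    # KL divergence KL(p || q) between two categorical distributions
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
n_actions = 4
theta_old = rng.normal(size=n_actions)        # logits of the current policy
pi_old = softmax(theta_old)

# advantage estimates per action, as if computed from simulated trajectories (assumed values)
advantages = np.array([1.0, -0.5, 0.2, -0.8])

# gradient (w.r.t. the logits) of the surrogate objective
# L(theta) = E_{a ~ pi_old}[ pi_theta(a) / pi_old(a) * A(a) ] = pi_theta . A
grad = pi_old * (advantages - pi_old @ advantages)

delta_kl = 0.01                               # trust-region radius measured in KL divergence
alpha, L_old = 1.0, pi_old @ advantages
for _ in range(30):                           # backtracking line search on the step size
    pi_new = softmax(theta_old + alpha * grad)
    if pi_new @ advantages > L_old and kl(pi_old, pi_new) <= delta_kl:
        break                                 # surrogate improved and step stays in the trust region
    alpha *= 0.5

theta_new = theta_old + alpha * grad
print("accepted step size:", alpha)
print("old policy:", pi_old, " new policy:", softmax(theta_new))
```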