• Title/Summary/Keyword: Reward Policy


Antecedents of Empowerment: A Comparative Study by Occupations of National University Hospital Employees (임파워먼트의 선행요인: 국립대 병원근로자들의 직종별 비교 연구)

  • Yoon Bang Seob;Seo Young Joon
    • Health Policy and Management
    • /
    • v.15 no.1
    • /
    • pp.1-29
    • /
    • 2005
  • This study examined the antecedents of psychological empowerment in hospital organizations and their differential effects among occupational groups within hospitals: doctors, nurses, engineers, and administrative workers. Variables at multiple levels were examined as antecedents: tenure, income, work centrality, and career goal as personal factors; job variety, job clarity, job significance, and job fitness as job factors; and security, reward justice, and organizational support as organizational factors. Data were collected from 8 national university hospitals, and 1,289 cases were used in the final analysis. For the sample as a whole, all antecedents except reward justice had significant effects on empowerment and explained a large amount of its variance. The analysis for each occupational group showed that income, career goal, and job significance had significant effects on empowerment in all occupational groups, while reward justice had none in any group. The effects of the other variables depended on the occupational group. This study found some important antecedents of empowerment that have received less attention in previous research: career goal, work centrality, security, and organizational support. The finding that the effects of antecedents on empowerment differ by occupational group suggests that group characteristics should be considered when studying empowerment. In this study, for example, personal factors were more effective than job factors and organizational factors for empowerment in the engineering group, whose job is relatively simple and clear, while job factors were most effective in the other groups. The differential effects of antecedents by occupational group also have practical implications for improving empowerment at hospitals. Personnel management efforts aimed at empowerment would be needed more for administrative workers than for other occupational groups, because they perceived the least job clarity, job significance, and job fitness among the groups, all of which were found to be important determinants of empowerment for them.

PSYCHOLOGICAL ANALYSIS AND RESEARCH ON WESTERNERS EFFECT IN HUMANISTIC GAME (인본주의 게임에서의 데시효과(Westerners effect)를 활용하는 심리적 측면에서 분석 및 연구)

  • Li, Xuan-Xin
    • Journal of Digital Convergence
    • /
    • v.18 no.1
    • /
    • pp.295-300
    • /
    • 2020
  • The psychologist Deci argued that excessive rewards may reduce an individual's interest in an activity and their intrinsic motivation. This is known as the Deci effect (rendered as the "Westerners effect" in the paper's English title). In this paper, we examine games from the point of view of humanism and the Deci effect. Finally, three reasonable reward schemes are proposed. Future research will apply the theory of this paper to set the reward given after a task is completed in a game, collect players' reports on their emotional feedback from the finished game, and use them to further refine the theory.

Speech enhancement based on reinforcement learning (강화학습 기반의 음성향상기법)

  • Park, Tae-Jun;Chang, Joon-Hyuk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.05a
    • /
    • pp.335-337
    • /
    • 2018
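  • Speech enhancement removes noise and reverberation from speech. Because speech signals captured by a microphone are distorted by noise and reverberation, it is a core technology for speech signal processing applications such as speech recognition and speech communication. Previously, statistical-model-based speech enhancement, which exploits statistical information about the speech and noise signals, was mainly used, but its performance degrades substantially in non-stationary noise environments, unlike in stationary ones. Recently, deep neural networks (DNNs), a machine learning technique, have been introduced and show excellent performance in speech enhancement. DNN-based speech enhancement uses multiple hidden layers and hidden nodes to model the nonlinear relationship between noisy speech signals and clean speech signals. We apply reinforcement learning, one way to improve DNN-based speech enhancement, and improve performance over the existing DNN. Reinforcement learning, best known for its use in Google's AlphaGo, is a method that learns, over a very large number of cases, which action to take under a policy in a given state in order to receive the highest reward and move to the next state, so that the optimal action can be selected. In this paper, we design the reward based on a composite measure and improve speech recognition performance compared with the existing technique whose reward is based on PESQ (Perceptual Evaluation of Speech Quality).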

Designing an Efficient Reward Function for Robot Reinforcement Learning of The Water Bottle Flipping Task (보틀플리핑의 로봇 강화학습을 위한 효과적인 보상 함수의 설계)

  • Yang, Young-Ha;Lee, Sang-Hyeok;Lee, Cheol-Soo
    • The Journal of Korea Robotics Society
    • /
    • v.14 no.2
    • /
    • pp.81-86
    • /
    • 2019
  • Robots are used at various industrial sites, but traditional methods of operating a robot are limited for some kinds of tasks. For a robot to accomplish a task, an accurate formulation of the relationship between the robot and its environment must be found and solved, which is complicated work. Accordingly, reinforcement learning for robots is being actively studied to overcome this difficulty. This study describes the process and results of applying reinforcement learning to such a task. The task the robot learns is bottle flipping, an activity that involves throwing a plastic bottle in an attempt to land it upright on its bottom. The complexity of the liquid's movement inside the bottle when it is thrown into the air makes this task difficult to solve in traditional ways; reinforcement learning makes it easier. After a 3-DOF robotic arm is instructed how to throw the bottle, the robot finds a better motion that succeeds at the task. Two reward functions are designed and their learning results are compared. The finite difference method is used to obtain the policy gradient. This paper focuses on the process of designing an efficient reward function to improve the bottle flipping motion.
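
The abstract above mentions that a finite difference method is used to obtain the policy gradient. As a rough illustration only, the sketch below estimates a gradient by central differences and follows it uphill. The reward function `toy_reward`, its peak at (1.0, -2.0), and all parameter values are hypothetical stand-ins, not the paper's reward functions or robot model.

```python
def finite_difference_gradient(reward_fn, theta, eps=1e-3):
    """Estimate d(reward)/d(theta) by central differences, one parameter at a time."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((reward_fn(up) - reward_fn(down)) / (2 * eps))
    return grad


def toy_reward(theta):
    # Hypothetical stand-in for a rollout's return: highest at theta = (1.0, -2.0).
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


def gradient_ascent(reward_fn, theta, lr=0.1, steps=200):
    """Repeatedly move the policy parameters along the estimated gradient."""
    theta = list(theta)
    for _ in range(steps):
        grad = finite_difference_gradient(reward_fn, theta)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(gradient_ascent(toy_reward, [0.0, 0.0]))
```

On a real robot each call to the reward function would be a noisy physical trial, so in practice several rollouts per perturbation are usually averaged; that detail is omitted here.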

Autonomous and Asynchronous Triggered Agent Exploratory Path-planning Via a Terrain Clutter-index using Reinforcement Learning

  • Kim, Min-Suk;Kim, Hwankuk
    • Journal of information and communication convergence engineering
    • /
    • v.20 no.3
    • /
    • pp.181-188
    • /
    • 2022
  • An intelligent distributed multi-agent system (IDMS) using reinforcement learning (RL) is a challenging and intricate problem in which single or multiple agent(s) aim to achieve their specific goals (sub-goal and final goal), where they move their states in a complex and cluttered environment. The environment provided by the IDMS provides a cumulative optimal reward for each action based on the policy of the learning process. Most actions involve interacting with a given IDMS environment; therefore, it can provide the following elements: a starting agent state, multiple obstacles, agent goals, and a cluttered index. The reward in the environment is also reflected by RL-based agents, in which agents can move randomly or intelligently to reach their respective goals, to improve the agent learning performance. We extend different cases of intelligent multi-agent systems from our previous works: (a) a proposed environment-clutter-based-index for agent sub-goal selection and analysis of its effect, and (b) a newly proposed RL reward scheme based on the environmental clutter-index to identify and analyze the prerequisites and conditions for improving the overall system.

Determination of Ship Collision Avoidance Path using Deep Deterministic Policy Gradient Algorithm (심층 결정론적 정책 경사법을 이용한 선박 충돌 회피 경로 결정)

  • Kim, Dong-Ham;Lee, Sung-Uk;Nam, Jong-Ho;Furukawa, Yoshitaka
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.56 no.1
    • /
    • pp.58-65
    • /
    • 2019
  • With the recent surge of interest in autonomous ships, the stability, reliability, and efficiency of smart ships have become important issues. An automatic collision avoidance system is an essential function of an autonomous ship. This system detects the possibility of collision and automatically takes avoidance actions in consideration of economy and safety. To construct an automatic collision avoidance system using reinforcement learning, this work mathematically formulates the sequential decision problem of ship collision as a Markov Decision Process (MDP). A reinforcement learning environment is constructed based on the ship maneuvering equations, and then the three key components of the MDP (state, action, and reward) are defined. The state uses parameters of the relationship between own-ship and target-ship, the action is the vertical distance away from the target course, and the reward is defined as a function considering safety and economics. To solve the sequential decision problem, the Deep Deterministic Policy Gradient (DDPG) algorithm, which can represent a continuous action space and search for an optimal action policy, is used. The collision avoidance system is then tested in a $90^{\circ}$ intersection encounter situation and yields a satisfactory result.

Q-Learning Policy Design to Speed Up Agent Training (에이전트 학습 속도 향상을 위한 Q-Learning 정책 설계)

  • Yong, Sung-jung;Park, Hyo-gyeong;You, Yeon-hwi;Moon, Il-young
    • Journal of Practical Engineering Education
    • /
    • v.14 no.1
    • /
    • pp.219-224
    • /
    • 2022
  • Q-Learning is a technique widely used as a basic algorithm for reinforcement learning. Q-Learning trains the agent in the direction of maximizing the reward through the greedy action that selects the largest value among the rewards of the actions that can be taken in the current state. In this paper, we studied a policy that can speed up agent training using Q-Learning in Frozen Lake 8×8 grid environment. In addition, the training results of the existing algorithm of Q-learning and the algorithm that gave the attribute 'direction' to agent movement were compared. As a result, it was analyzed that the Q-Learning policy proposed in this paper can significantly increase both the accuracy and training speed compared to the general algorithm.
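
For readers unfamiliar with the update the abstract above refers to, the following is a minimal tabular Q-learning sketch with epsilon-greedy action selection. It runs on a hypothetical 1-D corridor rather than the Frozen Lake 8×8 grid, and it does not include the paper's 'direction' attribute; the environment, hyperparameters, and function names are all assumptions made for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor (not Frozen Lake).

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                best = max(q[s])  # greedy action, ties broken at random
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
            r = 1.0 if s_next == goal else 0.0
            # Bootstrapped target: no future value once the goal is reached.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if s == goal:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The greedy step is the `max(q[s])` selection; the epsilon branch is a common addition that keeps the agent exploring and is not something the abstract specifies.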

Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-Agent Markov Decision Processes

  • Chang, Hyeong-Soo
    • International Journal of Control, Automation, and Systems
    • /
    • v.1 no.3
    • /
    • pp.358-367
    • /
    • 2003
  • We consider discrete-time factorial Markov Decision Processes (MDPs) in multiple decision-makers environment for infinite horizon average reward criterion with a general joint reward structure but a factorial joint state transition structure. We introduce the "localization" concept that a global MDP is localized for each agent such that each agent needs to consider a local MDP defined only with its own state and action spaces. Based on that, we present a gradient-ascent like iterative distributed algorithm that converges to a local optimal solution of the global MDP. The solution is an autonomous joint policy in that each agent's decision is based on only its local state.

The Effect of the Youth perceived importance of Entrepreneurship Education on the self-leadership strategy (청년층의 창업교육 인지도가 셀프리더십 전략수준에 미치는 영향)

  • Kim, Yeon-Jeong
    • Journal of Digital Convergence
    • /
    • v.12 no.12
    • /
    • pp.77-85
    • /
    • 2014
  • This study investigates the effect of young people's perceived importance of entrepreneurship education programs on self-leadership strategy. The findings are as follows. First, the creativity category of entrepreneurship education was positively related to behavior-focused self-leadership, natural reward self-leadership strategy, and constructive thought self-leadership. Second, the management category of entrepreneurship education was positively related to natural reward self-leadership strategy. Third, the patent category of entrepreneurship education was positively related to behavior-focused self-leadership and constructive thought self-leadership. Consequently, when young people recognized the importance of the management, creativity, and patent categories, their levels of self-control, self-reward, and self-efficacy increased.

A effects of behavior type of dance art instructors in elementary on class satisfaction (무용 예술강사의 교수행동유형이 초등 수업만족도에 미치는 영향)

  • Woo, Jung-Wook;Lee, Eun-Joo
    • Journal of Digital Convergence
    • /
    • v.18 no.2
    • /
    • pp.455-462
    • /
    • 2020
  • The purpose of this study was to investigate the effect of the behavior types of dance art instructors in elementary dance education on class satisfaction, focusing on the mediating roles of flow and perceived competence for the commanding and positive reward types. A total of 453 questionnaires were analyzed. SPSS version 18.0 was used for the data analysis, together with the serial double mediation model proposed by Hayes and a bootstrapping method. First, the instructor's commanding type and positive reward type had statistically significant positive effects on class satisfaction. Second, the indirect effects of the instructor's commanding type and positive reward type on class satisfaction through flow were positive and statistically significant. Third, the indirect effects of the instructor's commanding type and reward type on class satisfaction through perceived competence were not statistically significant. Lastly, the indirect effects of the instructor's commanding type and positive reward type on class satisfaction through flow and perceived competence were positive and statistically significant.