• Title/Summary/Keyword: Reward calculation

Search Result 9, Processing Time 0.018 seconds

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm (스마트 제어알고리즘 개발을 위한 강화학습 리워드 설계)

  • Kim, Hyun-Su; Yoon, Ki-Yong
    • Journal of Korean Association for Spatial Structures / v.22 no.2 / pp.39-46 / 2022
  • Machine learning has recently been widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to the development of a control algorithm for a smart control device that reduces seismic responses. For this purpose, Deep Q-network (DQN), a reinforcement learning algorithm, was employed to develop the control algorithm. A single-degree-of-freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as the example structure. The smart TMD system used an MR (magnetorheological) damper instead of a passive damper. The reward design of reinforcement learning strongly affects the control performance of the smart TMD. Various hyper-parameters were investigated to optimize the control performance of the DQN-based control algorithm. A smaller time step for numerical simulation is usually desirable because it increases the accuracy of the simulation results. However, the numerical simulation results showed that decreasing the time step for reward calculation might degrade the control performance of the DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in the DQN training process.
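
The abstract above distinguishes the time step used for numerical simulation from the time step used for reward calculation. The following is a minimal sketch of keeping the two separate, not the authors' implementation: the structural parameters, the uncontrolled SDOF model, the function names, and the reward definition (negative peak absolute displacement per reward interval) are all assumptions made for illustration.

```python
import math


def simulate_sdof(force, dt, steps, mass=1.0, stiffness=39.48, damping=0.25):
    """Integrate m*x'' + c*x' + k*x = f(t) with semi-implicit Euler.

    All parameter values are hypothetical; no control device is modeled.
    Returns the displacement at every simulation step.
    """
    x, v = 0.0, 0.0
    history = []
    for i in range(steps):
        a = (force(i * dt) - damping * v - stiffness * x) / mass
        v += a * dt   # update velocity first (semi-implicit)
        x += v * dt   # then displacement with the new velocity
        history.append(x)
    return history


def interval_rewards(displacements, steps_per_reward):
    """Return one reward per reward interval.

    Assumed reward: the negative peak |displacement| inside the interval,
    so smaller structural responses earn higher rewards.
    """
    rewards = []
    for start in range(0, len(displacements), steps_per_reward):
        window = displacements[start:start + steps_per_reward]
        rewards.append(-max(abs(x) for x in window))
    return rewards


# Fine simulation step (0.001 s), coarser reward step (0.02 s = 20 sim steps).
response = simulate_sdof(lambda t: math.sin(2.0 * math.pi * t), dt=0.001, steps=1000)
rewards = interval_rewards(response, steps_per_reward=20)
```

Changing `steps_per_reward` alters how often the agent would receive feedback without touching the accuracy of the underlying simulation, which is the trade-off the abstract describes.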

Employee Engagement and Motivation as Mediators between the Linkage of Reward with Employee Performance

  • SISWANTO, Siswanto; MAULIDIYAH, Zahrotul; MASYHURI, Masyhuri
    • The Journal of Asian Finance, Economics and Business / v.8 no.2 / pp.625-633 / 2021
  • This study analyzes the impact of reward on employee performance through work motivation and employee engagement. Its specific purpose is to investigate the mediating role of employee engagement in the relationship between reward and employee performance. The research sample consists of permanent employees of a manufacturing company in Sukorejo, Pasuruan, Indonesia. The sample size is 150 employees out of a total of 759 workers, calculated using the Slovin formula. Respondents were employees who had worked for at least the last five years. The data consist of employees' answers to the statements submitted. The data were analyzed using partial least squares structural equation modeling. The relationships between variables were additionally tested with the Sobel mediation test. SmartPLS 3.0 was used to help analyze the relationships between variables. The results show that reward does not have a direct influence on employee performance. However, it has a significant positive effect on employee performance through employee engagement. Work motivation, in contrast, does not act as a mediating variable in the effect of reward on employee performance.
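
The abstract above mentions the Slovin formula and the Sobel mediation test without giving their inputs. The sketch below shows the standard forms of both, n = N / (1 + N·e²) and z = ab / sqrt(b²·se_a² + a²·se_b²). The margin of error is not reported in the abstract, so it is back-calculated here from the stated N = 759 and n = 150; the path coefficients passed to `sobel_z` are hypothetical numbers, not results from the study.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Slovin formula: n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)


def implied_margin_of_error(population, sample_size):
    """Invert the Slovin formula to recover e from N and n."""
    return math.sqrt((population / sample_size - 1) / population)


def sobel_z(a, se_a, b, se_b):
    """Sobel test statistic for the indirect effect a*b.

    a, se_a: path from predictor to mediator and its standard error.
    b, se_b: path from mediator to outcome and its standard error.
    """
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


# Margin of error implied by the abstract's N = 759 and n = 150 (about 0.073).
e = implied_margin_of_error(759, 150)

# Hypothetical path estimates, for illustration only.
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

An absolute `z` above roughly 1.96 is conventionally read as a significant indirect effect at the 5% level.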

A Decentralized Task Structure for Cooperative Transportation Missions (협업 수송 임무을 위한 분산 임무 구조)

  • Kim, Keum-Seong; Choi, Han-Lim
    • The Journal of Korea Robotics Society / v.10 no.3 / pp.133-138 / 2015
  • This paper presents a modified task structure for the coupled-constraints consensus-based bundle algorithm, specifically to solve the cooperative transportation problem. The cooperative transportation mission has various types of constraints. A modified framework is suggested that generates activities and subtasks to handle the time and task constraints of the transportation mission using the coupled-constraints consensus-based bundle algorithm. Modifications to the task structure, reward function, and arrival time calculation are suggested to handle the constraints of the cooperative transportation mission.

Analysis of Laminar Flow and Heat Transfer in Asymmetric, Sudden Expansion Channel (비대칭급확대채널의 층류유동 및 열전달 해석)

  • Won, Seung-Ho; Maeng, Joo-Sung; Son, Byung-Jin
    • The Magazine of the Society of Air-Conditioning and Refrigerating Engineers of Korea / v.13 no.1 / pp.5-13 / 1984
  • This numerical analysis predicts laminar flow and heat transfer for two-dimensional, steady flow in an asymmetric sudden expansion channel. In earlier studies, the full Navier-Stokes equations were used to analyze flows with separation, but this approach presents many difficulties, and although significant progress has been made in the development of efficient computational methods for the Navier-Stokes equations, very large computation times are still required. In the case of reward-facing flow, the boundary-layer equations are used instead of the full Navier-Stokes equations to analyze the velocity fields, and the results of this numerical analysis are in good agreement with the given experimental study. In this case, since the computer time required for the boundary-layer calculation is an order of magnitude less than that required for the solution of the full Navier-Stokes equations, this boundary-layer model provides a good approximate solution.

COMPENSATION STRUCTURE AND CONTINGENCY ALLOCATION IN INTEGRATED PROJECT DELIVERY SYSTEMS

  • Mei Liu; F. H. (Bud) Griffis; Andrew Bates
    • International conference on construction engineering and project management / 2013.01a / pp.338-343 / 2013
  • Integrated Project Delivery (IPD) as a delivery method fully capitalizes on an integrated project team that takes advantage of the knowledge of all team members to maximize project outcomes. IPD is currently the highest form of collaboration available because all three core project stakeholders (owner, designer, and contractor) are aligned to the same purpose. Compared with traditional project delivery approaches such as Design-Bid-Build (DBB), Design-Build (DB), and CM at-Risk, IPD is distinguished in that it eliminates the adversarial nature of the business by encouraging transparency, open communication, honesty and collaboration among all project stakeholders. The team appropriately shares the project risk and reward. Sharing reward is easy, while it is hard to fairly share a failure. Consequently, the compensation structure and the contingency in IPD are very different from those in traditional delivery methods, and they are expected to encourage motivation, inspiration and creativity among all project stakeholders to achieve project success. This paper investigates the compensation structure in IPD and provides a method to determine the proper level of contingency allocation to reduce the risk of cost overrun. It also proposes a method in which contingency could be used as a functional monetary incentive when established to produce the desired level of collaboration in IPD. Based on the compensation structure scenario discovered, a probabilistic contingency calculation model was created by evaluating the random nature of changes and various risk drivers. The model can be used by the IPD team to forecast the probability of cost overrun and to give the team the confidence to enjoy the benefits of collaborative teamwork.

The Effect of Segment Size on Quality Selection in DQN-based Video Streaming Services (DQN 기반 비디오 스트리밍 서비스에서 세그먼트 크기가 품질 선택에 미치는 영향)

  • Kim, ISeul; Lim, Kyungshik
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1182-1194 / 2018
  • Dynamic Adaptive Streaming over HTTP (DASH) is expected to evolve to meet the increasing demand for seamless video streaming services in the near future. DASH performance depends heavily on the client's adaptive quality selection algorithm, which is not included in the standard. Existing conventional algorithms are based on procedural logic that cannot easily capture and reflect all variations of dynamic network and traffic conditions across a variety of network environments. To solve this problem, this paper proposes a novel quality selection mechanism based on the Deep Q-Network (DQN) model, the DQN-based DASH Adaptive Bitrate (ABR) mechanism. The proposed mechanism adopts a new reward calculation method based on five major performance metrics to reflect the current conditions of networks and devices in real time. In addition, the size of the next video segment to be downloaded is considered as a major learning metric to reflect a variety of video encodings. Experimental results show that the proposed mechanism quickly selects a suitable video quality even in high-error-rate environments, significantly reducing the frequency of quality changes compared to the existing algorithm while improving average video quality during playback.

Calculation of Distribution Network Charging for DG Embedded Distribution System (분산전원 투입을 고려한 배전망 이용요금 산정에 관한 연구)

  • Hwang, Seok-Hyun; Kim, Mun-Kyeom; Park, Jong-Keun
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.4 / pp.513-521 / 2012
  • With the advent of the smart grid, distribution network charges have become one of the keystones of ongoing deregulation and privatization in the power industry. This paper proposes a new charging methodology that allocates the existing distribution network cost with the aim of reflecting the true costs and benefits of network customers, especially distributed generation (DG). The proposed charging methodology separates distribution network costs according to the respective real and reactive power flows. The costs are then allocated to network users according to separate charges for the line capacity actually used and for the available capacity. This distribution network charging model provides economic signals that reward network users who contribute to better power factors, while penalizing customers who worsen power factors. The proposed method is demonstrated on the IEEE 37-bus distribution system, and the results are validated through comparison with the MW-Miles and MVA-Miles methods. The charges derived from the proposed method can provide appropriate incentives and penalties that lead network customers to behave in a manner that improves the network condition.

The effect of rewards on developing right user attitudes of elementary school children (보상이 초등학생의 게임 사용 습관에 미치는 영향)

  • Kim, Young-Joo; Kim, Hea Jin; Lee, Jung-Nyun; Whang, Mincheol
    • Journal of Korea Game Society / v.17 no.2 / pp.27-34 / 2017
  • This study aims to foster appropriate attitudes toward internet and smartphone use by providing rewards. Four reward conditions were used (no compensation, praise card, achievement sticker, and cash), and their effects on user behavior were statistically tested. Twenty-four children in grades four through six participated in the study. The task was a mathematical calculation game. Subjective satisfaction with the reward and heart response during the game task were measured. The results showed inactivation of the sympathetic and parasympathetic systems in the no-compensation condition and activation in the praise card condition. The praise card was therefore associated with greater commitment and satisfaction than the other rewards. The difference in subjective satisfaction between no compensation and compensation was significant, but the differences among the compensation types were not.

Variations in Neural Correlates of Human Decision Making - a Case of Book Recommender Systems

  • Naveen Z. Quazilbash; Zaheeruddin Asif; Saman Rizvi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.775-793 / 2023
  • Human decision-making is a complex behavior. Replicating human decision-making offers the potential to enhance the capacity of intelligent systems by providing additional user assistance in decision-making. By reducing effort and task complexity on behalf of the user, such replication would improve the overall user experience and affect the degree of intelligence exhibited by the system. This paper explores individuals' decision-making processes when using recommender systems, and their related outcomes. In this study, human decision-making (HDM) refers to the selection of an item from a given set of options that are shown as recommendations to a user. The goal of the study was to identify IS constructs that contribute to such decision-making, thereby contributing to a mental model of HDM. This was achieved by recording electroencephalographic (EEG) readings of subjects while they performed a decision-making activity. Readings from 16 right-handed, healthy avid readers indicate that reward, theory of mind, risk, calculation, task intention, emotion, sense of touch, ambiguity and decision-making are the primary constructs that users employ while deciding from a given set of recommendations in an online bookstore. In all, 10 distinct brain areas were identified. The brain areas associated with these constructs were found to be the cingulate gyrus, precentral gyrus, inferior parietal lobule, posterior cingulate, medial frontal gyrus, anterior cingulate, postcentral gyrus, superior frontal gyrus, inferior frontal gyrus, and middle frontal gyrus (also referred to as the dorsolateral prefrontal cortex (DLPFC)). The identified constructs would help in developing a design theory for enhancing user assistance, especially in the context of recommender systems.