• Title/Abstract/Keywords: Reward calculation

Search results: 9 items (processing time: 0.026 s)

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm

  • 김현수;윤기용
    • 한국공간구조학회논문집 / Vol. 22, No. 2 / pp. 39-46 / 2022
  • Machine learning has recently been widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to the development of a control algorithm for a smart control device that reduces seismic responses. For this purpose, the Deep Q-network (DQN), a reinforcement learning algorithm, was employed to develop the control algorithm. A single-degree-of-freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as the example structure. The smart TMD system used a magnetorheological (MR) damper instead of a passive damper. The reward design in reinforcement learning strongly affects the control performance of the smart TMD. Various hyperparameters were investigated to optimize the control performance of the DQN-based control algorithm. Decreasing the time step of a numerical simulation is usually desirable because it increases the accuracy of the results. However, the numerical simulations showed that decreasing the time step for reward calculation might degrade the control performance of the DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in the DQN training process.
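
The distinction this abstract draws between the simulation time step and the reward time step can be illustrated with a minimal Python sketch. It assumes a hypothetical reward, the negative peak absolute displacement within each reward window. The paper's actual reward function and time steps are not stated in the abstract, and `windowed_rewards` and `steps_per_reward` are illustrative names only.

```python
def windowed_rewards(displacements, steps_per_reward):
    """Return one reward per window of `steps_per_reward` simulation steps.

    Assumed reward form (not from the paper): the negative of the peak
    absolute displacement observed inside the window.
    """
    if steps_per_reward < 1:
        raise ValueError("steps_per_reward must be >= 1")
    rewards = []
    for start in range(0, len(displacements), steps_per_reward):
        window = displacements[start:start + steps_per_reward]
        rewards.append(-max(abs(x) for x in window))
    return rewards


# Hypothetical structural response sampled at the simulation time step.
response = [0.0, 0.2, -0.5, 0.3, 0.1, -0.1]

# Reward evaluated at every simulation step (fine reward time step).
fine = windowed_rewards(response, 1)

# Reward evaluated once every three simulation steps (coarse reward time step).
coarse = windowed_rewards(response, 3)
```

With a coarser reward step, each reward summarizes the response over a longer interval, so the agent receives fewer but more aggregated signals. Whether that helps or hurts training is the empirical question the paper addresses.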

Employee Engagement and Motivation as Mediators between the Linkage of Reward with Employee Performance

  • SISWANTO, Siswanto;MAULIDIYAH, Zahrotul;MASYHURI, Masyhuri
    • The Journal of Asian Finance, Economics and Business / Vol. 8, No. 2 / pp. 625-633 / 2021
  • This study analyzes the impact of reward on employee performance through work motivation and employee engagement. Its specific purpose is to investigate the mediating role of employee engagement in the relationship between reward and employee performance. The sample consists of permanent employees of a manufacturing company in Sukorejo, Pasuruan, Indonesia. The sample size is 150 employees out of a total of 759 workers, determined with the Slovin formula. Respondents were employees who had worked for at least the last five years. The data consist of employees' answers to the statements submitted. The data were analyzed using partial least squares structural equation modeling, supplemented by a Sobel mediation test of the relationships between variables. SmartPLS 3.0 was used for the analysis. The results show that reward does not have a direct influence on employee performance. However, it has a significant positive effect on employee performance through employee engagement. Work motivation, in contrast, does not act as a mediating variable in the effect of reward on employee performance.
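
The Slovin formula mentioned in this abstract is commonly written as n = N / (1 + N e²), where N is the population size and e is the margin of error. The abstract does not state the margin of error used, so the sketch below treats e as an assumption and also shows how to back out the margin implied by a given sample size. The function names are illustrative.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Sample size from Slovin's formula, rounded up to a whole respondent."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


def implied_margin_of_error(population, sample_size):
    """Margin of error e that makes Slovin's formula return `sample_size`."""
    return math.sqrt((population / sample_size - 1) / population)


# Population and sample size reported in the abstract.
N, n = 759, 150

# Margin of error implied by those two numbers under the formula above.
e = implied_margin_of_error(N, n)

# For comparison: sample sizes at two commonly used (assumed) margins.
n_at_5_percent = slovin_sample_size(N, 0.05)
n_at_10_percent = slovin_sample_size(N, 0.10)
```

Under this form of the formula, a sample of 150 from 759 corresponds to a margin of error of roughly 0.073. That figure is an inference from the reported numbers, not a value given by the authors.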

A Decentralized Task Structure for Cooperative Transportation Missions

  • 김금성;최한림
    • 로봇학회논문지 / Vol. 10, No. 3 / pp. 133-138 / 2015
  • This paper presents a modified task structure for the coupled-constraints consensus-based bundle algorithm, specifically to resolve the cooperative transportation problem. The cooperative transportation mission has various types of constraints. A modified framework is suggested that generates activities and subtasks to handle the time and task constraints of the transportation mission using the coupled-constraints consensus-based bundle algorithm. Modifications to the task structure, reward function, and arrival time calculation are proposed to handle the constraints of the cooperative transportation mission.

Analysis of Laminar Flow and Heat Transfer in Asymmetric, Sudden Expansion Channel

  • 원승호;맹주성;손병진
    • 대한설비공학회지:설비저널 / Vol. 13, No. 1 / pp. 5-13 / 1984
  • This numerical analysis predicts two-dimensional, steady laminar flow and heat transfer in an asymmetric sudden expansion channel. Earlier studies used the full Navier-Stokes equations to analyze flows with separation, but this approach involves many difficulties, and although significant progress has been made in developing efficient computational methods for the Navier-Stokes equations, very large computation times are still required. For rearward-facing flow, the boundary-layer equations are used instead of the full Navier-Stokes equations to analyze the velocity field, and the numerical results are in good agreement with the given experimental study. Because the computer time required for the boundary-layer calculation is an order of magnitude less than that required to solve the full Navier-Stokes equations, this boundary-layer model provides a good approximate solution.

COMPENSATION STRUCTURE AND CONTINGENCY ALLOCATION IN INTEGRATED PROJECT DELIVERY SYSTEMS

  • Mei Liu;F. H. (Bud) Griffis;Andrew Bates
    • 국제학술발표논문집 / The 5th International Conference on Construction Engineering and Project Management / pp. 338-343 / 2013
  • Integrated Project Delivery (IPD) is a delivery method that fully capitalizes on an integrated project team, drawing on the knowledge of all team members to maximize project outcomes. IPD is currently the highest form of collaboration available because all three core project stakeholders (owner, designer, and contractor) are aligned to the same purpose. Compared with traditional project delivery approaches such as Design-Bid-Build (DBB), Design-Build (DB), and CM at-Risk, IPD is distinguished in that it eliminates the adversarial nature of the business by encouraging transparency, open communication, honesty, and collaboration among all project stakeholders. The team appropriately shares the project risk and reward. Sharing reward is easy, while fairly sharing a failure is hard. Consequently, the compensation structure and the contingency in IPD are very different from those in traditional delivery methods, and they are expected to encourage the motivation, inspiration, and creativity of all project stakeholders to achieve project success. This paper investigates the compensation structure in IPD and provides a method to determine the proper level of contingency allocation to reduce the risk of cost overrun. It also proposes a method in which contingency could be used as a functional monetary incentive, established to produce the desired level of collaboration in IPD. Based on the compensation structure scenario discovered, a probabilistic contingency calculation model was created by evaluating the random nature of changes and various risk drivers. The model can be used by the IPD team to forecast the probability of cost overrun and gives the team the confidence to enjoy the benefits of collaborative teamwork.

The Effect of Segment Size on Quality Selection in DQN-based Video Streaming Services

  • 김이슬;임경식
    • 한국멀티미디어학회논문지 / Vol. 21, No. 10 / pp. 1182-1194 / 2018
  • Dynamic Adaptive Streaming over HTTP (DASH) is expected to evolve to meet the increasing demand for seamless video streaming services in the near future. DASH performance depends heavily on the client's adaptive quality selection algorithm, which is not included in the standard. Existing conventional algorithms are procedural and cannot easily capture and reflect all variations of dynamic network and traffic conditions across a variety of network environments. To solve this problem, this paper proposes a novel quality selection mechanism based on the Deep Q-Network (DQN) model: the DQN-based DASH Adaptive Bitrate (ABR) mechanism. The proposed mechanism adopts a new reward calculation method based on five major performance metrics to reflect the current conditions of networks and devices in real time. In addition, the size of the consecutive video segment to be downloaded is considered as a major learning metric to reflect a variety of video encodings. Experimental results show that the proposed mechanism quickly selects a suitable video quality even in high-error-rate environments, significantly reducing the frequency of quality changes compared with the existing algorithm while simultaneously improving the average video quality during playback.

Calculation of Distribution Network Charging for DG Embedded Distribution System

  • 황석현;김문겸;박종근
    • 전기학회논문지 / Vol. 61, No. 4 / pp. 513-521 / 2012
  • With the advent of the smart grid, distribution network charges have become one of the keystones of ongoing deregulation and privatization in the power industry. This paper proposes a new charging methodology that allocates the existing distribution network cost so as to reflect the true costs and benefits of network customers, especially distributed generation (DG). The proposed methodology separates distribution network costs according to real and reactive power flows. The costs are then allocated to network users through separate charges for the line capacity actually used and for the available capacity. This charging model provides economic signals that reward network users who contribute to better power factors and penalize customers who worsen them. The proposed method is demonstrated on the IEEE 37-bus distribution system, and the results are validated by comparison with the MW-Miles and MVA-Miles methods. The charges derived from the proposed method can provide appropriate incentives and penalties that lead network customers to behave in a way that improves network conditions.

The effect of rewards on developing right user attitudes of elementary school children

  • 김영주;김혜진;이정년;황민철
    • 한국게임학회 논문지 / Vol. 17, No. 2 / pp. 27-34 / 2017
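  • Rather than directly controlling users' gaming behavior, this paper sought to encourage proper game use by offering appropriate rewards for changes in usage habits. The study verified the effect of the rewards by analyzing their effect on immersion. Twenty-four elementary school students participated in the experiment. The task given to the subjects was an arithmetic game, and the effects of four types of reward given according to the results (no reward, praise card, achievement sticker, and cash) were analyzed. During the arithmetic game, cardiac responses and subjective satisfaction with the rewards were measured. The no-reward condition showed deactivation of the sympathetic and parasympathetic nervous systems, whereas presenting the praise card produced activation of both. That is, the praise card yielded the greatest immersion and satisfaction among the rewards. In subjective satisfaction, however, there was a statistically significant difference between the no-reward condition and the reward conditions, but no difference among the reward types could be confirmed.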

Variations in Neural Correlates of Human Decision Making - a Case of Book Recommender Systems

  • Naveen Z. Quazilbash;Zaheeruddin Asif;Saman Rizvi
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 3 / pp. 775-793 / 2023
  • Human decision-making is a complex behavior. Replicating human decision-making offers the potential to enhance the capacity of intelligent systems by providing additional user assistance in decision-making. By reducing effort and task complexity on behalf of the user, such replication would improve the overall user experience and affect the degree of intelligence exhibited by the system. This paper explores individuals' decision-making processes when using recommender systems, and their related outcomes. In this study, human decision-making (HDM) refers to the selection of an item from a given set of options shown as recommendations to a user. The goal of the study was to identify IS constructs that contribute to such decision-making, thereby helping to create a mental model of HDM. This was achieved by recording electroencephalographic (EEG) readings of subjects while they performed a decision-making activity. Readings from 16 right-handed, healthy, avid readers indicate that reward, theory of mind, risk, calculation, task intention, emotion, sense of touch, ambiguity, and decision-making are the primary constructs that users employ while deciding from a given set of recommendations in an online bookstore. In all, 10 distinct brain areas were identified. The brain areas associated with these constructs were the cingulate gyrus, precentral gyrus, inferior parietal lobule, posterior cingulate, medial frontal gyrus, anterior cingulate, postcentral gyrus, superior frontal gyrus, inferior frontal gyrus, and middle frontal gyrus (also referred to as the dorsolateral prefrontal cortex, DLPFC). The identified constructs would help in developing a design theory for enhancing user assistance, especially in the context of recommender systems.