Title/Summary/Keyword: evaluation & reward

A Study of the Job Satisfaction of the Five-star Hotel Chef Career Choice Motives (특급호텔 조리사 직업선택동기에 따른 직무만족에 관한 연구)

  • Kim, Heon-Chul
    • Culinary Science and Hospitality Research
    • /
    • v.23 no.3
    • /
    • pp.38-49
    • /
    • 2017
  • The purposes of this study were to examine the career choice motives of chefs in luxury hotels through an IPA analysis of the gaps between importance and satisfaction, and to investigate the impact of those motives on job satisfaction. The results are as follows. First, "matched aptitude" and "interest in cooking" were found to be high in both importance and satisfaction. Second, sufficient rewards should be provided through incentives, salary increases, and promotion. Third, the internship and temporary-employment systems should be revised, and motivation should be fostered through performance-based rewards and an incentive system. Fourth, fair personnel evaluation, a transparent promotion system, and a fair reward system are required to increase job satisfaction. Fifth, for those considering a chef career, the internal work environment matters more than social reputation or popularity.
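  • A minimal sketch of the importance-performance analysis (IPA) quadrant classification the study applies; the motive names and scores below are illustrative, not the study's data:

    # Classify each career-choice motive by mean importance vs. mean
    # satisfaction, relative to the grand means (standard IPA quadrants).
    def ipa_quadrant(imp, sat, mean_imp, mean_sat):
        if imp >= mean_imp and sat >= mean_sat:
            return "keep up the good work"
        if imp >= mean_imp:
            return "concentrate here"      # important but unsatisfying
        if sat >= mean_sat:
            return "possible overkill"
        return "low priority"

    motives = {"matched aptitude": (4.5, 4.2),   # hypothetical 5-point scores
               "salary": (4.4, 3.1),
               "social reputation": (3.2, 3.5)}
    mean_imp = sum(i for i, _ in motives.values()) / len(motives)
    mean_sat = sum(s for _, s in motives.values()) / len(motives)
    for name, (imp, sat) in motives.items():
        print(f"{name}: {ipa_quadrant(imp, sat, mean_imp, mean_sat)}")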

Brain Mechanisms of Cognitive, Emotional and Behavioral Aspects of Taste

  • Yamamoto, Takashi
    • International Journal of Oral Biology
    • /
    • v.34 no.3
    • /
    • pp.123-129
    • /
    • 2009
  • Taste involves hedonic evaluation as well as recognition of quality and intensity. Taste information is sent to the cortical gustatory area in a chemotopic manner, where it is processed for the discrimination of taste quality. It is also conveyed to the reward system and the feeding center via the prefrontal cortices. The amygdala, which receives taste inputs, also influences reward and feeding. In terms of neuroactive substances, palatability is closely related to benzodiazepine derivatives and β-endorphin, both of which facilitate the consumption of food and fluid. The reward system comprises the ventral tegmental area, nucleus accumbens, and ventral pallidum, and finally sends information to the lateral hypothalamic area, the feeding center. The dopaminergic system originating from the ventral tegmental area mediates the motivation to consume palatable food. The actual ingestive behavior is promoted by orexigenic neuropeptides from the hypothalamus. Even palatable food can become aversive and avoided as a consequence of an unpleasant postingestional experience such as malaise. The brain mechanisms underlying these aspects of taste are reviewed.

Capacitated Fab Scheduling Approximation using Average Reward TD(λ) Learning based on System Feature Functions (시스템 특성함수 기반 평균보상 TD(λ) 학습을 통한 유한용량 Fab 스케줄링 근사화)

  • Choi, Jin-Young
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.34 no.4
    • /
    • pp.189-196
    • /
    • 2011
  • In this paper, we propose a logical-control-based actor-critic algorithm as an efficient approach to approximating the capacitated fab scheduling problem. We apply average-reward temporal-difference learning to estimate the relative value functions of system states, while avoiding deadlock situations with Banker's algorithm. We consider the Intel mini-fab re-entrant line to evaluate the suggested algorithm and perform a numerical experiment by randomly generating sample system configurations. We show that the suggested method outperforms other well-known heuristics.
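  • A minimal sketch of the average-reward TD(λ) critic the abstract describes, assuming a linear relative value function over system feature functions; `env` and `phi` are placeholders, not the paper's fab model:

    import numpy as np

    def td_lambda_avg_reward(env, phi, n_features,
                             alpha=0.01, beta=0.001, lam=0.9, steps=100_000):
        w = np.zeros(n_features)      # weights of the relative value function
        z = np.zeros(n_features)      # eligibility trace
        rho = 0.0                     # running average-reward estimate
        s = env.reset()
        for _ in range(steps):
            a = env.policy(s)                  # e.g., a logical-control policy
            s_next, r = env.step(a)            # deadlock-free by construction
            delta = r - rho + w @ phi(s_next) - w @ phi(s)   # TD error
            z = lam * z + phi(s)               # accumulate trace (undiscounted)
            w += alpha * delta * z             # critic update
            rho += beta * delta                # average-reward update
            s = s_next
        return w, rho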

Comparative analysis of activation functions within reinforcement learning for autonomous vehicles merging onto highways

  • Lee, Dongcheul;McNair, Janise
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.16 no.1
    • /
    • pp.63-71
    • /
    • 2024
  • Deep reinforcement learning (RL) strongly influences autonomous vehicle development by optimizing decision-making and adaptation to complex driving environments through simulation-based training. Deep RL networks rely on activation functions; many have been proposed, but their performance varies greatly with the application environment, so finding the optimal activation function for a given environment is important for effective learning. In this paper, we analyzed nine commonly used activation functions for RL to compare and evaluate which is most effective when deep RL is used to teach autonomous vehicles to merge onto highways. To do this, we built a performance evaluation environment and compared the average reward obtained with each activation function. The results showed that the highest reward was achieved with Mish and the lowest with SELU, with a 10.3% difference in reward between the two.
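  • A minimal sketch of this comparison setup, assuming a small PyTorch policy network; the environment interface, training step, and layer sizes are placeholders:

    import torch
    import torch.nn as nn

    # Swap only the activation function, keep everything else fixed, and
    # compare average episode reward. Mish and SELU were the best and worst
    # performers in this study.
    ACTIVATIONS = {"Mish": nn.Mish, "SELU": nn.SELU, "ReLU": nn.ReLU}

    def make_policy(obs_dim, n_actions, act_cls):
        return nn.Sequential(
            nn.Linear(obs_dim, 64), act_cls(),
            nn.Linear(64, 64), act_cls(),
            nn.Linear(64, n_actions),
        )

    def average_reward(policy, env, episodes=20):
        total = 0.0
        for _ in range(episodes):
            obs, done = env.reset(), False
            while not done:
                with torch.no_grad():
                    x = torch.as_tensor(obs, dtype=torch.float32)
                    action = policy(x).argmax().item()
                obs, reward, done = env.step(action)
                total += reward
        return total / episodes

    # for name, act in ACTIVATIONS.items():
    #     policy = make_policy(obs_dim, n_actions, act)
    #     ...train policy with the deep RL algorithm of choice, then:
    #     print(name, average_reward(policy, env))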

Availability Analysis of Redundancy Models for Network System with Non-Stop Forwarding (논스톱 포워딩 기능을 지원하는 네트워크 시스템에 대한 다중화 모형의 가용도 분석)

  • Shim, Jaechan;Ryu, Hongrim;Ryu, Hoyong;Park, Jaehyung;Lee, Yutae
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.12
    • /
    • pp.2828-2835
    • /
    • 2015
  • In this paper, we analyze the effect of redundancy types and a non-stop forwarding scheme on network service availability. We use stochastic reward nets as the modeling approach for the analytical evaluation. We first design stochastic reward nets for redundancy models with and without non-stop forwarding, and then evaluate their availability using the Stochastic Petri Net Package.
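  • For intuition, a minimal sketch of the steady-state availability computation that a stochastic reward net ultimately reduces to: build the CTMC generator for a redundancy model, solve πQ = 0, and sum the reward (1 for states where service is up). The three-state 1+1 model and the rates are illustrative, not the paper's:

    import numpy as np

    mtbf_h, mttr_h = 10_000.0, 4.0          # illustrative failure/repair times
    lam, mu = 1.0 / mtbf_h, 1.0 / mttr_h

    # States: 0 = both units up, 1 = one failed (service still up), 2 = down.
    Q = np.array([
        [-2 * lam,      2 * lam,     0.0],
        [      mu, -(mu + lam),      lam],
        [     0.0,       2 * mu, -2 * mu],
    ])

    # Solve pi @ Q = 0 with sum(pi) = 1 (replace one balance equation).
    A = np.vstack([Q.T[:-1], np.ones(3)])
    pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))

    availability = pi[0] + pi[1]    # reward 1 in the "service up" states
    print(f"steady-state availability = {availability:.8f}")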

Speech enhancement based on reinforcement learning (강화학습 기반의 음성향상기법)

  • Park, Tae-Jun;Chang, Joon-Hyuk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.05a
    • /
    • pp.335-337
    • /
    • 2018
  • Speech enhancement removes noise and reverberation from speech; because speech captured by a microphone is distorted by noise and reverberation, it is a core component of speech signal processing systems such as speech recognition and speech communication. Previously, statistical-model-based speech enhancement, which exploits statistical information about the speech and noise signals, was widely used, but its performance degrades severely in non-stationary noise environments, unlike in stationary noise. Recently, deep neural networks (DNNs), a machine learning technique, have been introduced and achieve excellent performance in speech enhancement. DNN-based speech enhancement models the nonlinear relationship between noisy speech signals and clean speech signals through multiple hidden layers and hidden nodes. We improve on the conventional DNN by applying reinforcement learning, one way to strengthen DNN-based speech enhancement. Reinforcement learning, famously applied in Google's AlphaGo, trains an agent over a very large number of cases to select the optimal action: which action to take, under some policy, in a given state to obtain the highest reward while moving to the next state. In this paper, we design the reward based on a composite measure, and we obtain higher speech recognition performance than the previous approach, which designed the reward based on PESQ (Perceptual Evaluation of Speech Quality).
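  • A minimal sketch of this reward design, under assumptions: score the enhanced speech against the clean reference with a composite objective measure rather than PESQ alone, and reward the agent for the improvement over the noisy input. `pesq_score` and `llr_score` are stubs standing in for real metric implementations, and the weights are illustrative, not the paper's:

    import numpy as np

    def seg_snr(clean, enhanced, frame=256):
        # simple segmental SNR in dB, clipped to the usual [-10, 35] range
        snrs = []
        for i in range(0, len(clean) - frame, frame):
            c, e = clean[i:i + frame], enhanced[i:i + frame]
            err = np.sum((c - e) ** 2) + 1e-10
            snrs.append(10 * np.log10(np.sum(c ** 2) / err + 1e-10))
        return float(np.clip(np.mean(snrs), -10.0, 35.0))

    def pesq_score(clean, enhanced):
        return 3.0   # stub: replace with a real PESQ implementation

    def llr_score(clean, enhanced):
        return 1.0   # stub: log-likelihood-ratio spectral distortion

    def composite_measure(clean, enhanced):
        # hypothetical weighted combination of objective metrics
        return (0.5 * pesq_score(clean, enhanced)
                + 0.3 * (5.0 - llr_score(clean, enhanced))
                + 0.2 * seg_snr(clean, enhanced))

    def rl_reward(clean, noisy, enhanced):
        # reward = improvement of the enhanced output over the noisy input
        return composite_measure(clean, enhanced) - composite_measure(clean, noisy)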

Comparison of Reinforcement Learning Activation Functions to Maximize Rewards in Autonomous Highway Driving (고속도로 자율주행 시 보상을 최대화하기 위한 강화 학습 활성화 함수 비교)

  • Lee, Dongcheul
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.5
    • /
    • pp.63-68
    • /
    • 2022
  • Autonomous driving technology has recently made great progress with the introduction of deep reinforcement learning. To use deep reinforcement learning effectively, it is important to select an appropriate activation function. Many activation functions have been proposed, but they perform differently depending on the environment in which they are applied. This paper compares and evaluates the performance of 12 activation functions to see which are effective when reinforcement learning is used to learn autonomous driving on highways. To this end, a performance evaluation method was presented and the average reward value of each activation function was compared. As a result, GELU yielded the highest average reward and SiLU the lowest; the average reward difference between the two activation functions was 20%.
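  • For reference, the two functions at the extremes of this comparison, written out explicitly; both are smooth ReLU variants, which makes the 20% reward gap notable (torch.nn.GELU and torch.nn.SiLU provide the same functions):

    import torch

    def gelu(x):
        # x * Phi(x), where Phi is the standard normal CDF
        return x * 0.5 * (1.0 + torch.erf(x / 2.0 ** 0.5))

    def silu(x):
        # x * sigmoid(x), also known as Swish
        return x * torch.sigmoid(x)

    x = torch.linspace(-3.0, 3.0, 7)
    print(gelu(x))
    print(silu(x))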

A Quantitative Analysis Theory for Reliability of Software (소프트웨어 신뢰성의 정량적 분석 방법론)

  • Cho, Yong-Soon;Youn, Hyun-Sang;Lee, Eun-Seok
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.7
    • /
    • pp.500-504
    • /
    • 2009
  • Software reliability is a nonfunctional requirement. Traditionally, reliability is validated at the integration phase of the software development life cycle; however, this increases the cost and risk of development. In this paper, we propose a reliability analysis method based on an analytical mathematical model applied at the architecture design phase, as follows. First, we propose a software modeling methodology for reliability analysis using Hierarchical Combined Queueing Petri Nets (HQPN). Second, we derive a Markov Reward Model from the HQPN-based model. We apply our approach to a video conference system to verify its usefulness. Our approach supports quantitative evaluation of reliability.
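  • A minimal sketch of evaluating a Markov Reward Model of the kind the paper derives: assign reward 1 to the operational states of a CTMC and read the expected reward off the transient state probabilities. The two-state model and the failure rate are illustrative, not derived from an HQPN:

    import numpy as np

    lam = 1e-3                        # failure rate per hour, illustrative
    Q = np.array([[-lam, lam],
                  [ 0.0, 0.0]])       # state 1 (failed) is absorbing
    reward = np.array([1.0, 0.0])     # reward 1 while operational

    def expected_reward_at(Q, reward, p0, t, steps=10_000):
        # integrate dp/dt = p @ Q with small forward-Euler steps
        p, dt = p0.copy(), t / steps
        for _ in range(steps):
            p = p + dt * (p @ Q)
        return p @ reward

    p0 = np.array([1.0, 0.0])
    print("reliability at t = 100 h:", expected_reward_at(Q, reward, p0, 100.0))
    # analytic check: exp(-lam * t) = exp(-0.1) ≈ 0.9048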

Evaluating SR-Based Reinforcement Learning Algorithm Under the Highly Uncertain Decision Task (불확실성이 높은 의사결정 환경에서 SR 기반 강화학습 알고리즘의 성능 분석)

  • Kim, So Hyeon;Lee, Jee Hang
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.8
    • /
    • pp.331-338
    • /
    • 2022
  • The successor representation (SR) is a model of human reinforcement learning (RL) that mimics how hippocampal cells construct cognitive maps; SR uses these learned features to respond adaptively to frequent reward changes. In this paper, we evaluated the performance of SR in contexts where changes in the latent variables of the environment trigger changes in the reward structure. As a benchmark, we adopted SR-Dyna, an integration of SR into the goal-driven Dyna RL algorithm, on the 2-stage Markov Decision Task (MDT), in which we can deliberately manipulate the latent variables: state-transition uncertainty and goal condition. To investigate the characteristics of SR precisely, we conducted the experiments while controlling each latent variable that affects changes in the reward structure. The evaluation results showed that SR-Dyna could learn to respond to reward changes driven by changes in the latent variables, but could not learn rapidly in that situation. This points to the need for more robust RL models that can rapidly learn to respond to frequent changes in environments where the latent variables and the reward structure change at the same time.
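  • For reference, a minimal tabular sketch of the SR learning rule that SR-Dyna builds on: the SR matrix M is learned by temporal-difference updates, and values decompose as V = M·w, which is why reward changes only require re-fitting the reward weights w. The state space and parameters are illustrative:

    import numpy as np

    n_states, gamma, alpha = 5, 0.95, 0.1
    M = np.eye(n_states)      # successor matrix, one row per state
    w = np.zeros(n_states)    # learned reward weights

    def sr_td_update(s, s_next):
        # M(s, :) <- M(s, :) + alpha * (1_s + gamma * M(s', :) - M(s, :))
        target = np.eye(n_states)[s] + gamma * M[s_next]
        M[s] += alpha * (target - M[s])

    def reward_update(s, r, lr=0.1):
        w[s] += lr * (r - w[s])   # re-fit quickly when rewards change

    def value(s):
        return M[s] @ w           # V(s) = sum over s' of M(s, s') * w(s')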

Deep Reinforcement Learning of Ball Throwing Robot's Policy Prediction (공 던지기 로봇의 정책 예측 심층 강화학습)

  • Kang, Yeong-Gyun;Lee, Cheol-Soo
    • The Journal of Korea Robotics Society
    • /
    • v.15 no.4
    • /
    • pp.398-403
    • /
    • 2020
  • A robot's throwing control is difficult to compute accurately because of factors such as air resistance and rotational inertia. This complexity can be addressed with machine learning. However, reinforcement learning with a hand-crafted reward function limits a robot's ability to adapt to new environments, so this paper applies deep reinforcement learning using a neural network without a reward function. Each throw is evaluated simply as a success or a failure. The network learns by taking the target position and the control policy as input and yielding this evaluation as output. The task is then carried out by predicting the success probability for a given target location and control policy and searching for the policy with the highest probability. Repeating this task improves performance as data accumulate, and the model can even predict tasks that were not previously attempted, making it a universally applicable learning model for new environments. Across 520 experiments, this learning model achieved a 75% success rate.
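  • A minimal sketch of this reward-function-free scheme, assuming PyTorch; the input dimensions and the random policy search are illustrative, not the paper's robot setup:

    import torch
    import torch.nn as nn

    class SuccessPredictor(nn.Module):
        # maps (target position, control-policy parameters) to P(success)
        def __init__(self, target_dim=2, policy_dim=4):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(target_dim + policy_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 1), nn.Sigmoid(),
            )

        def forward(self, target, policy):
            return self.net(torch.cat([target, policy], dim=-1))

    def select_policy(model, target, n_candidates=1024, policy_dim=4):
        # sample candidate policies, keep the one with highest predicted success
        candidates = torch.rand(n_candidates, policy_dim)
        targets = target.expand(n_candidates, -1)
        with torch.no_grad():
            probs = model(targets, candidates).squeeze(-1)
        return candidates[probs.argmax()]

    # training would use nn.BCELoss() on recorded (target, policy, success) triples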