• Title/Abstract/Keywords: Reward Value

Search results: 109 items (processing time: 0.024 s)

Capacitated Fab Scheduling Approximation using Average Reward TD($\lambda$) Learning based on System Feature Functions

  • 최진영
    • 산업경영시스템학회지
    • /
    • Vol. 34, No. 4
    • /
    • pp.189-196
    • /
    • 2011
  • In this paper, we propose a logical control-based actor-critic algorithm as an efficient approach to approximating the capacitated fab scheduling problem. We apply the average-reward temporal-difference learning method to estimate the relative value functions of system states, while avoiding deadlock situations by means of the Banker's algorithm. We evaluate the suggested algorithm on the Intel mini-fab re-entrant line and perform a numerical experiment on randomly generated sample system configurations. We show that the suggested method performs well compared with other well-known heuristics.
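
The average-reward temporal-difference update named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's actor-critic algorithm: the linear value approximation, the function name `average_reward_td_lambda`, the one-hot features in the usage note, and the step sizes `alpha` and `beta` are all assumptions made for illustration, and the deadlock-avoidance step (Banker's algorithm) is omitted.

```python
from typing import Callable, List, Sequence, Tuple


def average_reward_td_lambda(
    transitions: Sequence[Tuple[int, float, int]],
    features: Callable[[int], List[float]],
    n_features: int,
    lam: float = 0.7,
    alpha: float = 0.05,
    beta: float = 0.05,
) -> Tuple[List[float], float]:
    """Estimate relative state values and the average reward.

    transitions: observed (state, reward, next_state) triples, in order.
    features:    hypothetical feature map from a state to a vector.
    Returns the weight vector w and the average-reward estimate rho.
    """
    w = [0.0] * n_features  # weights of the linear relative value function
    z = [0.0] * n_features  # eligibility trace
    rho = 0.0               # running estimate of the average reward

    for state, reward, next_state in transitions:
        phi = features(state)
        phi_next = features(next_state)
        v = sum(wi * fi for wi, fi in zip(w, phi))
        v_next = sum(wi * fi for wi, fi in zip(w, phi_next))

        # Average-reward TD error: the reward is measured relative to rho.
        delta = reward - rho + v_next - v

        # Decay the trace by lam (no discount factor in this setting).
        z = [lam * zi + fi for zi, fi in zip(z, phi)]
        w = [wi + alpha * delta * zi for wi, zi in zip(w, z)]
        rho += beta * (reward - rho)

    return w, rho
```

As a usage example under the same assumptions, a two-state cycle that pays reward 1 on the move 0 → 1 and reward 0 on the move 1 → 0 can be passed as `[(0, 1.0, 1), (1, 0.0, 0)] * 2000` with one-hot features; the estimate `rho` then settles near the true average reward of 0.5.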

Affective Decision-Making among Preschool Children in Diverse Cultural Contexts

  • Qu, Li;Shan, Gao;Yip, Cindy;Li, Hong;Zelazo, Philip David
    • Child Studies in Asia-Pacific Contexts
    • /
    • Vol. 2, No. 2
    • /
    • pp.123-132
    • /
    • 2012
  • The current study examined 3- and 4-year-olds' affective decision-making in a variety of cultural contexts by comparing European Canadian children to Chinese Canadian, Hong Kong Chinese, and mainland Chinese children (N = 245). All children were tested with a delay of gratification task in which children chose between an immediate reward of lower value and a delayed reward of higher value. Results showed that Chinese Canadian and Hong Kong Chinese children chose more delayed rewards than European Canadian children, with mainland Chinese children showing a trend toward more delayed rewards. Across cultures, 4-year-olds chose more delayed rewards than 3-year-olds; and among 4-year-olds, girls made more such choices than boys. The findings are consistent with previous findings that exposure to Chinese culture is associated with better cool executive function, but they also highlight the importance of examining development across diverse cultural contexts.

Weight Adjustment Scheme Based on Hop Count in Q-routing for Software Defined Networks-enabled Wireless Sensor Networks

  • Godfrey, Daniel;Jang, Jinsoo;Kim, Ki-Il
    • Journal of information and communication convergence engineering
    • /
    • Vol. 20, No. 1
    • /
    • pp.22-30
    • /
    • 2022
  • Reinforcement learning has proven its potential in solving sequential decision-making problems under uncertainty, such as finding paths to route data packets in wireless sensor networks. With reinforcement learning, computing the optimum path requires careful definition of the so-called reward function, which is defined as a linear function that aggregates multiple objective functions into a single objective to compute a numerical value (reward) to be maximized. In a typically defined linear reward function, the multiple objectives to be optimized are integrated in the form of a weighted sum with fixed weighting factors for all learning agents. This study proposes a reinforcement learning-based routing protocol for wireless sensor networks, in which different learning agents prioritize different objectives by assigning weighting factors to the aggregated objectives of the reward function. We assign appropriate weighting factors to the objectives in the reward function of a sensor node according to its hop-count distance to the sink node. We expect this approach to enhance the effectiveness of multi-objective reinforcement learning for wireless sensor networks with a balanced trade-off among competing parameters. Furthermore, we propose an SDN (Software Defined Networks) architecture with multiple controllers for constant network monitoring, which allows learning agents to adapt to the dynamics of the network conditions. Simulation results show that our proposed scheme enhances the performance of wireless sensor networks under varied conditions, such as node density and traffic intensity, with a good trade-off among competing performance metrics.
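
The idea of a weighted-sum reward whose weights depend on a node's hop count can be sketched in Python as follows. The abstract does not give the actual weighting rule, so everything here is hypothetical: the two objectives (`energy` and `delay`), the linear mapping from hop count to weights, and the function names are assumptions chosen only to show the mechanism.

```python
from typing import Dict


def hop_based_weights(hop_count: int, max_hops: int) -> Dict[str, float]:
    """Return objective weights that sum to 1 for a node.

    Hypothetical rule: nodes close to the sink favor the energy objective,
    and nodes far from the sink favor the delay objective.
    """
    if max_hops <= 0:
        raise ValueError("max_hops must be positive")
    ratio = min(max(hop_count / max_hops, 0.0), 1.0)  # clamp to [0, 1]
    return {"energy": 1.0 - ratio, "delay": ratio}


def weighted_reward(objectives: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear reward: weighted sum of the per-objective scores."""
    return sum(weights[name] * objectives[name] for name in weights)
```

With fixed weights, every node would call `weighted_reward` with the same `weights`; in this sketch each node instead derives its own weights from `hop_based_weights(hop_count, max_hops)` before scoring a candidate next hop.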

Comparison of Reinforcement Learning Activation Functions to Maximize Rewards in Autonomous Highway Driving

  • 이동철
    • 한국인터넷방송통신학회논문지
    • /
    • Vol. 22, No. 5
    • /
    • pp.63-68
    • /
    • 2022
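  • Autonomous driving technology has recently advanced significantly with the introduction of deep reinforcement learning. Using deep reinforcement learning effectively requires choosing an appropriate activation function. Many activation functions have been proposed, but their performance has varied with the environment to which they are applied. This paper compares and evaluates 12 activation functions to determine which is most effective when reinforcement learning is used to learn autonomous highway driving. To this end, a performance evaluation method is presented and the average reward values of the activation functions are compared. GELU yielded the highest average reward, while SiLU showed the lowest performance. The difference in average reward between the two activation functions was 20%.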
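
For reference, the two activation functions singled out in this abstract have standard closed forms, sketched below in plain Python. This shows only their definitions, using the exact (erf-based) form of GELU; it does not reproduce the paper's networks, driving environment, or evaluation method.

```python
import math


def gelu(x: float) -> float:
    """GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x: float) -> float:
    """SiLU (also called swish): x times the logistic sigmoid of x."""
    return x / (1.0 + math.exp(-x))
```

Both functions are smooth, pass through zero at `x = 0`, and approach the identity for large positive inputs; they differ in how sharply they gate small and negative inputs.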

A Study on the Types of Work Values of Radiologic Technology Students

  • 김학성
    • 대한방사선기술학회지:방사선기술과학
    • /
    • Vol. 30, No. 3
    • /
    • pp.271-280
    • /
    • 2007
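  • The purpose of this study is to analyze the types of work values held by radiologic technology students and to identify their characteristics. Data were collected from 791 radiologic technology students at seven universities nationwide and analyzed to extract work value types, and differences across related variables were tested to examine the characteristics of those work values. The conclusions of this study are as follows. 1. The students' extrinsic work values (status, economic reward) were higher than their intrinsic work values (social contribution, achievement, ability, job interest). 2. The work value types the students considered important were, in order, economic reward, status, achievement, ability, job interest, and social contribution.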


Job Satisfaction, Subjective Class Identification and Associated Factors of Professional Socialization in Korean Physicians

  • 윤형곤;황인경;문영배;이희영;윤석준
    • Journal of Preventive Medicine and Public Health
    • /
    • Vol. 41, No. 1
    • /
    • pp.30-38
    • /
    • 2008
  • Objectives: The aim of this study was to determine the relationship between the core properties of professional socialization and social status satisfaction, economic reward satisfaction, and subjective class identification. Methods: Medical knowledge and skill, autonomy, and professional value factors were used as essential properties of professional socialization to determine the association with job satisfaction and subjective class identification. The authors used a self-administered questionnaire survey and collected nationwide data between July and August 2003, with 211 responses used for final analysis. Results: 'Age' and 'trust and respect' were positively associated with social status satisfaction, and 'occupation' was negatively associated. 'Income' and 'trust and respect' were positively related to economic reward satisfaction, and 'practicing for oneself' and 'a sense of duty and attendance' were negatively related. 'Practicing for oneself', 'not believing explanations', and 'a sense of duty and attendance' had a positive relationship with subjective class identification. 'Income', 'knowledge system', 'medical mistakes', 'treating like goods', 'meaning and joy', and 'trust and respect' had a negative relationship. Conclusions: The core property variables of professional socialization had different relationships with social status satisfaction, economic reward satisfaction, and subjective class identification. In particular, many core property variables were associated with subjective class identification, either positively or negatively. The development of professional socialization would help promote job satisfaction and subjective class identification.

An optimal management policy for the surplus process with investments

  • 임세진;최승경;이의용
    • 응용통계연구
    • /
    • Vol. 29, No. 7
    • /
    • pp.1165-1172
    • /
    • 2016
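  • The surplus of an insurance product increases through premium income and decreases when customers file claims. When the surplus becomes sufficiently large, the insurance company generates profit by reinvesting part of it. This study introduces the existing surplus model, which represents the surplus level in terms of premium income and claims, incorporates reinvestment and operating costs into that model to obtain the long-run average cost per unit time, and finds the reinvestment level and target surplus that minimize this cost.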

Designing an Effective Pay-for-performance System in the Korean National Health Insurance

  • Jeong, Hyoung-Sun
    • Journal of Preventive Medicine and Public Health
    • /
    • Vol. 45, No. 3
    • /
    • pp.127-136
    • /
    • 2012
  • The challenge facing the Korean National Health Insurance includes what to spend money on in order to elevate the 'value for money.' This article reviewed the changing issues associated with quality of care in the Korean health insurance system and envisioned a picture of an effective pay-for-performance (P4P) system in Korea taking into consideration quality of care and P4P systems in other countries. A review was made of existing systematic reviews and a recent Organization for Economic Cooperation and Development survey. An effective P4P in Korea was envisioned as containing three features: measures, basis for reward, and reward. The first priority is to develop proper measures for both efficiency and quality. For further improvement of quality indicators, an electronic system for patient history records should be built in the near future. A change in the level or the relative ranking seems more desirable than using absolute level alone for incentives. To stimulate medium- and small-scale hospitals to join the program in the next phase, it is suggested that the scope of application be expanded and the level of incentives adjusted. High-quality indicators of clinical care quality should be mapped out by combining information from medical claims and information from patient registries.

A Channel Management Technique using Neural Networks in Wireless Networks

  • 노철우;김경민;이광의
    • 한국정보통신학회논문지
    • /
    • Vol. 10, No. 6
    • /
    • pp.1032-1037
    • /
    • 2006
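  • Channels are one of the key limited resources in wireless networks. Various channel management techniques have been proposed, and the problem of optimizing guard channels has recently come to the fore. This paper proposes an intelligent channel management technique using a neural network. An SRN (Stochastic Reward Net) channel allocation model is developed to generate training data for the neural network and to analyze performance. In the proposed technique, the neural network is trained with the backpropagation algorithm, a supervised learning method, to compute the optimal guard channel value g. The trained neural network is used to compute the optimal g, which is then compared with the result obtained from the SRN model. Experimental results show no relative difference between the number of guard channels obtained from the neural network and the number obtained from the SRN model.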

Factors Affecting Employee Performance: A Case Study of Railway Maintenance and Engineering Organizations in Thailand

  • POLANANT, Kanut;ROJNIRUTTIKUL, Nuttawut
    • The Journal of Asian Finance, Economics and Business
    • /
    • Vol. 9, No. 9
    • /
    • pp.271-281
    • /
    • 2022
  • The objectives of the research are to study the effects of emotional intelligence (EI), reward management (RM), and occupational health and safety (OHS) on employee performance (EP) within a Thai motor service and repair firm. From January 2022 through the end of March 2022, the researchers used simple random sampling to select 88 employees for the case study. The research instrument was a questionnaire with an IOC value between 0.67 and 1.00 and a reliability value α of 0.78. Survey participants were asked to contribute their opinions to a five-level opinion survey hosted on Google Forms. Descriptive statistics (mean and standard deviation) and multiple linear regression analysis were computed using SPSS for Windows version 21. The results showed that employee opinions concerning EI, RM, OHS, and EP were at a high level, with tests of all three hypotheses showing statistical significance (p ≤ 0.01). The coefficients of determination ($R^2$) indicated the strength of each relationship: RM = 0.861, OHS = 0.853, and EI = 0.731.