Application Study of Reinforcement Learning Control for Building HVAC System

  • Cho, Sung-Hwan (Department of Mechanical & Automotive Engineering, JeonJu University)
  • Published : 2006.12.10


Recently, control technology based on proportional-integral (PI) control has grown rapidly, driven by the demand for robust controllers in the industrial building sector. However, a PI controller generally requires its gains to be retuned to maintain optimal control when outside weather conditions change. The present study examines the feasibility of applying a reinforcement learning (RL) control algorithm together with a PI controller in an HVAC system. Optimal design criteria for the RL controller were proposed based on environmental chamber experiments, and a theoretical analysis was also conducted using the TRNSYS program.
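The abstract does not say how the RL algorithm is combined with the PI controller, so the following is only an illustrative sketch, not the study's method. It assumes a tabular Q-learning agent that adds a small trim to the PI output, a toy first-order room model, and hypothetical gains, action set, state bins, and reward (negative absolute temperature error). None of these values come from the paper.

```python
import random

ACTIONS = (-0.1, 0.0, 0.1)  # hypothetical trims added to the PI output


class PIController:
    """Discrete-time PI controller with fixed gains (assumed values)."""

    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def step(self, error):
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def discretize(error, width=0.5, n_bins=9):
    """Map a temperature error (K) to one of n_bins integer states."""
    half = n_bins // 2
    return max(0, min(n_bins - 1, int(round(error / width)) + half))


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.95):
    """One-step Q-learning update on a table q[state][action]."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def run_episode(q, epsilon, rng, steps=200, setpoint=22.0, outdoor=5.0):
    """Run PI plus RL trim on a toy room; return the summed absolute error."""
    pi = PIController(kp=0.4, ki=0.05, dt=1.0)
    temp = 18.0
    total_abs_error = 0.0
    s = discretize(setpoint - temp)
    for _ in range(steps):
        # Epsilon-greedy choice of the trim action
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        error = setpoint - temp
        # Heating command in [0, 1]: PI output plus the learned trim
        u = min(1.0, max(0.0, pi.step(error) + ACTIONS[a]))
        # Toy first-order room: heat input minus loss to outdoors
        temp += 0.1 * (30.0 * u - (temp - outdoor))
        s_next = discretize(setpoint - temp)
        reward = -abs(setpoint - temp)
        q_update(q, s, a, reward, s_next)
        total_abs_error += abs(setpoint - temp)
        s = s_next
    return total_abs_error


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0] * len(ACTIONS) for _ in range(9)]
    for episode in range(20):
        print(episode, run_episode(q, epsilon=0.1, rng=rng))
```

In this arrangement the PI loop keeps regulating with its fixed gains, while the Q-table only learns a bounded correction. Changing `outdoor` between episodes is one simple way to mimic the changing weather conditions the abstract mentions.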


