Dynamic Pricing Based on Reinforcement Learning Reflecting the Relationship between Driver and Passenger Using Matching Matrix


  • Park, Jun Hyung (Dept. of Computer Engineering, Univ. of Hongik) ;
  • Lee, Chan Jae (Dept. of Artificial Intelligence.Big Data, Univ. of Hongik) ;
  • Yoon, Young (Dept. of Computer Engineering, Univ. of Hongik)
  • Received : 2020.11.11
  • Accepted : 2020.12.17
  • Published : 2020.12.31

Abstract

Research interest in the Mobility-as-a-Service (MaaS) concept for enhancing users' mobility experience is increasing. In particular, dynamic pricing techniques based on reinforcement learning have emerged, since adjusting prices according to demand is expected to help mobility services, such as taxi and car-sharing services, earn more profit. This paper provides a simulation framework that considers practical factors that critically affect the probability of a match between users and mobility service providers (e.g., drivers): demand density per location, preferred prices, the distance between users and drivers, and the distance to the destination. These practical features are reflected in a data structure referred to as the Matching Matrix. Using an efficient algorithm for computing the probability of matching between users and drivers, together with a set of high-demand locations precisely identified with HDBSCAN, this study develops a better reward function that steers the reinforcement learning process toward more realistic dynamic pricing policies.
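The Matching Matrix idea can be illustrated with a small sketch. This is not the authors' implementation: the factor weights, the logistic scoring of each (passenger, driver) pair, and the greedy one-driver-per-passenger assignment in the reward are assumptions made here purely for illustration.

```python
import numpy as np

def matching_matrix(dispatch_dist, trip_dist, price, pref_price,
                    demand_density, beta=(0.5, 0.1, 0.8, 0.3)):
    """Score every (passenger, driver) pair with a logistic model.

    dispatch_dist  : (P, D) km between each passenger and each driver
    trip_dist      : (P,)   km from each passenger to their destination
    price          : scalar fare currently offered by the pricing policy
    pref_price     : (P,)   fare each passenger considers acceptable
    demand_density : (P,)   demand density at each passenger's location
    Returns a (P, D) matrix of matching probabilities in (0, 1).
    The weights in `beta` are illustrative assumptions, not fitted values.
    """
    b_dd, b_td, b_pr, b_de = beta
    # Long dispatch distances, long trips, and fares above the preferred
    # price lower the score; demand-dense pickup locations raise it.
    logit = (-b_dd * dispatch_dist
             - b_td * trip_dist[:, None]
             - b_pr * np.maximum(price - pref_price, 0.0)[:, None]
             + b_de * demand_density[:, None])
    return 1.0 / (1.0 + np.exp(-logit))  # element-wise logistic

def expected_reward(M, price):
    """Reward for the pricing agent: expected matched fares under a
    greedy one-driver-per-passenger assignment (an assumption here)."""
    taken, total = set(), 0.0
    for p in np.argsort(-M.max(axis=1)):      # most matchable passenger first
        for d in np.argsort(-M[p]):           # their best available driver
            if d not in taken:
                taken.add(d)
                total += M[p, d] * price
                break
    return total
```

Raising `price` increases the per-match fare but depresses every entry of the matrix, so the reward trades match volume against fare level, which is the tension a learned pricing policy has to balance.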

Recently, research has been conducted on adopting the Mobility-as-a-Service (MaaS) concept to improve users' mobility and accessibility. In particular, for car-sharing and taxi services, dynamic pricing strategies that set prices by region according to supply and demand are expected to resolve problems of flat-rate fares, such as service avoidance, while positively affecting the profitability of companies and drivers. In this study, Matching Matrices are constructed from the perspectives of both passengers and drivers, considering the dispatch distance between passenger and driver, the passenger's trip distance, high-demand areas precisely identified by applying the HDBSCAN algorithm to passengers' destinations, and the prices preferred by passengers and drivers. By combining these matrices and reflecting them in the reward, we propose a new methodology that enables reinforcement learning to derive more realistic dynamic pricing strategies.


Acknowledgement

This research was supported by the Transportation and Logistics R&D Program of the Ministry of Land, Infrastructure and Transport (20TLRP-B148970-03), by the Korea Institute for Advancement of Technology funded by the Ministry of Trade, Industry and Energy (P0004602, Eco-friendly Automotive Parts Cluster Development Project), by the National Research Foundation of Korea funded by the Ministry of Science and ICT (2020R1F1A104826411), and by the 2020 Hongik University Research Fund.

References

  1. Bertsimas D. and Perakis G.(2006), "Dynamic pricing: A learning approach," In Mathematical and Computational Models for Congestion Charging, Springer, Boston, MA, pp.45-79.
  2. Campello R. J., Moulavi D., Zimek A. and Sander J.(2015), "Hierarchical density estimates for data clustering, visualization, and outlier detection," ACM Transactions on Knowledge Discovery from Data(TKDD), vol. 10, no. 1, pp.1-51.
  3. Castillo J. C., Knoepfle D. and Weyl G.(2017), "Surge pricing solves the wild goose chase," In Proceedings of the 2017 ACM Conference on Economics and Computation, pp.241-242.
  4. Ester M., Kriegel H. P., Sander J. and Xu X.(1996), "A density-based algorithm for discovering clusters in large spatial databases with noise," KDD-96 Proceedings, vol. 96, no. 34, pp.226-231.
  5. Guo S., Liu Y., Xu K. and Chiu D. M.(2017), "Understanding ride-on-demand service: Demand and dynamic pricing," In 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), IEEE, pp.509-514.
  6. Haws K. L. and Bearden W. O.(2006), "Dynamic pricing and consumer fairness perceptions," Journal of Consumer Research, vol. 33, no. 3, pp.304-311. https://doi.org/10.1086/508435
  7. Karypis G., Han E. H. and Kumar V.(1999), "Chameleon: Hierarchical clustering using dynamic modeling," Computer, vol. 32, no. 8, pp.68-75. https://doi.org/10.1109/2.781637
  8. Lu A., Frazier P. and Kislev O.(2018), Surge Pricing Moves Uber's Driver Partners, Available at SSRN 3180246.
  9. Mnih V., Kavukcuoglu K., Silver D., Graves A., Antonoglou I., Wierstra D. and Riedmiller M.(2013), Playing Atari with deep reinforcement learning, arXiv preprint arXiv:1312.5602.
  10. Mnih V., Kavukcuoglu K., Silver D., Rusu A. A., Veness J., Bellemare M. G. and Petersen S.(2015), "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp.529-533. https://doi.org/10.1038/nature14236
  11. Samworth R. J.(2012), "Optimal weighted nearest neighbour classifiers," The Annals of Statistics, vol. 40, no. 5, pp.2733-2763. https://doi.org/10.1214/12-AOS1049
  12. Song J., Cho Y. J., Kang M. H. and Hwang K. Y.(2020), "An Application of Reinforced Learning-Based Dynamic Pricing for Improvement of Ridesharing Platform Service in Seoul," Electronics, vol. 9, no. 11, p.1818. https://doi.org/10.3390/electronics9111818
  13. Sutton R. S. and Barto A. G.(2018), Reinforcement learning: An introduction, MIT Press.
  14. Van Otterlo M. and Wiering M.(2012), "Reinforcement learning and markov decision processes," In Reinforcement Learning, Springer, Berlin, Heidelberg, pp.3-42.
  15. Wu T., Joseph A. D. and Russell S. J.(2016), Automated pricing agents in the on-demand economy, University of California.

Cited by

  1. Zone-Agnostic Greedy Taxi Dispatch Algorithm Based on Contextual Matching Matrix for Efficient Maximization of Revenue and Profit vol.10, pp.21, 2020, https://doi.org/10.3390/electronics10212653