Optimizing E-Commerce with Ensemble Learning and Iterative Clustering for Superior Product Selection

  • Yuchen Liu (Department of Computing, Xi'an Jiaotong-Liverpool University)
  • Meng Wang (Department of Computing, Xi'an Jiaotong-Liverpool University)
  • Gangmin Li (HeXie Management Research Centre, College of Industry-Entrepreneurs (CIE), Xi'an Jiaotong-Liverpool University)
  • Terry R. Payne (Department of Computer Science, University of Liverpool)
  • Yong Yue (Department of Computing, Xi'an Jiaotong-Liverpool University)
  • Ka Lok Man (Department of Computing, Xi'an Jiaotong-Liverpool University)
  • Received: 2024.02.21
  • Accepted: 2024.09.26
  • Published: 2024.10.31

Abstract

With the continuous growth of e-commerce sales, a robust product selection model is essential for maintaining competitiveness and meeting consumer demand. Current research focuses primarily on single models for sales prediction and lacks an integrated approach to sales forecasting and product selection. To address these gaps, this paper proposes a comprehensive framework (VN-CPC) that combines sales forecasting with product selection. We integrate a set of classical machine learning models, including tree-based models (XGBoost, LightGBM, CatBoost), Support Vector Machine (SVM), Bayesian Ridge, and Artificial Neural Networks (ANN), using a voting mechanism to determine the optimal weighting scheme. On collected Amazon data, our method achieves a lower Root Mean Square Error (RMSE) than the individual models and other ensemble models. Furthermore, building on the predictive model, we employ a three-tiered clustering model (Initial Clustering, Refinement Clustering, and Final Clustering) to refine product selection to specific categories. The integrated forecasting and selection framework is suited to the dynamic e-commerce environment and gives businesses a robust tool to optimize their product offerings and stay ahead in a competitive market.
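
To make the idea of a weighted voting ensemble concrete, the following is a minimal sketch and not the paper's implementation. It assumes each base model has already produced predictions on a validation set, and it grid-searches non-negative weights that sum to 1 so that the blended prediction has the lowest RMSE. The function names (`rmse`, `blend`, `search_weights`), the model labels, and all numbers are hypothetical and chosen only for illustration.

```python
import itertools
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def blend(predictions, weights):
    """Weighted average of per-model predictions (one list per model)."""
    return [sum(w * p for w, p in zip(weights, column)) for column in zip(*predictions)]


def search_weights(y_true, predictions, steps=10):
    """Grid-search non-negative weights summing to 1 that minimise validation RMSE."""
    n_models = len(predictions)
    best_weights, best_score = None, float("inf")
    for combo in itertools.product(range(steps + 1), repeat=n_models):
        if sum(combo) != steps:  # keep only weight vectors that sum to 1
            continue
        weights = [c / steps for c in combo]
        score = rmse(y_true, blend(predictions, weights))
        if score < best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Toy validation targets and base-model predictions (made-up numbers, not paper data).
y_val = [10.0, 12.0, 9.0, 15.0, 11.0]
base_predictions = {
    "model_a": [11.0, 13.0, 10.0, 16.0, 12.0],  # consistently too high
    "model_b": [9.0, 11.0, 8.0, 14.0, 10.0],    # consistently too low
    "model_c": [10.5, 11.0, 9.5, 14.0, 12.0],   # mixed errors
}

weights, ensemble_rmse = search_weights(y_val, list(base_predictions.values()))
individual_rmse = {name: rmse(y_val, preds) for name, preds in base_predictions.items()}

print("weights:", dict(zip(base_predictions, weights)))
print("ensemble RMSE:", ensemble_rmse)
print("individual RMSE:", individual_rmse)
```

Because the weight grid includes the case where a single model receives all the weight, the blended RMSE on the validation set can never be worse than that of the best individual model. In practice the weights would be tuned on held-out data, and a hyperparameter optimizer could replace the exhaustive grid when there are many base models.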

Acknowledgement

This work is partially supported by the XJTLU AI University Research Centre, the Jiangsu Province Engineering Research Centre of Data Science and Cognitive Computation at XJTLU, and the SIP AI innovation platform (YZCXPT2022103). It is also partially funded by the Suzhou Municipal Key Laboratory for Intelligent Virtual Engineering (SZS2022004) and the XJTLU Key Program Special Fund (KSF-A-17).
