A Study on Searching for Export Candidate Countries of the Korean Food and Beverage Industry Using Node2vec Graph Embedding and Light GBM Link Prediction

  • Lee, Jae-Seong (Science and Technology Management Policy, University of Science & Technology) ;
  • Jun, Seung-Pyo (Data Analysis Division, Korea Institute of Science and Technology Information / Science and Technology Management Policy, University of Science & Technology) ;
  • Seo, Jinny (Data Analysis Division, Korea Institute of Science and Technology Information)
  • Received : 2021.10.30
  • Reviewed : 2021.11.30
  • Published : 2021.12.31

Abstract

This study uses the Node2vec graph embedding method and Light GBM link prediction to explore untapped export candidate countries for Korea's food and beverage industry. Node2vec improves the representation of a network's structural equivalence, which had been regarded as a relative weakness in comparison with existing link prediction methods based on measures such as the number of common neighbors. As a result, the method is known to perform well in both community detection and structural equivalence. The embedding is built from node sequences of constant length that start from arbitrarily designated nodes, so the resulting vectors are easy to use as input to models for downstream tasks, such as Logistic Regression, Support Vector Machine, and Random Forest. Based on these features, this study applied the Node2vec graph embedding method to international trade data for the Korean food and beverage industry. In doing so, we aim to contribute to extensive margin diversification for Korea within the industry's global value chain relationships.

The best predictive model obtained in this study recorded a precision of 0.95, a recall of 0.79, and an F1 score of 0.86, showing excellent performance. It outperformed the baseline model, a binary classifier based on Logistic Regression, which recorded a precision of 0.95, a recall of 0.73, and an F1 score of 0.83. The Light GBM-based model also outperformed the link prediction model of a previous study, which serves as the benchmark model here. The benchmark model recorded a recall of only 0.75, whereas the proposed model reached a recall of 0.79.

The difference in performance between the benchmark model and the model of this study stems from the model training strategy. In this study, trades were grouped by trade value, and prediction models were trained differently for these groups. Three strategies were compared: (1) randomly masking trades drawn from all trades, with no condition on trade value, and training the model; (2) randomly masking some of the trades whose value is at or above the average and training the model; and (3) randomly masking some of the trades whose value is in the top 25% and training the model. The experiments confirmed that the model trained by randomly masking some of the trades with above-average trade value performed best and most stably.

Additional investigation showed that most of the potential export candidate countries for Korea identified by this model were appropriate. Taken together, these results suggest the practical utility of a link prediction method that applies Node2vec and Light GBM. They also yield useful implications for weight update strategies that allow link prediction to be performed more effectively while training the model. In addition, this study has policy utility because it applies link prediction based on graph embedding to trade transactions, an area in which such research has rarely been conducted. The results support a rapid response to changes in the global value chain, such as the recent US-China trade conflict and Japan's export regulations, and we believe they are sufficiently useful as a tool for policy decision-making.
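
As an illustrative sketch only, and not the authors' implementation, the three masking strategies and the reported F1 scores can be expressed in plain Python. The trade tuples, the importer labels, the strategy names, the `fraction` and `seed` parameters, and the use of the 75th percentile as the top-25% cutoff are all assumptions introduced for illustration; the abstract does not state the actual thresholds or masking ratio.

```python
import random
from statistics import mean, quantiles


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def candidate_pool(trades, strategy):
    """Return the trades eligible for masking under one strategy.

    trades: list of (exporter, importer, trade_value) tuples (hypothetical layout).
    strategy: "all", "above_mean", or "top_quartile" (hypothetical names).
    """
    values = [value for _, _, value in trades]
    if strategy == "all":
        return list(trades)  # (1) no condition on trade value
    if strategy == "above_mean":
        threshold = mean(values)  # (2) average trade value or higher
    elif strategy == "top_quartile":
        threshold = quantiles(values, n=4)[2]  # (3) assumed 75th-percentile cutoff
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [trade for trade in trades if trade[2] >= threshold]


def mask_edges(trades, strategy, fraction=0.2, seed=0):
    """Randomly hide a fraction of the eligible trades.

    Returns (train_edges, held_out_edges); the held-out edges would serve
    as positive examples when evaluating a link predictor.
    """
    rng = random.Random(seed)
    pool = candidate_pool(trades, strategy)
    k = max(1, int(len(pool) * fraction))
    held_out = set(rng.sample(pool, k))
    train = [trade for trade in trades if trade not in held_out]
    return train, sorted(held_out)


# Toy data: made-up export values from Korea to eight placeholder importers.
TOY_TRADES = [("KOR", name, value) for name, value in
              zip("ABCDEFGH", [10, 20, 30, 40, 50, 60, 70, 80])]

if __name__ == "__main__":
    for strategy in ("all", "above_mean", "top_quartile"):
        train, held_out = mask_edges(TOY_TRADES, strategy, fraction=0.5)
        print(strategy, "held out:", held_out)
    print("proposed F1:", round(f1_score(0.95, 0.79), 2))
    print("baseline F1:", round(f1_score(0.95, 0.73), 2))
```

With the precision and recall values quoted above, `f1_score(0.95, 0.79)` is about 0.863 and `f1_score(0.95, 0.73)` is about 0.826, which round to the reported 0.86 and 0.83. In a full pipeline the training edges would feed the graph embedding step and the held-out edges would be scored by the classifier; neither step is shown here.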

Keywords

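Project Information

This research was supported by research funding from the Ministry of Trade, Industry and Energy and the Korea Evaluation Institute of Industrial Technology (KEIT) in 2021 (20009398).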
