Advanced CBS (Cost Breakdown Structure) Code Search Technology Applying NLP (Natural Language Processing) of Artificial Intelligence

  • Kim, HanDo (Koryo Software Inc)
  • Nam, JeongYong (Koryo Software Inc)
  • Received : 2024.08.07
  • Accepted : 2024.09.30
  • Published : 2024.10.01

Abstract

For efficient construction management, linking BIM with schedule and cost data is essential, but the application of 5D BIM is limited by the difficulty of breaking down thousands of WBS and CBS items. To address this, a standardized WBS-CBS set is configured in advance. When a new construction project arises, each CBS item in the BOQ is matched against the standard CBS (Public Procurement Service standard construction code) entries of the already linked set, and once the entry with the most similar text is found, the item is automatically linked to the corresponding WBS. This study used artificial intelligence natural language processing techniques to compare the text similarity of CBS items more efficiently. First, we created a civil term dictionary (CTD) that organizes the words used in civil projects and assigns numerical values to them. We then tokenized the text of all CBS items into the words defined in the dictionary, converted the tokens into TF-IDF vectors, and judged similarity by cosine similarity. Additionally, considering the hierarchical structure of the CBS and changing keywords raised the search success rate to nearly 70 %. The threshold value for judging similarity was 0.62 (1: perfect match, 0: no match).
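The matching pipeline described in the abstract (dictionary-based tokenization, TF-IDF vectors, cosine similarity, and a 0.62 cut-off) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tiny English word list standing in for the CTD, the `C-00x` codes, the regex tokenizer, and the smoothed IDF formula are all assumptions made for illustration; the actual system works on Korean CBS text and a far larger dictionary.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the civil term dictionary (CTD).
CTD = {"concrete", "placing", "pump", "rebar", "assembly",
       "formwork", "earthwork", "excavation"}

# Hypothetical standard CBS entries (code -> description).
STANDARD_CBS = {
    "C-001": "concrete placing pump",
    "C-002": "rebar assembly",
    "C-003": "formwork assembly",
    "C-004": "earthwork excavation",
}

THRESHOLD = 0.62  # similarity cut-off reported in the abstract


def tokenize(text, vocabulary=CTD):
    """Split text into lowercase words and keep only dictionary terms."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w in vocabulary]


def build_idf(token_lists):
    """Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query, standard_items=STANDARD_CBS, threshold=THRESHOLD):
    """Return (code, score); code is None when the best score is below threshold."""
    codes = list(standard_items)
    token_lists = [tokenize(standard_items[c]) for c in codes]
    idf = build_idf(token_lists)
    query_vec = tfidf(tokenize(query), idf)
    best_code, best_score = None, 0.0
    for code, tokens in zip(codes, token_lists):
        score = cosine(query_vec, tfidf(tokens, idf))
        if score > best_score:
            best_code, best_score = code, score
    return (best_code if best_score >= threshold else None), best_score


if __name__ == "__main__":
    for item in ["Concrete placing (pump car)", "Formwork", "Asphalt paving"]:
        print(item, "->", best_match(item))
```

In this sketch, a BOQ item whose dictionary terms coincide with a standard entry scores close to 1, while an item with no dictionary terms scores 0 and is left unlinked. The hierarchical-structure and keyword-change refinements mentioned in the abstract are not modelled here.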

Acknowledgement

The research was supported by MOLIT (Ministry of Land, Infrastructure and Transport) and KAIA (Korea Agency for Infrastructure Technology Advancement), through 'Smart Construction Technology Development (RS-2020-KA158708)' led by Korea Expressway Corporation.
