• Title/Abstract/Keywords: critical experiment

943 search results (processing time: 0.023 s)

Research on ITB Contract Terms Classification Model for Risk Management in EPC Projects: Deep Learning-Based PLM Ensemble Techniques (EPC 프로젝트의 위험 관리를 위한 ITB 문서 조항 분류 모델 연구: 딥러닝 기반 PLM 앙상블 기법 활용)

  • Hyunsang Lee;Wonseok Lee;Bogeun Jo;Heejun Lee;Sangjin Oh;Sangwoo You;Maru Nam;Hyunsik Lee
    • KIPS Transactions on Software and Data Engineering / Vol. 12, No. 11 / pp. 471-480 / 2023
  • South Korea's construction order volume grew significantly, from 91.3 trillion won in public orders in 2013 to a total of 212 trillion won in 2021, particularly in the private sector. As the domestic and overseas markets grew, the scale and complexity of EPC (Engineering, Procurement, Construction) projects increased, and risk management for project management and ITB (Invitation to Bid) documents became a critical issue. The time granted to construction companies in the bidding process following the EPC project award is limited, and reviewing all the risk terms in the ITB document is extremely challenging because of manpower and cost constraints. Previous research attempted to categorize the risk terms in EPC contract documents and detect them with AI, but data-related problems, such as the limited availability of labeled data and class imbalance, restricted practical use. Therefore, this study aims to develop an AI model that categorizes contract terms in detail based on the FIDIC Yellow 2017 (Federation Internationale Des Ingenieurs-Conseils Contract terms) standard, rather than defining and classifying risk terms as previous research did. A multi-text classification function is necessary because the contract terms that need detailed review may vary depending on the scale and type of the project. To enhance the performance of the multi-text classification model, we developed an ELECTRA PLM (Pre-trained Language Model) capable of efficiently learning the context of text data from the pre-training stage, and conducted a four-step experiment to validate the performance of the model. As a result, the ensemble of the self-developed ITB-ELECTRA model and Legal-BERT achieved the best performance, with a weighted average F1-Score of 76% in the classification of 57 contract terms.
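
The abstract above reports a weighted average F1-Score for an ensemble of two classifiers but does not say how the ensemble combines their outputs. The following is a minimal sketch, assuming soft voting (averaging the two models' class probabilities) purely for illustration; the function names `weighted_f1` and `soft_vote` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    labels = set(y_true) | set(y_pred)
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        score += support[label] / total * f1
    return score


def soft_vote(probs_a, probs_b):
    """Assumed ensembling rule: average two models' probabilities, take argmax."""
    predictions = []
    for pa, pb in zip(probs_a, probs_b):
        avg = [(x + y) / 2 for x, y in zip(pa, pb)]
        predictions.append(max(range(len(avg)), key=avg.__getitem__))
    return predictions


# Toy two-class example (hypothetical probabilities, not the paper's data).
model_a = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.4, 0.6]]
model_b = [[0.7, 0.3], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]]
y_true = [0, 0, 1, 1]
y_pred = soft_vote(model_a, model_b)
print(y_pred, weighted_f1(y_true, y_pred))
```

Weighting each class by its support matters when classes are imbalanced, which the abstract names as a problem in earlier work: frequent clause types dominate the average, so the metric reflects performance on the documents a reviewer would actually encounter.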

A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach (시맨틱 웹 자원의 랭킹을 위한 알고리즘: 클래스중심 접근방법)

  • Rho, Sang-Kyu;Park, Hyun-Jung;Park, Jin-Soo
    • Asia pacific journal of information systems / Vol. 17, No. 4 / pp. 31-59 / 2007
  • We frequently use search engines to find relevant information on the Web but still end up with too much information. To address this problem of information overload, ranking algorithms have been applied to various domains. As more information becomes available, ranking search results effectively and efficiently will become more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated from the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if many other pages refer to it. The degree of importance also increases if the referring pages are themselves important. Kleinberg's algorithm is another link-structure based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm uses two kinds of scores: the authority score and the hub score. A page with a high authority score is an authority on a given topic, and many pages refer to it. A page with a high hub score links to many authoritative pages. As mentioned above, link-structure based ranking has played an essential role in the World Wide Web (WWW), and its effectiveness and efficiency are now widely recognized. Meanwhile, because the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, which makes ranking algorithms for RDF knowledge bases highly important. The RDF graph consists of nodes and directional links, similar to the Web graph. As a result, link-structure based ranking seems highly applicable to ranking Semantic Web resources. 
However, the information space of the Semantic Web is more complex than that of the WWW. For instance, the WWW can be considered one huge class, i.e., a collection of Web pages, which has only one recursive property, i.e., a 'refers to' property corresponding to the hyperlinks. The Semantic Web, by contrast, encompasses various kinds of classes and properties, and consequently, ranking methods used in the WWW should be modified to reflect the complexity of its information space. Previous research addressed the problem of ranking query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm in order to rank Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, which correspond to Kleinberg's authority score and hub score, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of one resource on another depending on the characteristics of the property linking the two resources. A node with a high objectivity score is the object of many RDF triples, and a node with a high subjectivity score is the subject of many RDF triples. They developed several kinds of Semantic Web systems to validate their technique and presented experimental results verifying the applicability of their method to the Semantic Web. Despite their efforts, however, some limitations remained, which they reported in their paper. First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, for their algorithm to work properly, the ratio of links to nodes should be high, or the resources overall should be described in detail to a certain degree. 
Second, the Tightly-Knit Community (TKC) effect, the phenomenon in which pages that are less important but densely connected receive higher scores than pages that are more important but sparsely connected, remains problematic. Third, a resource may have a high score not because it is actually important, but simply because it is very common and consequently has many links pointing to it. In this paper, we examine such ranking problems from a novel perspective and propose a new algorithm that can solve the problems found in the previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach adopted by the previous research, under our approach a user determines the weight of a property by comparing its significance relative to the other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are meant to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. It closely reflects the way people evaluate things in the real world, and it turns out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm can resolve the TKC effect and can further shed light on other limitations noted in the previous research. In addition, we propose two ways to incorporate data-type properties, which have not previously been employed even when they bear on resource importance. We designed an experiment to show the effectiveness of our proposed algorithm and the validity of its ranking results, which had not been attempted in previous research. We also conducted a comprehensive mathematical analysis, which was overlooked in previous research. The mathematical analysis enabled us to simplify the calculation procedure. 
Finally, we summarize our experimental results and discuss further research issues.
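
The objectivity and subjectivity scores described in this abstract follow the same mutual-reinforcement idea as Kleinberg's authority and hub scores, applied to RDF triples with a weight per property. The following is a minimal sketch of that idea only, not the authors' class-oriented algorithm or Mukherjea and Bamba's exact formulation; the function name `weighted_hits`, the toy triples, and the property weights are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def normalize(scores, nodes):
    """Scale scores to unit Euclidean length; nodes without a score get 0."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {n: scores.get(n, 0.0) / norm for n in nodes}


def weighted_hits(triples, property_weights, iterations=50):
    """HITS-style scoring over (subject, property, object) triples.

    objectivity  ~ authority score: high when a node is the object of many triples
    subjectivity ~ hub score: high when a node is the subject of many triples
    """
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    subjectivity = {n: 1.0 for n in nodes}
    objectivity = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # An object gains score from the subjects that point to it.
        new_obj = defaultdict(float)
        for s, p, o in triples:
            new_obj[o] += property_weights.get(p, 1.0) * subjectivity[s]
        # A subject gains score from the objects it points to.
        new_subj = defaultdict(float)
        for s, p, o in triples:
            new_subj[s] += property_weights.get(p, 1.0) * new_obj[o]
        objectivity = normalize(new_obj, nodes)
        subjectivity = normalize(new_subj, nodes)
    return objectivity, subjectivity


# Toy RDF-like graph (hypothetical resources and weights).
triples = [
    ("paperA", "cites", "paperC"),
    ("paperB", "cites", "paperC"),
    ("paperA", "cites", "paperD"),
]
objectivity, subjectivity = weighted_hits(triples, {"cites": 1.0})
print(max(objectivity, key=objectivity.get))    # most-referenced resource
print(max(subjectivity, key=subjectivity.get))  # resource linking to the most
```

In this sketch the weights attach to properties globally, which corresponds to the predicate-oriented setting the abstract contrasts itself with; a class-oriented variant would instead let the weights depend on the class of the resource being ranked.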

A Morphological Study of Bamboos by Vascular Bundle Sheath (대나무류(類)의 유관속초(維管束鞘)에 의(依)한 형태학적(形態學的) 연구(硏究))

  • Kim, Jai Saing
    • Journal of Korean Society of Forest Science / Vol. 25, No. 1 / pp. 13-47 / 1975
  • Among the many species of bamboo, it is well known that the dwarf type is widely distributed in tropical regions and the slender type in the temperate zone. In the temperate zone, bamboos have differentiated extensively into one hundred species in 50 genera. In many oriental countries, bamboo wood is used as a material for construction and for the manufacture of technical instruments. The bamboo shoot is also regarded as a good and delicious edible resource. Moreover, recent medical investigation verifies that the sap of certain bamboo species has an antibiotic effect against cancer. Fortunately, it is very easy to propagate bamboo by using cuttings from southeastern Asian countries. This important resource can also serve as a significant source of pulp, which is becoming increasingly important. Although its importance has been emphasized, a classification system for bamboo has not been completely established to date. A classification method based on the morphological characteristics of flowers, initiated by Carl von Linné in the 18th century, was the first step in developing a classification. But this was not an easy task, because this type of classification system is based on the sexual organs of bamboo. Bamboo has a long life cycle of 60-120 years, so classification by this method was very difficult: material for classification is not abundant and some species have changed, even though many references on the morphological classification of bamboo are available nowadays. Identifying bamboo according to this morphological classification system is therefore not practical. Consequently, a classification system for bamboo based on endomorphological characteristics was initiated by Chinese-born Liese. 
A classification method based on the morphological characteristics of the vascular bundle was then developed by Grosser. These classification methods are fundamentally related to Holttum's classification method, which stressed the morphology of the ovary. The author sought to establish a new classification method based on the vascular bundle sheath. Twenty-six species in 11 genera originating from Formosa were used in the study. The results obtained were somewhat consistent with those of Grosser. Many difficulties were found in distinguishing the species of Bambusa and Dendrocalamus. These two genera were clearly differentiated under the new classification system, which is based on the existence of a separated vascular bundle sheath in the bamboo. According to these results, it is recommended that Bambusa be divided into two groups, placed either as subspecies or in lower categories. This recommendation is supported by the observation that the evolutionary pattern of the bamboo trunk runs from outward to inward. It is also supported by the viewpoint that the fundamental hypothesis in evolution is from simple to complex. Many problems remain to be solved through more critical examination, by comparing these results with those of the classification based on the sexual organs. The author observed the cross-sectional figure of the vascular bundles of the bamboo trunk and compared the results with the groups of Grosser and Liese, i.e., groups A, $B_1$, $B_2$, C, and D. Groups A and $B_2$ were in accordance with the results of those scholars, while group D showed many differences. Grosser and Liese divided bamboo into "g" type and "h" type according to the vascular bundle type, and they included Dendrocalamus and Bambusa in group D without considering the type of vascular bundle sheath. However, the results obtained by the author showed that Dendrocalamus and Bambusa are differentiated from each other 
when another type, "i", identified by the existence of a separated vascular bundle sheath, is considered. Bambusa was shown to have a separated vascular bundle sheath, while Dendrocalamus does not. Moreover, Bambusa showed peculiar characteristics in the figure of vascular development, i.e., one form with an inward vascular bundle sheath and the other with a bivascular bundle sheath (inward and outward). In conclusion, the bamboo species used in this experiment were classified into group D, without any separated vascular bundle sheath, and group E, with a separated vascular bundle sheath. Group E was divided into two groups, i.e., group $E_1$, with a bivascular sheath, and group $E_2$, with only an inward vascular sheath. Therefore, Bambusa, which Grosser and Liese placed in group D, was included in group E. Dendrocalamus seemed to be the middle group between group $E_1$ and group $E_2$ under this classification system, which is summarized as follows: Phyllostachys-type: Group A - Phyllostachys, Chimonobambusa, Arundinaria, Pseudosasa, Pleioblastus, Yushania; Pome-type: Group $B_2$ - Schizostachyum, Melocanna; Hemp-type: Group D - Dendrocalamus; Bambu-type: Group $E_1$ - Bambusa ghi.
