• Title/Summary/Keyword: Tree algorithm


Hybrid machine learning with HHO method for estimating ultimate shear strength of both rectangular and circular RC columns

  • Quang-Viet Vu;Van-Thanh Pham;Dai-Nhan Le;Zhengyi Kong;George Papazafeiropoulos;Viet-Ngoc Pham
    • Steel and Composite Structures
    • /
    • v.52 no.2
    • /
    • pp.145-163
    • /
    • 2024
  • This paper presents six novel hybrid machine learning (ML) models that combine support vector machine (SVM), decision tree (DT), random forest (RF), gradient boosting (GB), extreme gradient boosting (XGB), and categorical gradient boosting (CGB) models with the Harris Hawks Optimization (HHO) algorithm. These models, namely HHO-SVM, HHO-DT, HHO-RF, HHO-GB, HHO-XGB, and HHO-CGB, are designed to predict the ultimate shear strength of both rectangular and circular reinforced concrete (RC) columns. The prediction models are established using a comprehensive database of 325 experimental data points for rectangular columns and 172 for circular columns. The ML model hyperparameters are optimized through a combination of cross-validation and HHO (a minimal sketch of this tuning loop follows below). The performance of the hybrid ML models is evaluated and compared using various metrics, ultimately identifying the HHO-CGB model as the top performer for predicting the ultimate shear strength of both rectangular and circular RC columns. Its mean R-value and mean a20-index are relatively high, reaching 0.991 and 0.959, respectively, while its mean absolute error and root mean square error are low (10.302 kN and 27.954 kN, respectively). A further comparison with four existing formulas validates the efficiency of the proposed HHO-CGB model. The Shapley Additive Explanations (SHAP) method is applied to analyze the contribution of each variable to the output of the HHO-CGB model, providing insights into the local and global influence of the variables. The analysis reveals that the depth of the column, the length of the column, and the axial loading exert the most significant influence on the ultimate shear strength of RC columns. A user-friendly graphical interface tool is then developed based on the HHO-CGB model to facilitate practical and cost-effective usage.
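
A minimal sketch of the tuning loop described above, assuming scikit-learn and synthetic placeholder data; a plain random search stands in for the Harris Hawks Optimization step that the paper uses to drive the hyperparameter search:

```python
# Hedged sketch: cross-validated hyperparameter search for a gradient-boosting
# shear-strength model. A random search stands in here for HHO, and the arrays
# are placeholders for the 325 rectangular / 172 circular column records.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((100, 6))      # placeholder features (geometry, reinforcement, axial load, ...)
y = rng.random(100) * 500.0   # placeholder ultimate shear strength in kN

best_score, best_params = -np.inf, None
for _ in range(30):           # HHO would propose these candidates instead of random draws
    params = {
        "n_estimators": int(rng.integers(100, 500)),
        "learning_rate": float(rng.uniform(0.01, 0.3)),
        "max_depth": int(rng.integers(2, 6)),
    }
    model = GradientBoostingRegressor(**params, random_state=0)
    score = cross_val_score(model, X, y, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    if score > best_score:
        best_score, best_params = score, params

print("best CV RMSE:", -best_score, "with", best_params)
```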

Comparison and Evaluation of Data Collection System Database for Edge-Based Lightweight Platform (엣지 기반 경량화 플랫폼을 위한 데이터 수집 시스템의 데이터베이스 비교 및 평가)

  • Woojin Cho;Chae-young Lim;Jae-hoi Gu
    • Journal of Platform Technology
    • /
    • v.11 no.5
    • /
    • pp.49-58
    • /
    • 2023
  • Factory energy management systems are growing and evolving rapidly due to factors such as the 3rd Basic Energy Plan, rising global energy costs, and environmental issues. However, implementing the data collection system essential for energy management in factory settings, which have limited space and unique characteristics, presents spatial, environmental, and energy-related challenges. This paper seeks to mitigate these challenges by devising a data collection system built on an edge-based lightweight platform, and it compares and evaluates database operation on edge devices. For the evaluation, a benchmarking tool called CDI Benchmark is developed, reflecting the characteristics of existing factories involved in practical applications (a sketch of the insertion-timing idea appears below). The evaluation shows that an RDBMS such as MySQL encountered database errors under the high data-insertion load and became inoperable. InfluxDB, thanks to its highly efficient compression algorithm, achieved compression rates about 6 times higher than MyRocks. However, MyRocks outperformed InfluxDB by a significant margin in processing speed, with maximum processing times approximately 80 times faster than InfluxDB.
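
A minimal sketch of the insertion-load timing idea behind a CDI-Benchmark-style test, assuming Python's built-in sqlite3 as a stand-in for the MySQL / MyRocks / InfluxDB back ends compared in the paper; the schema and row counts are illustrative:

```python
# Hedged sketch: time how long a database takes to absorb a burst of
# sensor-style rows, the core of an insertion-load benchmark.
import sqlite3
import time

def benchmark_inserts(n_rows: int = 100_000) -> float:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (ts REAL, sensor_id INTEGER, value REAL)")
    rows = [(time.time() + i, i % 50, float(i)) for i in range(n_rows)]
    start = time.perf_counter()
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed

print(f"{benchmark_inserts():.3f} s to insert 100,000 rows")
```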


Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.21 no.4
    • /
    • pp.64-80
    • /
    • 2018
  • In Korea, citizens have access only to general information about crime, so it is difficult for them to know how much they are exposed to it. If the police can predict crime-risky areas, they can respond to crime efficiently even with limited police and enforcement resources. However, there is no such prediction system in Korea and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of local real crime information and urban physical and non-physical data, developed a crime prediction model through machine learning, and finally assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to improve public understanding. Among the factors affecting crime occurrence identified in previous and case studies, the data processed into big-data form for machine learning were: real crime information, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, which are known to be powerful and accurate in various fields, were used to construct the crime prediction model, and the decision tree model with the lowest RMSE was selected as the optimal prediction model (a minimal version of this comparison is sketched below). Based on this model, several scenarios were set for theft and violence, the most frequent crimes in the case city J, and the probability of crime was estimated on a 250 m × 250 m grid. We found that high crime-risk areas occur in three patterns in case city J. The probability of crime was divided into three classes and visualized on the map by 250 m × 250 m grid cells. In conclusion, we developed a crime prediction model using machine learning and visualized the crime-risky areas in a map that can recalculate the model and re-visualize the results as time and urban conditions change.
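
A minimal version of the model-comparison step, assuming scikit-learn and synthetic placeholder arrays in place of the real crime and urban data set; the model with the lowest RMSE is kept, as in the study:

```python
# Hedged sketch: fit decision tree, random forest, and SVM regressors and
# select the one with the lowest RMSE on held-out data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.random((500, 15))   # weather + urban-form features per grid cell (placeholder)
y = rng.random(500)         # crime occurrence measure per cell (placeholder)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "decision_tree": DecisionTreeRegressor(random_state=42),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=42),
    "svm": SVR(),
}
rmse = {name: mean_squared_error(y_te, m.fit(X_tr, y_tr).predict(X_te)) ** 0.5
        for name, m in models.items()}
print(rmse, "-> selected:", min(rmse, key=rmse.get))
```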

Mining Frequent Trajectory Patterns in RFID Data Streams (RFID 데이터 스트림에서 이동궤적 패턴의 탐사)

  • Seo, Sung-Bo;Lee, Yong-Mi;Lee, Jun-Wook;Nam, Kwang-Woo;Ryu, Keun-Ho;Park, Jin-Soo
    • Journal of Korea Spatial Information System Society
    • /
    • v.11 no.1
    • /
    • pp.127-136
    • /
    • 2009
  • This paper proposes an on-line algorithm for mining moving-trajectory patterns in RFID data streams that accounts for their time-changing characteristics and the constraint of single-pass data scanning. As RFID, sensor, and mobile network technologies have developed rapidly, many researchers have focused on gathering real-world data in real time and mining useful patterns from it. Previous research on sequential patterns or moving-trajectory patterns over stream data is extremely time-consuming because of multi-pass database scans and tree traversal, and it also does not consider the time-changing characteristics of stream data. The proposed method preserves the sequential strength of length-2 frequent patterns in a binary relationship table, using a time-evolving graph to reflect exactly how the RFID data stream changes over time (a single-pass counting sketch follows below). In addition, to avoid repeated data scans, the proposed algorithm infers candidate length-k moving-trajectory patterns in advance at time point t and then extracts the patterns after screening the candidates with only one pass at time point t+1. Experiments show that the proposed method outperforms the Apriori-like method in both time and space complexity, with a candidate-set reduction ratio of about 7 percent.
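
A single-pass counting sketch in the spirit of the length-2 pattern table, with made-up trajectories; the paper's time-evolving graph and candidate inference at t+1 are not reproduced here:

```python
# Hedged sketch: maintain counts of length-2 location transitions while
# reading an RFID stream exactly once.
from collections import defaultdict
from itertools import pairwise

pair_counts = defaultdict(int)

def consume(trajectory: list[str]) -> None:
    """One-pass update of length-2 pattern counts for one tag's trajectory."""
    for a, b in pairwise(trajectory):
        pair_counts[(a, b)] += 1

stream = [["gate", "dock", "shelf"], ["gate", "shelf"], ["dock", "shelf", "gate"]]
for traj in stream:
    consume(traj)

min_support = 2
frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_support}
print(frequent_pairs)   # building blocks for longer candidate patterns at t+1
```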


An Optimized Combination of π-fuzzy Logic and Support Vector Machine for Stock Market Prediction (주식 시장 예측을 위한 π-퍼지 논리와 SVM의 최적 결합)

  • Dao, Tuanhung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.43-58
    • /
    • 2014
  • As the use of trading systems has increased rapidly, many researchers have become interested in developing effective stock market prediction models using artificial intelligence techniques. Stock market prediction involves multifaceted interactions between market-controlling factors and unknown random processes, and a successful prediction model achieves the most accurate result from minimal input data with the least complex model. In this research, we develop a model that combines π-fuzzy logic and support vector machine (SVM) models, using a genetic algorithm to optimize the parameters of the SVM and the π-fuzzy functions as well as to perform feature subset selection, in order to improve stock market prediction performance (a simplified π membership function is sketched below). To evaluate the proposed model, we compare it with other models, including logistic regression, multiple discriminant analysis, classification and regression tree, artificial neural network, SVM, and fuzzy SVM models, on the same data. The results show that our model outperforms all comparative models in prediction accuracy as well as in return on investment.
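
A simplified, piecewise-linear stand-in for the π-shaped membership function (the paper's actual π functions are parameterized and tuned by the genetic algorithm); parameter names and indicator values are illustrative:

```python
# Hedged sketch: map a raw technical indicator into [0, 1] with a pi-shaped
# membership function before feeding it to an SVM. This linear version
# approximates the usual quadratic S/Z-curve definition.
import numpy as np

def pi_membership(x, a, b, c, d):
    """Rise on [a, b], plateau on [b, c], fall on [c, d]."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    rise = (x > a) & (x < b)
    y[rise] = (x[rise] - a) / (b - a)
    y[(x >= b) & (x <= c)] = 1.0
    fall = (x > c) & (x < d)
    y[fall] = (d - x[fall]) / (d - c)
    return y

indicator = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # e.g. normalized momentum values
print(pi_membership(indicator, a=-1.0, b=-0.2, c=0.2, d=1.0))
```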

Shifts of Geographic Distribution of Pinus koraiensis Based on Climate Change Scenarios and GARP Model (GARP 모형과 기후변화 시나리오에 따른 잣나무의 지리적 분포 변화)

  • Chun, Jung Hwa;Lee, Chang Bae;Yoo, So Min
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.17 no.4
    • /
    • pp.348-357
    • /
    • 2015
  • The main purpose of this study is to understand the potential geographic distribution of P. koraiensis, one of the major economic tree species, under the RCP (Representative Concentration Pathway) 8.5 scenario, using its current geographic distribution from National Forest Inventory (NFI) data and ecological niche modeling. P. koraiensis abundance data extracted from the NFI were used to estimate the current geographic distribution, and the GARP (Genetic Algorithm for Rule-set Production) model, one of the ecological niche models, was applied to estimate the potential geographic distribution and project future changes. Environmental explanatory variables with an Area Under Curve (AUC) value greater than 0.6 were selected by running the model for each of the 27 candidate variables and were combined into the final model (a variable-screening sketch follows below). Model validation based on confusion-matrix statistics showed quite high suitability. P. koraiensis is currently distributed widely from 300 m to 1,200 m in altitude and from south to north as a result of the national greening project of the 1970s, although the major populations are found in high-elevation and northern areas. The results of this study successfully describe the current distribution of P. koraiensis and project its future changes: the future model suggests that large areas predicted suitable under current climate conditions may contract by the 2090s, indicating dramatic habitat loss. Considering the increasing atmospheric CO2 concentration and air temperature in Korea, P. koraiensis is likely to experience a significant decrease in its potential distribution range. The final model may be used to identify climate change impacts on the distribution of P. koraiensis in Korea, and a deeper understanding of these relationships may help when planning afforestation strategies.
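
A minimal sketch of the AUC-based variable screening, assuming scikit-learn and synthetic presence/absence data in place of the NFI records and the 27 environmental layers:

```python
# Hedged sketch: keep only predictors whose single-variable AUC for
# presence/absence exceeds 0.6 before building the niche model.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
presence = rng.integers(0, 2, size=300)                 # presence/absence (placeholder)
predictors = {f"env_{i}": rng.random(300) for i in range(27)}

selected = {}
for name, values in predictors.items():
    auc = roc_auc_score(presence, values)
    auc = max(auc, 1.0 - auc)        # direction-agnostic discrimination
    if auc > 0.6:
        selected[name] = round(auc, 3)

print("variables kept for the niche model:", selected)
```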

Design of ATM Switch-based on a Priority Control Algorithm (우선순위 알고리즘을 적용한 상호연결 망 구조의 ATM 스위치 설계)

  • Cho Tae-Kyung;Cho Dong-Uook;Park Byoung-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.4 no.4
    • /
    • pp.189-196
    • /
    • 2004
  • Most recent research on ATM switches has been based on multistage interconnection networks, which are known for their regularity and self-routing property and can switch packets simultaneously and in parallel. However, they are blocking networks in the sense that packets can collide with one another; Banyan networks have mainly been used as the underlying structure. There are several ways to reduce blocking or increase the throughput of banyan-type switches: increasing the internal link speeds, placing buffers in each switching node, using multiple paths, distributing the load evenly in front of the banyan network, and so on. This paper therefore proposes the use of a recirculating shuffle-exchange network to reduce blocking and improve hardware complexity. The proposed structure consists of a recirculating shuffle-exchange network, which simplifies the hardware, and a tree-structured Rank network, which assigns priority numbers to packets destined for the same output, sends only the highest-priority packet on to the next network, and recirculates the others to the previous network. The packets transferred into the banyan network are self-routed through a decomposition and composition algorithm until they all arrive at their final destinations. To analyze throughput, waiting time, and packet loss ratio according to buffer size, packet arrivals are modeled by a binomial distribution (a small queueing simulation in this spirit is sketched below). At a load of 50 percent, a buffer size of more than 15 yields an acceptable packet loss ratio. This paper thus simplifies hardware complexity by using a recirculating shuffle-exchange network instead of a bitonic sorter.
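
A small queueing simulation in the same spirit as the loss analysis, assuming binomially distributed arrivals to a single buffered output and one packet served per slot; the parameters are illustrative, not the paper's exact switch model:

```python
# Hedged sketch: estimate packet loss ratio versus buffer size under
# binomially distributed arrivals at 50% offered load.
import random

def loss_ratio(n_inputs=16, load=0.5, buffer_size=15, slots=100_000, seed=1):
    rng = random.Random(seed)
    queue, arrived, lost = 0, 0, 0
    for _ in range(slots):
        arrivals = sum(rng.random() < load / n_inputs for _ in range(n_inputs))
        arrived += arrivals
        lost += max(0, arrivals - (buffer_size - queue))
        queue = min(buffer_size, queue + arrivals)
        queue = max(0, queue - 1)          # serve one packet per slot
    return lost / arrived if arrived else 0.0

for b in (5, 10, 15, 20):
    print(f"buffer={b:2d}  loss={loss_ratio(buffer_size=b):.2e}")
```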


Construction of Research Fronts Using Factor Graph Model in the Biomedical Literature (팩터그래프 모델을 이용한 연구전선 구축: 생의학 분야 문헌을 기반으로)

  • Kim, Hea-Jin;Song, Min
    • Journal of the Korean Society for information Management
    • /
    • v.34 no.1
    • /
    • pp.177-195
    • /
    • 2017
  • This study attempts to infer research fronts (RFs) using a factor graph model based on heterogeneous features. The model infers which documents have the potential to be cited multiple times in the future. To this end, the documents are represented by bibliographic, network, and content features. Bibliographic features include the number of authors, the number of institutions to which the authors belong, proceedings, the number of author-provided keywords, funding, the number of references, the number of pages, and the journal impact factor. Network features include degree centrality, betweenness, and closeness in the document network. Content features include keywords extracted from the title and abstract using keyphrase extraction techniques. The model learns these features of a publication and infers whether the document is an RF using the sum-product and junction tree algorithms on a factor graph (a toy message-passing example follows below). We experimentally demonstrate that, when predicting RFs, the factor graph model predicted more densely connected documents than the RFs constructed using a traditional bibliometric approach; the predicted documents also exhibit stronger degree centrality and betweenness among RFs.
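
A toy sum-product example on a tree-shaped factor graph with two binary "is this document a research front?" variables; the potentials are made-up numbers purely to show the message computation, not values from the study:

```python
# Hedged sketch: sum-product message passing on a two-variable factor graph.
import numpy as np

f1 = np.array([0.4, 0.6])            # unary factor for doc 1 over [not RF, RF]
f2 = np.array([0.7, 0.3])            # unary factor for doc 2
f12 = np.array([[0.9, 0.1],          # pairwise factor: linked documents tend to agree
                [0.1, 0.9]])

msg_2_to_1 = f12 @ f2                # message from variable 2 through f12 to variable 1
belief_1 = f1 * msg_2_to_1
belief_1 /= belief_1.sum()           # normalized marginal for doc 1

print("P(doc 1 is a research front) ~", belief_1[1])
```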

Efficient Coding of Motion Vector and Mode Information for H.264/AVC (H.264/AVC에서 효율적인 움직임 벡터와 모드 정보의 압축)

  • Lee, Dong-Shik;Kim, Young-Mo
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.10
    • /
    • pp.1359-1365
    • /
    • 2008
  • In H.264, the share of the header in the bitstream is higher than in previous standards, despite its better compression efficiency. This paper therefore proposes a new technique to compress the H.264 header. H.264 unifies its syntax elements and encodes them with the Exp-Golomb method without considering their actual distributions, which is ineffective for variable-length coding (a short Exp-Golomb sketch follows below). Most of the header consists of block types and motion vector differences, and three kinds of redundancy in the H.264 header are analyzed in this paper: block types contain both frequently and rarely occurring symbols; when mode 8 is selected for a macroblock, all four sub-macroblock types are transmitted; and repeated values, especially '0', appear in the motion vector differences. This paper proposes an algorithm that uses a type code and a quadtree to represent this redundant header information, where the type code indicates the shape of the macroblock and the quadtree represents the tree-structured motion compensation. Experimental results show that the proposed algorithm reduces the total number of encoded bits compared to JM12.4 by up to 32.51%.
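
A short sketch of the unsigned Exp-Golomb code that the paper identifies as ill-suited to skewed block-type and motion-vector-difference distributions; the symbol values below are illustrative:

```python
# Hedged sketch: unsigned Exp-Golomb encoding and decoding.
def exp_golomb_encode(value: int) -> str:
    """Write (value + 1) in binary, prefixed by (length - 1) zero bits."""
    bits = bin(value + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def exp_golomb_decode(code: str) -> int:
    leading_zeros = len(code) - len(code.lstrip("0"))
    return int(code[leading_zeros:], 2) - 1

for symbol in (0, 1, 2, 3, 7):        # e.g. block-type indices or |MVD| values
    code = exp_golomb_encode(symbol)
    print(symbol, "->", code, "->", exp_golomb_decode(code))
```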


A Strategy of the Link Saving Routing and Its Characteristics for QoS Aware Energy Saving(QAES) in IP Networks (IP Network에서 QoS Aware Energy Saving(QAES)을 위한 링크 절약 라우팅의 한 방법 및 특성)

  • Han, Chimoon;Kim, Sangchul
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.5
    • /
    • pp.76-87
    • /
    • 2014
  • Today the energy consumption of ICT networks is about 10% of worldwide power consumption and is predicted to increase remarkably in the near future. For this reason, this paper studies energy-saving strategies that preserve network-level QoS. In these strategies, the energy consumption of the NICs (network interface cards) at both endpoints of a link is reduced by selecting links and putting them to sleep when the total traffic volume of the IP network is below a threshold. We propose a heuristic routing algorithm based on so-called delegating/delegated routers and evaluate its characteristics with computer simulations that account for network-level QoS. Sleep links are selected according to either the number of traffic paths (min_used path) or the amount of traffic (min_used traffic) carried over those links (a small link-selection sketch follows below). In our experiments, the min_used traffic method shows slightly better energy saving but longer paths than the min_used path method, and both methods save more energy than random selection. This paper confirms that the delegating/delegated router-based routing algorithm yields energy savings while sustaining network-level QoS in IP networks.
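
A small sketch of the "min_used path" selection idea, assuming the networkx library and a toy topology; the guard here only checks connectivity, whereas the paper's algorithm also preserves network-level QoS through the delegating/delegated routers:

```python
# Hedged sketch: count how many shortest paths cross each link, then sleep the
# least-used link that does not disconnect the network.
import networkx as nx

G = nx.cycle_graph(6)                        # toy IP topology
G.add_edges_from([(0, 3), (1, 4)])           # extra links that may be put to sleep

usage = {tuple(sorted(e)): 0 for e in G.edges}
for src in G.nodes:
    for dst in G.nodes:
        if src < dst:
            path = nx.shortest_path(G, src, dst)
            for u, v in zip(path, path[1:]):
                usage[tuple(sorted((u, v)))] += 1

for link, count in sorted(usage.items(), key=lambda kv: kv[1]):
    H = G.copy()
    H.remove_edge(*link)
    if nx.is_connected(H):                   # never partition the network
        print(f"sleep link {link} (used by {count} shortest paths)")
        break
```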