• Title/Summary/Keyword: ensemble method


Link Prediction in Bipartite Network Using Composite Similarities

  • Bijay Gaudel;Deepanjal Shrestha;Niosh Basnet;Neesha Rajkarnikar;Seung Ryul Jeong;Donghai Guan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.8 / pp.2030-2052 / 2023
  • Analysis of bipartite (two-mode) networks is a significant research area for understanding the formation of social communities, economic systems, drug side-effect topology, and similar structures in complex information systems. Most previous work relies on a projection-based model or a latent feature model, which predicts links from a single similarity measure. Projection-based models lose structural information in the projected network, and latent features are rarely present. This work proposes a novel method for link prediction in bipartite networks based on an ensemble of composite similarities, overcoming the issues of model-based and latent feature models. The proposed method analyzes the structure, the neighborhood nodes, and the latent attributes between nodes to predict links in the network. To illustrate the method, experiments are performed on five real-world data sets and compared with various state-of-the-art link prediction methods; the proposed method outperforms them by roughly 3% to 9% in area under the precision-recall curve (AUC-PR). This work is significant for the study of biological networks, e-commerce networks, complex web-based systems, drug-binding networks, enzyme-protein networks, and other related networks, where it helps explain how such complex networks form. The study also supports the use of link prediction for purposes ranging from building intelligent systems to providing services in big data and web-based systems.

Proposal of a Learning Model for Mobile App Malicious Code Analysis (모바일 앱 악성코드 분석을 위한 학습모델 제안)

  • Bae, Se-jin;Choi, Young-ryul;Rhee, Jung-soo;Baik, Nam-kyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.455-457 / 2021
  • Apps run on mobile devices such as smartphones and may contain malicious code; they can be classified as normal or malicious depending on the presence or absence of hacking code. Because there are many kinds of malware, it is difficult to detect directly, so we propose a method for detecting malicious apps using AI. Most existing methods detect malicious apps by extracting features from malicious apps. However, the number of malware types has increased exponentially, making it impossible to detect malicious code this way. We therefore propose two methods in addition to the existing approach of extracting features from malicious apps. The first learns from normal apps to extract the features of normal apps and then finds abnormalities (malicious apps), in contrast to the existing approach of learning from malicious apps. The second is an 'ensemble technique' that combines the existing method with the first proposal. Both methods need further study so that they can be used in future mobile environments.


A hybrid algorithm based on EEMD and EMD for multi-mode signal processing

  • Lin, Jeng-Wen
    • Structural Engineering and Mechanics / v.39 no.6 / pp.813-831 / 2011
  • This paper presents an efficient version of the Hilbert-Huang transform for the analysis of nonlinear, non-stationary systems. Ensemble empirical mode decomposition (EEMD) was introduced to alleviate the problem of mode mixing between the intrinsic mode functions (IMFs) decomposed by EMD. Yet the problem has not been fully resolved when a signal of a similar scale resides in different IMF components. Instead of using trial and error to select the "best" outcome generated by EEMD, a hybrid algorithm based on EEMD and EMD is proposed for multi-mode signal processing. The developed approach comprises three steps: designing a bandpass filter to regroup the modes of the IMFs obtained from EEMD, extracting each mode using EMD, and assessing each mode in the marginal spectrum. A simulated two-mode signal is tested to demonstrate the efficiency and robustness of the approach, showing average relative errors all equal to 1.46% for various noise levels added to the signal. The developed approach is also applied to a real bridge structure, showing more reliable results than pure EMD. Discussions on mode determination are offered to explain the connection between mode-grouping form on the one hand and mode-grouping performance on the other.

Elimination of environmental temperature effect from the variation of stay cable force based on simple temperature measurements

  • Chen, Chien-Chou;Wu, Wen-Hwa;Liu, Chun-Yan;Lai, Gwolong
    • Smart Structures and Systems / v.19 no.2 / pp.137-149 / 2017
  • Under the interference of the temperature effect, a change in cable force caused by damage to a cable-stayed bridge can be difficult to distinguish. For convenience and applicability in engineering practice, the current study adopts simple air or cable temperature measurements to exclude the temperature effect from the variation of cable force. Using data collected from the Ai-Lan Bridge in central Taiwan, this work applies ensemble empirical mode decomposition to process the time histories of cable force, air temperature, and cable temperature. The cable force and both types of temperature can each be separated into daily variation, long-term variation, and high-frequency noise, in order of decreasing weight. Moreover, the correlation analysis conducted on the decomposed variations of these three quantities indicates that the daily and long-term variations, which have different time shifts, must be distinguished to evaluate the temperature effect on cable force variation accurately. Finally, consistent results in reducing the range of cable force variation after the temperature effect is eliminated confirm the validity and stability of the developed method.

A Prediction of Precipitation Over East Asia for June Using Simultaneous and Lagged Teleconnection (원격상관을 이용한 동아시아 6월 강수의 예측)

  • Lee, Kang-Jin;Kwon, MinHo
    • Atmosphere / v.26 no.4 / pp.711-716 / 2016
  • Dynamical model forecasts using state-of-the-art general circulation models (GCMs) have some limitations in simulating the real climate system, since they do not depend on past history. One alternative method for correcting model errors is the canonical correlation analysis (CCA) correction method, in which model outputs are adjusted based on the CCA modes between the model forecasts and the observations. At present, CCA forecasts show better skill than dynamical model forecasts, especially over the midlatitudes. This study builds a canonical correlation prediction model for subseasonal (June) precipitation. The predictors are circulation fields over the western North Pacific from the Global Seasonal Forecasting System version 5 (GloSea5) and observed snow cover extent over the Eurasian continent from the Climate Data Record (CDR). The former is based on simultaneous teleconnection between the western North Pacific and East Asia, and the latter on lagged teleconnection between the Eurasian continent and East Asia. In addition, we suggest a technique for improving forecast skill by applying the ensemble canonical correlation (ECC) to individual canonical correlation predictions.

Study on the Functional Architecture and Improvement Accuracy for Auto Target Classification on the SAR Image by using CNN Ensemble Model based on the Radar System for the Fighter (전투기용 레이다 기반 SAR 영상 자동표적분류 기능 구조 및 CNN 앙상블 모델을 이용한 표적분류 정확도 향상 방안 연구)

  • Lim, Dong Ju;Song, Se Ri;Park, Peom
    • Journal of the Korean Society of Systems Engineering / v.16 no.1 / pp.51-57 / 2020
  • A fighter pilot uses the radar mounted on the fighter to obtain high-resolution SAR (Synthetic Aperture Radar) images of a specific distant area and then visually classifies the targets within the image. However, a target captured in a SAR image is relatively small, and its shape is distorted depending on the depression angle, making it difficult for the pilot to classify the type of target. In addition, the presence of various types of clutter can cause errors in target classification, and these errors are likely to get worse when the pilot is simultaneously carrying out tasks such as navigation and situational awareness. This paper presents the concept of operation and the functional structure of a fighter radar system that takes over the SAR image target classification task from the pilot, and studies a target classification method that uses a CNN ensemble model to achieve higher classification accuracy than a single CNN model.
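
One common way to combine several CNN classifiers is soft voting, in which the class probabilities of the individual models are averaged. The sketch below illustrates that idea only; the abstract does not state how the paper combines its CNNs, and the logits, the number of models, and the function names `softmax` and `soft_vote` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_model_logits):
    """Average class probabilities across models; return (label, averaged probs)."""
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


# Hypothetical logits from three CNNs for one image chip and three target classes.
logits = [
    [2.0, 1.0, 0.1],
    [0.5, 1.5, 0.2],
    [1.8, 0.9, 0.3],
]
label, avg = soft_vote(logits)
print(label, [round(p, 3) for p in avg])
```

In this made-up case the second model prefers class 1, but the averaged probabilities still favor class 0, which is the kind of single-model error an ensemble can absorb.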

IMPROVING THE ESP ACCURACY WITH COMBINATION OF PROBABILISTIC FORECASTS

  • Yu, Seung-Oh;Kim, Young-Oh
    • Water Engineering Research / v.5 no.2 / pp.101-109 / 2004
  • Aggregating information by combining forecasts from two or more forecasting methods is an alternative to relying on a single method to improve forecast accuracy. This paper describes the development and use of a monthly inflow forecast model based on an optimal linear combination (OLC) of naive, persistence, and Ensemble Streamflow Prediction (ESP) forecasts. Using cross-validation, the OLC model made 1-month-ahead probabilistic forecasts of the Chungju multi-purpose dam inflows for 15 years. For most of the verification months, the skill of the OLC forecast was superior to that of the individual forecast techniques. This study therefore demonstrates that OLC can improve the accuracy of the ESP forecast, especially during the dry season. This study also examined the value of the OLC forecasts in reservoir operations. Stochastic Dynamic Programming (SDP) was used to derive the optimal operating policy for the Chungju multi-purpose dam, and the derived policy was simulated using the 15-year observed inflows. The simulation results showed that the SDP model that updated its probabilities from the new OLC forecast provided more efficient operating decisions than the conventional SDP model.


Energy Efficient Design of a Jet Pump by Ensemble of Surrogates and Evolutionary Approach

  • Husain, Afzal;Sonawat, Arihant;Mohan, Sarath;Samad, Abdus
    • International Journal of Fluid Machinery and Systems / v.9 no.3 / pp.265-276 / 2016
  • Energy systems that operate under different conditions may not have a single design that provides optimal performance, and a system that runs for a long period at low efficiency consumes more energy. In this work, a methodology is demonstrated through the design and optimization of a jet pump, combining numerical modeling of the fluid dynamics with an evolutionary algorithm for the optimization, and it shows a reduction in computational costs. The jet pump inherently has low efficiency because of improper mixing of the primary and secondary fluids and the multiple momentum and energy transfer phenomena associated with it. High-fidelity solutions were obtained from a validated numerical model and used to construct an approximate function through surrogate analysis. Pareto-optimal solutions for two objective functions, the secondary fluid pressure head and the primary fluid pressure drop, were generated with a multi-objective genetic algorithm. For the jet pump geometry, a design space of several design variables was discretized using Latin hypercube sampling for the optimization. The performance analysis of the surrogate models shows that the combined surrogates perform better than a single surrogate and that the optimized jet pump achieves higher performance. The approach can be applied to other energy systems to find a better design.

Outlier detection of main engine data of a ship using ensemble method (앙상블 기법을 이용한 선박 메인엔진 빅데이터의 이상치 탐지)

  • KIM, Dong-Hyun;LEE, Ji-Hwan;LEE, Sang-Bong;JUNG, Bong-Kyu
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.56 no.4 / pp.384-394 / 2020
  • This paper proposes a machine-learning-based outlier detection model that can diagnose the presence or absence of abnormalities in major engine parts through unsupervised learning analysis of a ship's main engine big data. The engine data were collected for more than seven months, and expert knowledge and correlation analysis were used to select features closely related to the operation of the main engine. For the unsupervised analysis, an ensemble model, in which many predictive models are strategically combined to increase model performance, is used for anomaly detection. As a result, the proposed model successfully distinguished the anomalous engine status from the normal status. To validate the approach, clustering analysis was conducted to find the different patterns among the anomalous points. By examining the distribution of each cluster, we could successfully identify the patterns of anomalies.
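
A minimal sketch of how an unsupervised outlier ensemble can work: several detectors each score every sample, the scores are put on a common scale, and the scaled scores are averaged. The two detectors used here (a z-score and a median/MAD score), the rank normalization, and the sensor readings are all assumptions chosen for illustration; the abstract does not name the models that the paper combines.

```python
import statistics


def zscore_scores(xs):
    """Distance from the mean in units of standard deviation."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs) or 1.0  # fall back to 1.0 if all values are equal
    return [abs(x - mu) / sd for x in xs]


def mad_scores(xs):
    """Distance from the median in units of median absolute deviation."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs]) or 1.0
    return [abs(x - med) / mad for x in xs]


def rank_normalize(scores):
    """Map scores to ranks in [0, 1] so different detectors are comparable."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    ranks = [0.0] * len(scores)
    for rank, idx in enumerate(order):
        ranks[idx] = rank / (len(scores) - 1)
    return ranks


def ensemble_scores(xs, detectors):
    """Average the rank-normalized scores of all detectors."""
    ranked = [rank_normalize(d(xs)) for d in detectors]
    return [sum(col) / len(ranked) for col in zip(*ranked)]


# Hypothetical exhaust-gas temperature readings; index 6 is the odd one out.
readings = [350, 352, 349, 351, 353, 350, 420, 348]
combined = ensemble_scores(readings, [zscore_scores, mad_scores])
most_anomalous = max(range(len(readings)), key=combined.__getitem__)
print(most_anomalous, combined[most_anomalous])
```

Rank normalization is used here because raw scores from different detectors are not on the same scale; averaging ranks keeps one detector from dominating the ensemble.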

Building an Ensemble Machine by Constructive Selective Learning Neural Networks (건설적 선택학습 신경망을 이용한 앙상블 머신의 구축)

  • Kim, Seok-Jun;Jang, Byeong-Tak
    • Journal of KIISE:Software and Applications / v.27 no.12 / pp.1202-1210 / 2000
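  • This paper proposes a new approach to building an effective ensemble machine. Several papers have reported that, to build an effective ensemble, the correlation among ensemble members must be very low, and that each member must learn the whole problem with reasonable accuracy while still disagreeing with the other members in some regions. To generate many candidate ensemble networks that have learned different aspects of the given problem, this paper uses a neural network learning algorithm that combines a constructive learning algorithm with an active learning algorithm. The network is trained by incrementally increasing the number of hidden nodes from a minimum to a maximum while, at the same time, incrementally selecting the training data to be used from a candidate data set. A candidate network for the ensemble is generated each time hidden nodes are added. One such training run is defined as a chain. Through multiple chains, candidate networks are generated that have various network sizes and have learned various data distributions. The candidate networks generated in this way are selected by probabilistic proportional selection and then combined by the generalized ensemble method (GEM) to give the final ensemble performance. The proposed algorithm was applied to one artificial data set and one real-world data set. The experiments show that the maximum generalization performance of the ensemble built by the proposed algorithm is superior to that of ensembles built by other algorithms.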
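
The generalized ensemble method mentioned above is usually described as a weighted average of the member networks, with weights that sum to one and are derived from the inverse of the members' error correlation matrix. The sketch below follows that common formulation and is not taken from the paper; the predictions, the targets, and the helper names `solve` and `gem_weights` are hypothetical, and the abstract does not say whether the authors use this exact variant.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gem_weights(preds, targets):
    """GEM-style weights: solve C w = 1 and rescale so the weights sum to one.

    preds[i][t] is member i's prediction for sample t, and
    C[i][j] is the mean product of the errors of members i and j.
    """
    k, n = len(preds), len(targets)
    errs = [[targets[t] - p[t] for t in range(n)] for p in preds]
    c = [[sum(errs[i][t] * errs[j][t] for t in range(n)) / n for j in range(k)]
         for i in range(k)]
    w = solve(c, [1.0] * k)
    total = sum(w)
    return [x / total for x in w]


# Hypothetical validation targets and predictions from two candidate networks.
targets = [1.0, 2.0, 3.0, 4.0]
preds = [
    [1.1, 1.9, 3.2, 3.9],  # small errors
    [0.5, 2.5, 2.6, 4.6],  # larger errors, negatively correlated with the first
]
weights = gem_weights(preds, targets)
combined = [sum(w * p[t] for w, p in zip(weights, preds)) for t in range(len(targets))]
print([round(w, 4) for w in weights])
```

Because the weights depend on the off-diagonal entries of the error matrix, members whose errors are weakly or negatively correlated contribute more than their individual accuracy alone would suggest, which matches the low-correlation requirement stated in the abstract. The error matrix becomes nearly singular when members are almost identical, so a practical implementation would need some form of regularization.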
