• Title/Summary/Keyword: metrics validation

Search Result 61, Processing Time 0.022 seconds

Machine Learning Methods for Trust-based Selection of Web Services

  • Hasnain, Muhammad;Ghani, Imran;Pasha, Muhammad F.;Jeong, Seung R.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제16권1호
    • /
    • pp.38-59
    • /
    • 2022
  • Web services instances can be classified into two categories, namely trusted and untrusted from users. A web service with high throughput (TP) and low response time (RT) instance values is a trusted web service. Web services are not trustworthy due to the mismatch in the guaranteed instance values and the actual values achieved by users. To perform web services selection from users' attained TP and RT values, we need to verify the correct prediction of trusted and untrusted instances from invoked web services. This accurate prediction of web services instances is used to perform the selection of web services. We propose to construct fuzzy rules to label web services instances correctly. This paper presents web services selection using a well-known machine learning algorithm, namely REPTree, for the correct prediction of trusted and untrusted instances. Performance comparison of REPTree with five machine learning models is conducted on web services datasets. We have performed experiments on web services datasets using a ten k-fold cross-validation method. To evaluate the performance of the REPTree classifier, we used accuracy metrics (Sensitivity and Specificity). Experimental results showed that web service (WS1) gained top selection score with the (47.0588%) trusted instances, and web service (WS2) was selected the least with (25.00%) trusted instances. Evaluation results of the proposed web services selection approach were found as (asymptotic sig. = 0.019), demonstrating the relationship between final selection and recommended trust score of web services.

기계학습을 이용한 밴드갭 예측과 소재의 조성기반 특성인자의 효과 (Compositional Feature Selection and Its Effects on Bandgap Prediction by Machine Learning)

  • 남충희
    • 한국재료학회지
    • /
    • 제33권4호
    • /
    • pp.164-174
    • /
    • 2023
  • The bandgap characteristics of semiconductor materials are an important factor when utilizing semiconductor materials for various applications. In this study, based on data provided by AFLOW (Automatic-FLOW for Materials Discovery), the bandgap of a semiconductor material was predicted using only the material's compositional features. The compositional features were generated using the python module of 'Pymatgen' and 'Matminer'. Pearson's correlation coefficients (PCC) between the compositional features were calculated and those with a correlation coefficient value larger than 0.95 were removed in order to avoid overfitting. The bandgap prediction performance was compared using the metrics of R2 score and root-mean-squared error. By predicting the bandgap with randomforest and xgboost as representatives of the ensemble algorithm, it was found that xgboost gave better results after cross-validation and hyper-parameter tuning. To investigate the effect of compositional feature selection on the bandgap prediction of the machine learning model, the prediction performance was studied according to the number of features based on feature importance methods. It was found that there were no significant changes in prediction performance beyond the appropriate feature. Furthermore, artificial neural networks were employed to compare the prediction performance by adjusting the number of features guided by the PCC values, resulting in the best R2 score of 0.811. By comparing and analyzing the bandgap distribution and prediction performance according to the material group containing specific elements (F, N, Yb, Eu, Zn, B, Si, Ge, Fe Al), various information for material design was obtained.

Metric based Performance Measurement of Software Development Methodologies from Traditional to DevOps Automation Culture

  • Poonam Narang;Pooja Mittal
    • International Journal of Computer Science & Network Security
    • /
    • 제23권6호
    • /
    • pp.107-114
    • /
    • 2023
  • Successful implementations of DevOps practices significantly improvise software efficiency, collaboration and security. Most of the organizations are adopting DevOps for faster and quality software delivery. DevOps brings development and operation teams together to overcome all kind of communication gaps responsible for software failures. It relies on different sets of alternative tools to automate the tasks of continuous integration, testing, delivery, deployment and monitoring. Although DevOps is followed for being very reliable and responsible environment for quality software delivery yet it lacks many quantifiable aspects to prove it on the top of other traditional and agile development methods. This research evaluates quantitative performance of DevOps and traditional/ agile development methods based on software metrics. This research includes three sample projects or code repositories to quantify the results and for DevOps integrated selective tool chain; current research considers our earlier proposed and implemented DevOps hybrid model of integrated automation tools. For result discussion and validation, tabular and graphical comparisons have also been included to retrieve best performer model. This comparative and evaluative research will be of much advantage to our young researchers/ students to get well versed with automotive environment of DevOps, latest emerging buzzword of development industries.

Identification of Pb-Zn ore under the condition of low count rate detection of slim hole based on PGNAA technology

  • Haolong Huang;Pingkun Cai;Wenbao Jia;Yan Zhang
    • Nuclear Engineering and Technology
    • /
    • 제55권5호
    • /
    • pp.1708-1717
    • /
    • 2023
  • The grade analysis of lead-zinc ore is the basis for the optimal development and utilization of deposits. In this study, a method combining Prompt Gamma Neutron Activation Analysis (PGNAA) technology and machine learning is proposed for lead-zinc mine borehole logging, which can identify lead-zinc ores of different grades and gangue in the formation, providing real-time grade information qualitatively and semi-quantitatively. Firstly, Monte Carlo simulation is used to obtain a gamma-ray spectrum data set for training and testing machine learning classification algorithms. These spectra are broadened, normalized and separated into inelastic scattering and capture spectra, and then used to fit different classifier models. When the comprehensive grade boundary of high- and low-grade ores is set to 5%, the evaluation metrics calculated by the 5-fold cross-validation show that the SVM (Support Vector Machine), KNN (K-Nearest Neighbor), GNB (Gaussian Naive Bayes) and RF (Random Forest) models can effectively distinguish lead-zinc ore from gangue. At the same time, the GNB model has achieved the optimal accuracy of 91.45% when identifying high- and low-grade ores, and the F1 score for both types of ores is greater than 0.9.

콘크리트 탄산화 및 열효과에 의한 경년열화 예측을 위한 기계학습 모델의 정확성 검토 (Accuracy Evaluation of Machine Learning Model for Concrete Aging Prediction due to Thermal Effect and Carbonation)

  • 김현수
    • 한국공간구조학회논문집
    • /
    • 제23권4호
    • /
    • pp.81-88
    • /
    • 2023
  • Numerous factors contribute to the deterioration of reinforced concrete structures. Elevated temperatures significantly alter the composition of the concrete ingredients, consequently diminishing the concrete's strength properties. With the escalation of global CO2 levels, the carbonation of concrete structures has emerged as a critical challenge, substantially affecting concrete durability research. Assessing and predicting concrete degradation due to thermal effects and carbonation are crucial yet intricate tasks. To address this, multiple prediction models for concrete carbonation and compressive strength under thermal impact have been developed. This study employs seven machine learning algorithms-specifically, multiple linear regression, decision trees, random forest, support vector machines, k-nearest neighbors, artificial neural networks, and extreme gradient boosting algorithms-to formulate predictive models for concrete carbonation and thermal impact. Two distinct datasets, derived from reported experimental studies, were utilized for training these predictive models. Performance evaluation relied on metrics like root mean square error, mean square error, mean absolute error, and coefficient of determination. The optimization of hyperparameters was achieved through k-fold cross-validation and grid search techniques. The analytical outcomes demonstrate that neural networks and extreme gradient boosting algorithms outshine the remaining five machine learning approaches, showcasing outstanding predictive performance for concrete carbonation and thermal effect modeling.

경년열화를 고려한 전단벽 구조물의 기계학습 기반 지진응답 예측모델 개발 (Development of Machine Learning Based Seismic Response Prediction Model for Shear Wall Structure considering Aging Deteriorations)

  • 김현수;김유경;이소연;장준수
    • 한국공간구조학회논문집
    • /
    • 제24권2호
    • /
    • pp.83-90
    • /
    • 2024
  • Machine learning is widely applied to various engineering fields. In structural engineering area, machine learning is generally used to predict structural responses of building structures. The aging deterioration of reinforced concrete structure affects its structural behavior. Therefore, the aging deterioration of R.C. structure should be consider to exactly predict seismic responses of the structure. In this study, the machine learning based seismic response prediction model was developed. To this end, four machine learning algorithms were employed and prediction performance of each algorithm was compared. A 3-story coupled shear wall structure was selected as an example structure for numerical simulation. Artificial ground motions were generated based on domestic site characteristics. Elastic modulus, damping ratio and density were changed to considering concrete degradation due to chloride penetration and carbonation, etc. Various intensity measures were used input parameters of the training database. Performance evaluation was performed using metrics like root mean square error, mean square error, mean absolute error, and coefficient of determination. The optimization of hyperparameters was achieved through k-fold cross-validation and grid search techniques. The analysis results show that neural networks and extreme gradient boosting algorithms present good prediction performance.

임분 상하층의 바이오매스 조사를 위한 백팩형 라이다와 드론 라이다의 적용성 평가 (Backpack- and UAV-based Laser Scanning Application for Estimating Overstory and Understory Biomass of Forest Stands)

  • 이희재;김승욱;최혜영
    • 한국산림과학회지
    • /
    • 제112권3호
    • /
    • pp.363-373
    • /
    • 2023
  • 산림 바이오매스 조사는 탄소흡수원으로서의 산림을 평가하고 관리하기 위해 주기적으로 수행된다. 원격탐사의 한 종류인 라이다는 적은 노동력으로 객관적인 산림 구조 정보를 획득할 수 있어, 최근 라이다(LiDAR, Light Detection and Ranging)를 이용한 산림 조사가 주목받고 있다. 본 연구에서는 임분 상하층 바이오매스 추정에 백팩형 라이다(Backpack Laser Scanning, BPLS)와 드론 라이다(Unmanned Aerial Vehicle Laser Scanning, UAV-LS)를 이용하는 방법을 제시하고 그 정확도를 평가하였다. 상층의 경우 BPLS와 UAV-LS의 흉고직경과 수고 추정 정확도를 분석하였고, 하층의 경우 BPLS 데이터에서 추출한 수직구조 변수 중 최상의 변수 조합으로 하층 바이오매스를 추정하는 다중회귀모델을 개발하였다. 그 결과, BPLS는 흉고직경을 높은 정확도로 추정하였지만(R2 =0.92) 수고는 과소 추정하였다(R2 =0.63, Bias=-5.56 m). UAV-LS는 BPLS보다 더 높은 수고 추정 정확도를 보였다(R2 =0.91). 하층의 경우 점들의 평균 높이와 라이다 데이터를 같은 높이를 가진 10개의 층으로 나누었을 때 아래에서 네 번째 층의 점 밀도를 의미하는 변수가 선택되어 모델이 개발되었으며, 교차검증 결과 결정계수 값은 0.68로 나타났다. 본 연구의 결과는 BPLS와 UAV-LS를 이용한 임분의 상하층 바이오매스 조사 방법이 기존의 조사 방식을 효과적으로 대체할 수 있음을 시사한다.

텍스트 분류를 위한 자질 순위화 기법에 관한 연구 (An Experimental Study on Feature Ranking Schemes for Text Classification)

  • 김판준
    • 정보관리학회지
    • /
    • 제40권1호
    • /
    • pp.1-21
    • /
    • 2023
  • 본 연구는 텍스트 분류를 위한 효율적인 자질선정 방법으로 자질 순위화 기법의 성능을 구체적으로 검토하였다. 지금까지 자질 순위화 기법은 주로 문헌빈도에 기초한 경우가 대부분이며, 상대적으로 용어빈도를 사용한 경우는 많지 않았다. 따라서 텍스트 분류를 위한 자질선정 방법으로 용어빈도와 문헌빈도를 개별적으로 적용한 단일 순위화 기법들의 성능을 살펴본 다음, 양자를 함께 사용하는 조합 순위화 기법의 성능을 검토하였다. 구체적으로 두 개의 실험 문헌집단(Reuters-21578, 20NG)과 5개 분류기(SVM, NB, ROC, TRA, RNN)를 사용하는 환경에서 분류 실험을 진행하였고, 결과의 신뢰성 확보를 위해 5-fold cross validation과 t-test를 적용하였다. 결과적으로, 단일 순위화 기법으로는 문헌빈도 기반의 단일 순위화 기법(chi)이 전반적으로 좋은 성능을 보였다. 또한, 최고 성능의 단일 순위화 기법과 조합 순위화 기법 간에는 유의한 성능 차이가 없는 것으로 나타났다. 따라서 충분한 학습문헌을 확보할 수 있는 환경에서는 텍스트 분류의 자질선정 방법으로 문헌빈도 기반의 단일 순위화 기법(chi)을 사용하는 것이 보다 효율적이라 할 수 있다.

사용자 감정 예측을 통한 상황인지 추천시스템의 개선 (Improvement of a Context-aware Recommender System through User's Emotional State Prediction)

  • 안현철
    • Journal of Information Technology Applications and Management
    • /
    • 제21권4호
    • /
    • pp.203-223
    • /
    • 2014
  • This study proposes a novel context-aware recommender system, which is designed to recommend the items according to the customer's responses to the previously recommended item. In specific, our proposed system predicts the user's emotional state from his or her responses (such as facial expressions and movements) to the previous recommended item, and then it recommends the items that are similar to the previous one when his or her emotional state is estimated as positive. If the customer's emotional state on the previously recommended item is regarded as negative, the system recommends the items that have characteristics opposite to the previous item. Our proposed system consists of two sub modules-(1) emotion prediction module, and (2) responsive recommendation module. Emotion prediction module contains the emotion prediction model that predicts a customer's arousal level-a physiological and psychological state of being awake or reactive to stimuli-using the customer's reaction data including facial expressions and body movements, which can be measured using Microsoft's Kinect Sensor. Responsive recommendation module generates a recommendation list by using the results from the first module-emotion prediction module. If a customer shows a high level of arousal on the previously recommended item, the module recommends the items that are most similar to the previous item. Otherwise, it recommends the items that are most dissimilar to the previous one. In order to validate the performance and usefulness of the proposed recommender system, we conducted empirical validation. In total, 30 undergraduate students participated in the experiment. We used 100 trailers of Korean movies that had been released from 2009 to 2012 as the items for recommendation. For the experiment, we manually constructed Korean movie trailer DB which contains the fields such as release date, genre, director, writer, and actors. In order to check if the recommendation using customers' responses outperforms the recommendation using their demographic information, we compared them. The performance of the recommendation was measured using two metrics-satisfaction and arousal levels. Experimental results showed that the recommendation using customers' responses (i.e. our proposed system) outperformed the recommendation using their demographic information with statistical significance.

Optimization of Multi-Atlas Segmentation with Joint Label Fusion Algorithm for Automatic Segmentation in Prostate MR Imaging

  • Choi, Yoon Ho;Kim, Jae-Hun;Kim, Chan Kyo
    • Investigative Magnetic Resonance Imaging
    • /
    • 제24권3호
    • /
    • pp.123-131
    • /
    • 2020
  • Purpose: Joint label fusion (JLF) is a popular multi-atlas-based segmentation algorithm, which compensates for dependent errors that may exist between atlases. However, in order to get good segmentation results, it is very important to set the several free parameters of the algorithm to optimal values. In this study, we first investigate the feasibility of a JLF algorithm for prostate segmentation in MR images, and then suggest the optimal set of parameters for the automatic prostate segmentation by validating the results of each parameter combination. Materials and Methods: We acquired T2-weighted prostate MR images from 20 normal heathy volunteers and did a series of cross validations for every set of parameters of JLF. In each case, the atlases were rigidly registered for the target image. Then, we calculated their voting weights for label fusion from each combination of JLF's parameters (rpxy, rpz, rsxy, rsz, β). We evaluated the segmentation performances by five validation metrics of the Prostate MR Image Segmentation challenge. Results: As the number of voxels participating in the voting weight calculation and the number of referenced atlases is increased, the overall segmentation performance is gradually improved. The JLF algorithm showed the best results for dice similarity coefficient, 0.8495 ± 0.0392; relative volume difference, 15.2353 ± 17.2350; absolute relative volume difference, 18.8710 ± 13.1546; 95% Hausdorff distance, 7.2366 ± 1.8502; and average boundary distance, 2.2107 ± 0.4972; in parameters of rpxy = 10, rpz = 1, rsxy = 3, rsz = 1, and β = 3. Conclusion: The evaluated results showed the feasibility of the JLF algorithm for automatic segmentation of prostate MRI. This empirical analysis of segmentation results by label fusion allows for the appropriate setting of parameters.