• Title/Summary/Keyword: Improvement of prediction performance

Improving Clustering Performance Using Gene Ontology (유전자 온톨로지를 활용한 클러스터링 성능 향상 기법)

  • Ko, Song;Kang, Bo-Yeong;Kim, Dae-Won
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.6 / pp.802-808 / 2009
  • Many recent studies have sought to improve the clustering performance of gene expression data by incorporating Gene Ontology (GO) into the clustering process. In particular, Kustra et al. showed that exploiting the Biological Process Ontology yields better performance than typical expression-based clustering. This paper extends the work of Kustra et al. with extensive experiments on how GO structures are incorporated. To this end, we used three ontological distance measures (Lin's, Resnik's, Jiang's) and three GO structures (BP, CC, MF) on yeast expression data. Across all test cases, we found that clustering performance improved remarkably when GO was incorporated; in particular, Resnik's distance measure based on the Biological Process Ontology performed best.
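
The three measures named in this abstract are commonly defined in terms of information content (IC). The sketch below uses the standard textbook forms of Resnik's, Lin's, and Jiang and Conrath's measures; whether the paper uses exactly these forms is an assumption, and the annotation probabilities are hypothetical toy values, not data from the study.

```python
import math

def information_content(probability):
    """IC of a GO term whose annotation probability is `probability` (0 < p <= 1)."""
    return -math.log(probability)

def resnik_similarity(ic_ancestor):
    """Resnik: IC of the most informative common ancestor of the two terms."""
    return ic_ancestor

def lin_similarity(ic_a, ic_b, ic_ancestor):
    """Lin: shared IC normalized by the IC of both terms (range 0..1)."""
    total = ic_a + ic_b
    # Assumed convention: return 0.0 when both terms carry no information.
    return 2.0 * ic_ancestor / total if total > 0 else 0.0

def jiang_conrath_distance(ic_a, ic_b, ic_ancestor):
    """Jiang and Conrath: a distance, so smaller means more similar."""
    return ic_a + ic_b - 2.0 * ic_ancestor

# Hypothetical annotation probabilities for two terms and their common ancestor.
ic_a = information_content(0.01)
ic_b = information_content(0.02)
ic_anc = information_content(0.10)

print("Resnik similarity:", resnik_similarity(ic_anc))
print("Lin similarity:", lin_similarity(ic_a, ic_b, ic_anc))
print("Jiang-Conrath distance:", jiang_conrath_distance(ic_a, ic_b, ic_anc))
```

Resnik's and Lin's measures are similarities while Jiang and Conrath's is a distance, so a clustering pipeline would need to convert them to a common orientation before combining them with expression-based distances.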

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.101-116 / 2015
  • Recently, with the development of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research effort is being directed toward analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured data analysis to unstructured data analysis. In the field of recommendation systems, a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has also steadily increased. Approaches to improving the performance of recommendation systems fall into two categories: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts have followed the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are timely and necessary. In particular, because users' interests are directly connected with their needs, identifying those interests through unstructured big data analysis can be a clue for improving the performance of recommendation systems. This study therefore proposes a methodology for improving recommendation systems by measuring users' interests. Specifically, it proposes a method to quantify users' interests by analyzing their internet usage patterns and to predict their repurchases based on the discovered preferences. The methodology has two important modules. The first module predicts the repurchase probability of each category by analyzing users' purchase history. We include the first module in our research scope in order to compare the accuracy of the traditional purchase-based prediction model with that of the new model presented in the second module. This procedure extracts the purchase history of users.
The core of our methodology is the second module, which extracts users' interests by analyzing the news articles they have read. The second module first constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. It then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. By merging these two results, we obtain a correspondence matrix between users and topics, which describes users' interests in a structured manner. Finally, using this matrix, the second module builds a model for predicting the repurchase probability of each category. In this paper, we also provide the experimental results of our performance evaluation. The data used in our experiments are outlined as follows. We acquired web transaction data for 5,000 panelists from a company that specializes in analyzing the rankings of internet sites. First, we extracted 15,000 URLs of news articles published from July 2012 to June 2013 from the original data and crawled the main content of those articles. We then selected the 2,615 users who had read at least one of the extracted news articles. Among these 2,615 users, 359 target users had purchased at least one item from our target shopping mall 'G'. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. The performance evaluation showed that our prediction model, which uses both users' interests and purchase history, outperforms a prediction model that uses only purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used.
Similarly, our model outperformed the traditional one in the beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used, although the improvement was very small.
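
The merging step in the second module can be read as a matrix product: a user-by-article read matrix multiplied by an article-by-topic weight matrix gives a user-by-topic interest matrix. The sketch below is one plausible reading, not the authors' implementation; the function name `merge_matrices`, the row normalization, and all numbers are hypothetical.

```python
def merge_matrices(user_article, article_topic):
    """Combine user-article reads with article-topic weights into user-topic interests.

    user_article[u][a]  : 1 if user u read article a, else 0 (assumed encoding)
    article_topic[a][t] : weight of topic t in article a (e.g. from topic modeling)
    Each output row is normalized to sum to 1 (an assumption, for comparability).
    """
    n_topics = len(article_topic[0])
    user_topic = []
    for reads in user_article:
        weights = [
            sum(reads[a] * article_topic[a][t] for a in range(len(reads)))
            for t in range(n_topics)
        ]
        total = sum(weights)
        user_topic.append([w / total if total > 0 else 0.0 for w in weights])
    return user_topic

# Hypothetical toy data: 3 users, 4 articles, 2 topics.
user_article = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]
article_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.7, 0.3],
    [0.1, 0.9],
]

user_topic = merge_matrices(user_article, article_topic)
for user_id, interests in enumerate(user_topic):
    print(user_id, [round(x, 3) for x in interests])
```

The resulting rows could then serve as input features, alongside purchase history, for a per-category repurchase classifier such as the neural network and decision tree models mentioned in the abstract.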

Application of Methods of Management Science in Care Process Management (의료프로세스 관리에 경영과학적 접근방법의 응용)

  • Kim, Tae Hyun
    • Korea Journal of Hospital Management / no.spc / pp.1-13 / 2016
  • As competition intensifies, health care organizations constantly strive to provide more services with the given personnel and time. As not only the 'quantity' but also the 'quality' of services becomes increasingly important, various problems that can occur during the 'process' of service provision can be managed effectively by applying the methods of management science. In this study, we introduce cases in Korea and abroad where the methods of management science can be applied to the management of health care organizations. In many of these cases, various scenarios are established for improving patients' access to services and for maximizing the efficient use of limited resources, and simulation or basic statistical analysis methods are used to solve the problems more systematically or to develop improvement plans. Several exemplary cases are introduced and discussed, including patient no-shows, crowding in the emergency room, prediction of the number of available beds in intensive care units, nurse scheduling, delayed arrival of patients, and ordering the proper amount of therapeutic materials. From the perspective of administrators or clinicians, however, it may not be easy to master a methodology that requires considerable mathematical background or to apply the theories directly in practice. Therefore, it is suggested that more practical and relatively simple analytical methods be applied. It is also important to have a more positive attitude toward improving current performance (e.g., a belief that 'we can always be better than now') and to pay attention to improving job satisfaction by addressing problems with an experimental spirit and data-driven decision management.

Application of a comparative analysis of random forest programming to predict the strength of environmentally-friendly geopolymer concrete

  • Ying Bi;Yeng Yi
    • Steel and Composite Structures / v.50 no.4 / pp.443-458 / 2024
  • The construction industry, one of the biggest producers of greenhouse gas emissions, is under considerable pressure as a result of growing concern about how climate change may affect local communities. Geopolymer concrete (GPC) has emerged as a feasible construction material in response to the environmental issues connected with cement manufacture. The findings of this study contribute to the development of machine learning methods for estimating the properties of eco-friendly concrete, which might be used in lieu of traditional concrete to reduce CO2 emissions in the building industry. In the present work, the compressive strength (fc) of GPC is estimated using the random forests regression (RFR) methodology, where natural zeolite (NZ) and silica fume (SF) replace ground granulated blast-furnace slag (GGBFS). A thorough set of experiments on GPC samples, totaling 254 data rows, was compiled from the literature. The RFR was integrated with artificial hummingbird optimization (AHA), the black widow optimization algorithm (BWOA), and the chimp optimization algorithm (ChOA), and the resulting models are abbreviated as ARFR, BRFR, and CRFR. The RFR models demonstrated satisfactory performance across all evaluation metrics in the prediction procedure. For the R² metric, the CRFR model achieved 0.9988 and 0.9981 on the training and test data sets, higher than BRFR (0.9982 and 0.9969), followed by ARFR (0.9971 and 0.9956). Some other error and distribution metrics showed a roughly 50% improvement for CRFR with respect to ARFR.

Estimation of Concrete Strength Using Improved Probabilistic Neural Network Method

  • Kim Doo-Kie;Lee Jong-Jae;Chang Seong-Kyu
    • Journal of the Korea Concrete Institute / v.17 no.6 s.90 / pp.1075-1084 / 2005
  • The compressive strength of concrete is a commonly used criterion in producing concrete. However, tests of compressive strength are complicated and time-consuming. More importantly, it is too late to make improvements if the test result does not satisfy the required strength, since the test is usually performed on the 28th day after the concrete is placed at the construction site. Therefore, accurate and realistic strength estimation before the placement of concrete is highly desirable. In this study, the compressive strength of concrete was estimated by a probabilistic neural network (PNN) on the basis of concrete mix proportions. The estimation performance of the PNN was improved by considering the correlation between the input data and the targeted output value. An improved probabilistic neural network (IPNN) was proposed to automatically calculate the smoothing parameter of the conventional PNN by using the dynamic decay adjustment (DDA) algorithm. The conventional PNN and the PNN with the DDA algorithm (IPNN) were applied to predict the compressive strength of concrete using actual test data from two concrete companies. The IPNN showed better results than the conventional PNN in predicting the compressive strength of concrete.

A Genetic Algorithm-based Classifier Ensemble Optimization for Activity Recognition in Smart Homes

  • Fatima, Iram;Fahim, Muhammad;Lee, Young-Koo;Lee, Sungyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.11 / pp.2853-2873 / 2013
  • Over the last few years, one of the most common purposes of smart homes has been to provide human-centric services in the domain of u-healthcare by analyzing inhabitants' daily living. Currently, a major challenge in activity recognition is the reliability of each classifier's predictions, which differs according to smart home characteristics. Smart homes vary in terms of performed activities, deployed sensors, environment settings, and inhabitants' characteristics. No single classifier can always perform better than all other classifiers in every possible situation. This observation motivates combining multiple classifiers to take advantage of their complementary performance and achieve high accuracy. Therefore, this paper proposes a method for activity recognition that optimizes the output of multiple classifiers with a Genetic Algorithm (GA). The proposed method combines the measurement-level output of different classifiers for each activity class to make up the ensemble. To evaluate the proposed method, experiments are performed on three real datasets from the CASAS smart home. The results show that our method systematically outperforms single-classifier and traditional multiclass models. A significant improvement, from 0.82 to 0.90, is achieved in the F-measure of recognized activities compared to existing methods.

Machine Learning-based Prediction of Relative Regional Air Volume Change from Healthy Human Lung CTs

  • Eunchan Kim;YongHyun Lee;Jiwoong Choi;Byungjoon Yoo;Kum Ju Chae;Chang Hyun Lee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.576-590 / 2023
  • Machine learning is widely used in various academic fields, and recently it has been actively applied in medical research. In the medical field, machine learning is used in a variety of ways, such as speeding up diagnosis, discovering new biomarkers, or discovering latent traits of a disease. In the respiratory field, a relative regional air volume change (RRAVC) map based on quantitative inspiratory and expiratory computed tomography (CT) imaging can serve as a useful functional imaging biomarker for characterizing regional ventilation. In this study, we seek to predict RRAVC using several standard machine learning models: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and multi-layer perceptron (MLP). We show experimentally that MLP performs best, followed by XGBoost. We also propose several relative coordinate systems to minimize inter-subject variability. We confirm a significant performance improvement in our experiments when a subject's relative proportion coordinates are used instead of conventional absolute coordinates.

Machine learning-based Predictive Model of Suicidal Thoughts among Korean Adolescents. (머신러닝 기반 한국 청소년의 자살 생각 예측 모델)

  • YeaJu JIN;HyunKi KIM
    • Journal of Korea Artificial Intelligence Association / v.1 no.1 / pp.1-6 / 2023
  • This study developed models using decision forest, support vector machine, and logistic regression methods to predict and prevent suicidal ideation among Korean adolescents. The study sample consisted of 51,407 individuals after missing data were removed from the raw data of the 18th (2022) Youth Health Behavior Survey conducted by the Korea Centers for Disease Control and Prevention. Analysis was performed in MS Azure using Two-Class Decision Forest, Two-Class Support Vector Machine, and Two-Class Logistic Regression. The decision forest model achieved an accuracy of 84.8% and an F1-score of 36.7%. The support vector machine model achieved an accuracy of 86.3% and an F1-score of 24.5%. The logistic regression model achieved an accuracy of 87.2% and an F1-score of 40.1%. Applying the logistic regression model with SMOTE to address data imbalance resulted in an accuracy of 81.7% and an F1-score of 57.7%. Although the accuracy decreased slightly, the recall, precision, and F1-score improved, demonstrating excellent performance. These findings have significant implications for the development of prediction models for suicidal ideation among Korean adolescents and can contribute to the prevention of youth suicide.
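
The gap between accuracy and F1-score reported in this abstract is typical of imbalanced data, where a model can be right about most negatives while missing many positives. The sketch below computes both metrics from a confusion matrix; the counts are hypothetical toy values chosen for illustration and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical imbalanced sample: 100 positives among 1,000 cases.
# The classifier finds only 30 of the positives and raises 20 false alarms.
metrics = classification_metrics(tp=30, fp=20, fn=70, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Oversampling the minority class, as SMOTE does, usually trades some accuracy on the majority class for better recall on the minority class, which is consistent with the direction of change described in the abstract.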

Dynamic Model Development and Simulation of Crawler Type Excavator (크롤러형 굴삭기의 동역학적 모델 개발 및 시뮬레이션)

  • Kwon, Soon-Ki
    • Journal of the Korean Society of Manufacturing Technology Engineers / v.18 no.6 / pp.642-651 / 2009
  • Because the history of excavator design is relatively short, most design considerations still focus on static analysis or on simple functional improvements based on static analysis. However, the real forces acting on each component of an excavator are highly transient and impulsive. Therefore, predicting and evaluating the movement of the excavator under dynamic load in the early design stage through dynamic transient analysis, and securing the corresponding design technique, play an important role in reducing development cost, shortening product delivery time, decreasing vehicle weight, and optimizing the system design. In this paper, the commercial software packages DADS and ANSYS were used to develop a track model of the crawler-type excavator and to evaluate the performance and dynamic characteristics of the excavator through various simulations. To this end, the track of the crawler-type excavator was modelled with the DADS Track Vehicle Superelement, and the reaction forces on the track rollers were predicted through a driving simulation. In addition, the vibration characteristics of the upper frame and cabin in the low-RPM idle state were evaluated with rigid-body modelling of the engine. Flexible-body effects were also considered to determine more accurate joint reaction forces and accelerations under the swing motion of the upper frame.

Multihop Vehicle-to-Infrastructure Routing Based on the Prediction of Valid Vertices for Vehicular Ad Hoc Networks

  • Shrestha, Raj K.;Moh, Sangman;Chung, IlYong;Shin, Heewook
    • IEMEK Journal of Embedded Systems and Applications / v.5 no.4 / pp.243-253 / 2010
  • Multihop data delivery in vehicular ad hoc networks (VANETs) suffers from the fact that vehicles are highly mobile and inter-vehicle links are frequently disconnected. In such networks, efficient multihop routing of road safety information (e.g., road accident and emergency messages) to the area of interest requires reliable communication and fast delivery with minimum delay. In this paper, we propose a multihop vehicle-to-infrastructure routing protocol named Vertex-Based Predictive Greedy Routing (VPGR), which predicts a sequence of valid vertices (or junctions) from a source vehicle to fixed infrastructure (or a roadside unit) in the area of interest and then forwards data to the fixed infrastructure through that sequence of vertices in urban environments. The well-known predictive directional greedy routing mechanism is used for the data forwarding phase in VPGR. The proposed VPGR leverages the geographic position, velocity, direction, and acceleration of vehicles both for calculating the sequence of valid vertices and for predictive directional greedy routing. Simulation results show significant performance improvement over conventional routing protocols in terms of packet delivery ratio, end-to-end delay, and routing overhead.