• Title/Summary/Keyword: rank prediction


The relationship between prediction accuracy and pre-information in collaborative filtering system

  • Kim, Sun-Ok
    • Journal of the Korean Data and Information Science Society / v.21 no.4 / pp.803-811 / 2010
  • This study analyzes the characteristics of preference ratings. Estimated preference values for users' ratings are first obtained with a collaborative filtering algorithm and then divided into four groups according to their rank correlation coefficient. The standard error of skewness and the standard error of kurtosis are found to be lower in the groups with a higher rank correlation coefficient. This indicates that preferences with a higher rank correlation coefficient have fewer extreme values and smaller differences among preference rating values. In addition, top-N recommendation lists are built after obtaining the rank fitting between the ranks of the predicted values and the ranks of the actual rated values, and this top-N procedure is applied to the four groups. The top-N recommendation value is higher in the groups with a higher rank correlation coefficient, and recommendation accuracy is higher in those groups than in the groups with a lower rank correlation coefficient. Thus, taking the standard error of skewness and the standard error of kurtosis into account in a recommender system can raise the rank correlation coefficient and thereby increase the accuracy of recommendation prediction.
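The rank correlation coefficient that the abstract groups by can be illustrated with a minimal sketch. It assumes the Spearman form (Pearson correlation of the rank vectors, with tied values sharing an average rank); the abstract does not say which rank correlation the study used, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (assumed form): Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical ratings for one user: actual ratings vs. estimated preferences.
actual = [5, 3, 4, 1, 2]
estimated = [4.6, 3.1, 3.9, 1.8, 2.2]
rho = spearman(actual, estimated)
```

Users could then be split into four groups by `rho`, which is the grouping step the abstract describes.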

'Hot Search Keyword' Rank-Change Prediction (인기 검색어의 순위 변화 예측)

  • Kim, Dohyeong;Kang, Byeong Ho;Lee, Sungyoung
    • Journal of KIISE / v.44 no.8 / pp.782-790 / 2017
  • The 'Hot Search Keywords' service provides a list of the hottest search terms on web services such as Naver or Daum. The service bases the changes in rank of a specific search keyword on changes in its users' interest. This paper introduces a temporal modelling framework for predicting the rank change of hot search keywords using past rank data and machine learning. Past rank data shows that more than 70% of hot search keywords tend to disappear and reappear later. We processed missing rank values using deletion, dummy variables, mean substitution, and expectation maximization. It is, however, crucial to calculate the optimal window size of the past rank data. We propose an optimal window size selection approach based on the minimum amount of time for which a topic within the same or a differing context disappeared. The experiments were conducted with four different machine-learning techniques using the Naver, Daum, and Nate 'Hot Search Keywords' datasets, which were collected over 2 years.
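Two of the steps named in the abstract, mean substitution for missing ranks and windowing of past rank data, can be sketched as follows. This is an illustration under assumptions: missing ranks are encoded as `None`, the window size is passed in rather than selected by the paper's method, and the function names are hypothetical.

```python
def mean_substitute(ranks):
    """Replace missing ranks (None) with the mean of the observed ranks."""
    observed = [r for r in ranks if r is not None]
    mean = sum(observed) / len(observed)
    return [mean if r is None else r for r in ranks]


def sliding_windows(series, window):
    """Turn a rank series into (past window, next rank) training pairs."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


# Hypothetical keyword: it drops off the list (None) and later reappears.
raw_ranks = [3, 5, None, None, 4, 2]
filled = mean_substitute(raw_ranks)
pairs = sliding_windows(filled, window=3)
```

Each pair in `pairs` holds three past ranks and the rank that followed, which is the shape a supervised learner would need.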

Horse race rank prediction using learning-to-rank approaches (Learning-to-rank 기법을 활용한 서울 경마경기 순위 예측)

  • Junhyoung Chung;Donguk Shin;Seyong Hwang;Gunwoong Park
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.239-253 / 2024
  • This research applies both point-wise and pair-wise learning strategies within the learning-to-rank (LTR) framework to predict horse race rankings in Seoul. Specifically, for point-wise learning, we employ a linear model and random forest. In contrast, for pair-wise learning, we utilize tools such as RankNet and LambdaMART (XGBoost Ranker, LightGBM Ranker, and CatBoost Ranker). Furthermore, to enhance predictions, race records are standardized based on race distance, and we integrate various datasets, including race information, jockey information, horse training records, and trainer information. Our results empirically demonstrate that pair-wise learning approaches, which can reflect the order information between items, generally outperform point-wise learning approaches. Notably, CatBoost Ranker is the top performer. Through Shapley value analysis, we identified that the important variables for CatBoost Ranker include the performance of a horse, its previous race records, the count of its starting trainings, the total number of starting trainings, and the instances of disease diagnoses for the horse.
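The difference between point-wise and pair-wise learning comes down to what a training example is. The sketch below shows only the pair-construction step for one race, under the assumption that each horse has a numeric feature vector and a finishing position. It is not RankNet or LambdaMART, and the function name and feature values are hypothetical.

```python
from itertools import combinations


def pairwise_examples(features, finish_positions):
    """Build (feature difference, label) pairs for the horses in one race.

    label is 1 when the first horse of the pair finished ahead of the second,
    else 0. Horses with equal positions carry no order information and are
    skipped.
    """
    examples = []
    for i, j in combinations(range(len(features)), 2):
        if finish_positions[i] == finish_positions[j]:
            continue
        diff = [a - b for a, b in zip(features[i], features[j])]
        label = 1 if finish_positions[i] < finish_positions[j] else 0
        examples.append((diff, label))
    return examples


# Hypothetical race with three horses and two features each.
race_features = [[1.0, 2.0], [0.5, 1.0], [0.0, 3.0]]
race_positions = [1, 2, 3]
race_pairs = pairwise_examples(race_features, race_positions)
```

A point-wise model would instead regress each horse's position on its own features, so it never sees which horse beat which.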

Identification of Heterogeneous Prognostic Genes and Prediction of Cancer Outcome using PageRank (페이지랭크를 이용한 암환자의 이질적인 예후 유전자 식별 및 예후 예측)

  • Choi, Jonghwan;Ahn, Jaegyoon
    • Journal of KIISE / v.45 no.1 / pp.61-68 / 2018
  • The identification of genes that contribute to the prediction of prognosis in patients with cancer is one of the challenges in providing appropriate therapies. To find the prognostic genes, several classification models using gene expression data have been proposed. However, the prediction accuracy of cancer prognosis is limited due to the heterogeneity of cancer. In this paper, we integrate microarray data with biological network data using a modified PageRank algorithm to identify prognostic genes. We also predict the prognosis of patients with 6 cancer types (including breast carcinoma) using the K-Nearest Neighbor algorithm. Before we apply the modified PageRank, we separate samples by K-Means clustering to address the heterogeneity of cancer. The proposed algorithm showed better performance than traditional algorithms for prognosis. We were also able to identify cluster-specific biological processes using GO enrichment analysis.
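For orientation, standard PageRank can be sketched as a power iteration over a directed graph. This is the textbook algorithm, not the modified version the abstract refers to, whose details are not given here. The damping factor of 0.85 and the toy gene network are assumptions.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """Standard PageRank by power iteration.

    adjacency maps each node to the list of nodes it links to; every link
    target must also appear as a key. Nodes without outgoing links spread
    their score evenly over all nodes.
    """
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = adjacency[v] if adjacency[v] else nodes
            share = damping * rank[v] / len(targets)
            for u in targets:
                new_rank[u] += share
        rank = new_rank
    return rank


# Hypothetical three-gene interaction network.
network = {"geneA": ["geneC"], "geneB": ["geneC"], "geneC": ["geneA"]}
scores = pagerank(network)
```

In this toy network `geneC` receives links from both other genes and ends up with the highest score.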

Sensitivity Analysis of Creep and Shrinkage Effects of Prestressed Concrete Bridges (프리스트레스트 콘크리트 교량의 크리프와 건조수축효과의 민감도 해석)

  • 오병환;양인환
    • Proceedings of the Korea Concrete Institute Conference / 1998.10b / pp.656-661 / 1998
  • This paper presents a method of statistical analysis and sensitivity analysis of creep and shrinkage effects in PSC box girder bridges. The statistical and sensitivity analyses are performed by numerical simulation with Latin Hypercube sampling. For each sample, a time-dependent structural analysis is performed to produce response data, which are then statistically analyzed. The probabilistic prediction of the confidence limits on the long-term effects of creep and shrinkage is then expressed. Three measures are examined to quantify the sensitivity of the outputs to each of the input variables: the rank correlation coefficient (RCC), the partial rank correlation coefficient (PRCC), and the standardized rank regression coefficient (SRRC), all computed on the ranks of the observations. The probability band widens with time, which indicates that prediction uncertainty increases with time. The creep model uncertainty factor and the relative humidity appear as the most dominant factors with regard to the model output uncertainty.
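Latin Hypercube sampling, the simulation technique named in the abstract, can be sketched on the unit hypercube as below. Mapping each coordinate to a physical input distribution (for example relative humidity) is left out, and the function name and seed handling are assumptions.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Draw n_samples points in [0, 1)^n_dims by Latin Hypercube sampling.

    Each dimension is cut into n_samples equal strata, one value is drawn
    uniformly inside every stratum, and the strata are shuffled independently
    per dimension before being combined into points.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


# Ten samples of three hypothetical uncertain inputs.
samples = latin_hypercube(10, 3, seed=1)
```

Every stratum of every input is hit exactly once, which is why the method covers the input range with far fewer runs than plain random sampling.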


Statistical and Probabilistic Assessment for the Misorientation Angle of a Grain Boundary for the Precipitation of $Cr_2N$ in an Austenitic Stainless Steel (II) (질화물 우선석출이 발생하는 결정립계 어긋남 각도의 통계 및 확률적 평가 (II))

  • Lee, Sang-Ho;Choe, Byung-Hak;Lee, Tae-Ho;Kim, Sung-Joon;Yoon, Kee-Bong;Kim, Seon-Hwa
    • Korean Journal of Metals and Materials / v.46 no.9 / pp.554-562 / 2008
  • The distribution and prediction interval for the misorientation angle of the grain boundary at which $Cr_2N$ precipitated during heating at $900^{\circ}C$ for $10^4$ sec were newly estimated, following the earlier estimation by the mathematical and median rank methods. The probability density function of the misorientation angle can be estimated by statistical analysis. The $(1-{\alpha})100\%$ prediction interval of the misorientation angle is then obtained from the estimated probability density function. If the estimated probability density function is symmetric, a prediction interval for the misorientation angle can be derived from it directly. For a non-symmetric probability density function, the prediction interval can be obtained from the cumulative distribution function of the estimated probability density function. In this paper, 95, 99 and 99.73% prediction intervals are obtained by the probability density function method and the cumulative distribution function method and compared with the former results from median rank regression or the mathematical method.
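The cumulative distribution function route to a $(1-\alpha)100\%$ prediction interval can be sketched as an equal-tailed interval read off the inverse CDF. The normal distribution and its parameters below are placeholders chosen for illustration; the abstract does not state which density the authors fitted.

```python
from statistics import NormalDist


def prediction_interval(inv_cdf, coverage):
    """Equal-tailed interval: cut alpha/2 of probability from each tail."""
    alpha = 1.0 - coverage
    return inv_cdf(alpha / 2.0), inv_cdf(1.0 - alpha / 2.0)


# Hypothetical fitted distribution of the misorientation angle (degrees).
fitted = NormalDist(mu=40.0, sigma=5.0)
intervals = {
    coverage: prediction_interval(fitted.inv_cdf, coverage)
    for coverage in (0.95, 0.99, 0.9973)
}
```

Because only `inv_cdf` is needed, the same function works for a non-symmetric distribution once its inverse CDF is available. For the normal placeholder, the 99.73% interval is roughly the mean plus or minus three standard deviations.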

One-month lead dam inflow forecast using climate indices based on tele-connection (원격상관 기후지수를 활용한 1개월 선행 댐유입량 예측)

  • Cho, Jaepil;Jung, Il Won;Kim, Chul Gyium;Kim, Tae Guk
    • Journal of Korea Water Resources Association / v.49 no.5 / pp.361-372 / 2016
  • Reliable long-term dam inflow prediction is necessary for efficient multi-purpose dam operation in a changing climate. Since the 2000s, the teleconnection between global climate indices (e.g., ENSO) and local hydroclimate regimes has been widely recognized throughout the world. To date, many hydrologists have focused on predicting future hydrologic conditions using the lagged teleconnection between streamflow and climate indices. This study investigated the utility of teleconnection for predicting dam inflow with a 1-month lead time in the Andong dam basin. To this end, 40 global climate indices from NOAA were employed to identify potential predictors of dam inflow and of the areal-averaged precipitation and temperature of the Andong dam basin. This study compared three different approaches: 1) dam inflow prediction using the SWAT model based on teleconnection-based precipitation and temperature forecasts (SWAT-Forecasted), 2) dam inflow prediction using the teleconnection between dam inflow and climate indices (CIR-Forecasted), and 3) dam inflow prediction based on the rank of the current observation in the historical dam inflow record (Rank-Observed). Our results demonstrated that CIR-Forecasted showed better predictability than the other approaches, except in December. This is because the CIR-Forecasted approach avoids the uncertainties attributed to the temporal downscaling of precipitation and temperature forecasts from monthly to daily and to hydrologic modeling with SWAT. This study indicates that a 1-month lead dam inflow forecast based on teleconnection could provide useful information for Andong dam operation.
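One possible reading of the Rank-Observed approach is sketched below: find where the current month's inflow ranks within that month's historical record, then take the inflow at the same relative rank in the next month's record. This interpretation is an assumption, since the abstract does not spell out the procedure, and all names and numbers are hypothetical.

```python
def percentile_rank(history, value):
    """Fraction of historical values below `value` (ties count as half)."""
    below = sum(1 for h in history if h < value)
    equal = sum(1 for h in history if h == value)
    return (below + 0.5 * equal) / len(history)


def value_at_rank(history, fraction):
    """Historical value sitting at the given relative rank (0 to 1)."""
    ordered = sorted(history)
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]


# Hypothetical monthly inflow records (same units) for two consecutive months.
june_history = [10.0, 20.0, 30.0, 40.0]
july_history = [5.0, 15.0, 25.0, 35.0]
june_observed = 25.0
july_forecast = value_at_rank(july_history, percentile_rank(june_history, june_observed))
```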

Variable Selection with Nonconcave Penalty Function on Reduced-Rank Regression

  • Jung, Sang Yong;Park, Chongsun
    • Communications for Statistical Applications and Methods / v.22 no.1 / pp.41-54 / 2015
  • In this article, we propose nonconcave penalties on a reduced-rank regression model to select variables and estimate coefficients simultaneously. We apply the HARD (hard thresholding) and SCAD (smoothly clipped absolute deviation) penalties, which are symmetric penalty functions with singularities at the origin and are bounded by a constant to reduce bias. In our simulation study and real data analysis, the new method is compared with an existing variable selection method using the $L_1$ penalty and exhibits competitive performance in prediction and variable selection. Instead of using only one type of penalty function, we use two or three penalty functions simultaneously and take advantage of the various types together to select relevant predictors and estimate coefficients, improving the overall performance of model fitting.
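The two penalties named in the abstract can be written out directly. The sketch below uses their commonly cited scalar forms, with the conventional default $a = 3.7$ for SCAD. Whether the paper uses exactly these parameterizations is an assumption, and the function names are hypothetical.

```python
def hard_penalty(theta, lam):
    """Hard thresholding penalty: lam^2 - (|theta| - lam)^2 for |theta| < lam,
    and the constant lam^2 otherwise."""
    t = abs(theta)
    if t < lam:
        return lam ** 2 - (t - lam) ** 2
    return lam ** 2


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty: linear near zero, quadratic in between, then constant."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t ** 2 - lam ** 2) / (2.0 * (a - 1.0))
    return lam ** 2 * (a + 1.0) / 2.0


# Evaluate both penalties on a few hypothetical coefficient values.
grid = [-5.0, -1.0, -0.5, 0.0, 0.5, 1.0, 5.0]
hard_values = [hard_penalty(t, lam=1.0) for t in grid]
scad_values = [scad_penalty(t, lam=1.0) for t in grid]
```

Both functions are symmetric in `theta`, have a kink at zero, and flatten to a constant for large coefficients. An $L_1$ penalty keeps growing instead, which is the source of its bias on large coefficients.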

Direction-of-Arrival Estimation Using Linear Prediction Method in Conjunction with Signal Enhancement Approach (신호부각법과 결합된 선형예측방법을 이용한 도래각 추정)

  • 오효성
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.10 no.6 / pp.959-967 / 1999
  • In this paper, we propose a Linear Prediction Method (LPM) in conjunction with signal enhancement for solving the direction-of-arrival estimation problem of multiple incoherent plane waves incident on a uniform linear array. The basic idea of signal enhancement is to find the covariance matrix of a given rank that lies closest to a given estimated matrix in the Frobenius norm sense. It is well known that LPM has high-resolution performance in general applications but lower statistical performance in low-SNR environments. To solve this problem, the LPM combined with a signal enhancement approach is proposed herein. Simulation results demonstrate that the proposed method performs better than the conventional LPM.
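The idea of finding the closest matrix of a given rank in the Frobenius norm can be illustrated for the simplest case, rank 1, on a real matrix. By the Eckart-Young theorem the answer is the leading singular component, which the sketch below finds by power iteration. The paper's method works on covariance matrices and may impose further structure, so this is a simplified stand-in with hypothetical names.

```python
import random


def best_rank_one(matrix, iterations=200, seed=0):
    """Closest rank-1 matrix to `matrix` in the Frobenius norm (real case).

    Power iteration on A^T A converges to the dominant right singular vector
    v (assuming a unique largest singular value); the result is (A v) v^T.
    """
    rows, cols = len(matrix), len(matrix[0])
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(cols)]  # random positive start
    for _ in range(iterations):
        av = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return [[0.0] * cols for _ in range(rows)]
        v = [x / norm for x in w]
    av = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[av[i] * v[j] for j in range(cols)] for i in range(rows)]


# Hypothetical 2x2 estimate with one strong and one weak component.
estimate = [[3.0, 0.0], [0.0, 1.0]]
enhanced = best_rank_one(estimate)
```

For a general target rank, the same result follows from keeping the largest singular values of a full singular value decomposition.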


A Study on the Relation of Top-N Recommendation and the Rank Fitting of Prediction Value through an Improved Collaborative Filtering Algorithm (협력적 필터링 알고리즘의 예측 선호도 순위 일치와 Top-N 추천에 관한 연구)

  • Lee, Seok-Jun;Lee, Hee-Choon
    • Journal of Korea Society of Industrial Information Systems / v.12 no.4 / pp.65-73 / 2007
  • This study compares the accuracy of Top-N recommendations of items transacted on a web site with the accuracy of rank conformity between real ratings and estimated ratings of customers' item preferences, where the estimates are generated by two types of collaborative filtering algorithms. One is the Neighborhood Based Collaborative Filtering Algorithm (NBCFA) and the other is the Correspondence Mean Algorithm (CMA). The results show that both the accuracy of Top-N recommendations and the rank conformity of real ratings with estimated ratings are better for CMA than for NBCFA. Customer satisfaction with a recommender system is therefore expected to improve more when using the prediction results from CMA than those from NBCFA, so using CMA in a collaborative filtering recommender system is more efficient than using NBCFA.
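One simple way to score a Top-N recommendation list against real ratings is the share of recommended items that also appear in the user's actual top N. The sketch below uses that overlap measure as an illustration. The abstract does not define the paper's accuracy measure, and the names and ratings are hypothetical.

```python
def top_n(ratings, n):
    """Items with the n highest ratings; ties are broken by item name."""
    ordered = sorted(ratings.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ordered[:n]]


def top_n_overlap(estimated, actual, n):
    """Share of the estimated top-N list that is also in the actual top-N."""
    return len(set(top_n(estimated, n)) & set(top_n(actual, n))) / n


# Hypothetical ratings for one customer.
actual_ratings = {"item1": 5, "item2": 4, "item3": 3, "item4": 1}
estimated_ratings = {"item1": 4.8, "item2": 2.0, "item3": 4.1, "item4": 1.5}
overlap_at_2 = top_n_overlap(estimated_ratings, actual_ratings, 2)
overlap_at_3 = top_n_overlap(estimated_ratings, actual_ratings, 3)
```

Computing this overlap for the estimates from each of two algorithms would give a per-customer comparison of the kind the abstract reports between CMA and NBCFA.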
