• Title/Summary/Keyword: fused lasso

Comparison of Lasso Type Estimators for High-Dimensional Data

  • Kim, Jaehee
    • Communications for Statistical Applications and Methods / v.21 no.4 / pp.349-361 / 2014
  • This paper compares lasso-type estimators in various high-dimensional data situations with sparse parameters. The lasso, adaptive lasso, fused lasso, and elastic net (as lasso-type estimators) and the ridge estimator are compared via simulation in linear models with correlated and uncorrelated covariates and in binary regression models with correlated covariates and discrete covariates. Each method is shown to have advantages under different penalty conditions, depending on the sparsity pattern of the regression parameters. We applied the lasso-type methods to Arabidopsis microarray gene expression data to find the strongly significant genes that distinguish two groups.
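
For orientation, the estimators named in this abstract differ only in the penalty added to the loss. The abstract gives no formulas, so the notation below ($\beta_j$ for coefficients, $\lambda$, $\lambda_1$, $\lambda_2$ for tuning parameters, $w_j$ for weights) is assumed here, following the usual textbook forms:

$$
\begin{aligned}
\text{ridge:} &\quad \lambda \sum_{j=1}^{p} \beta_j^2 \\
\text{lasso:} &\quad \lambda \sum_{j=1}^{p} |\beta_j| \\
\text{adaptive lasso:} &\quad \lambda \sum_{j=1}^{p} w_j |\beta_j| \\
\text{elastic net:} &\quad \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=1}^{p} \beta_j^2 \\
\text{fused lasso:} &\quad \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=2}^{p} |\beta_j - \beta_{j-1}|
\end{aligned}
$$

In the adaptive lasso the weights are typically data-driven, for example $w_j = 1/|\tilde{\beta}_j|^{\gamma}$ for some initial estimate $\tilde{\beta}$ and $\gamma > 0$. The fused lasso's second term penalizes differences between neighbouring coefficients, so it only makes sense when the covariates have a natural ordering.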

Detection of multiple change points using penalized least square methods: a comparative study between ℓ0 and ℓ1 penalty (벌점-최소제곱법을 이용한 다중 변화점 탐색)

  • Son, Won;Lim, Johan;Yu, Donghyeon
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1147-1154 / 2016
  • In this paper, we numerically compare two penalized least squares methods, the ${\ell}_0$-penalized method and fused lasso regression (FLR, ${\ell}_1$ penalization), for finding multiple change points in a signal. We find that the ${\ell}_0$-penalized method performs better than the FLR, which in some cases produces many false detections, as theory predicts. In addition, the computation of the ${\ell}_0$-penalized method relies on dynamic programming and is as efficient as that of the FLR.
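
The dynamic-programming idea behind an ${\ell}_0$-penalized fit can be illustrated with a minimal sketch. It assumes a squared-error loss plus a fixed cost per change point; the function name `l0_change_points` and the `penalty` argument are hypothetical, and this plain $O(n^2)$ optimal-partitioning recursion is a textbook scheme, not necessarily the algorithm used in the paper.

```python
def l0_change_points(y, penalty):
    """Minimise residual sum of squares + penalty * (number of change points)
    over all piecewise constant fits, by optimal-partitioning dynamic programming."""
    n = len(y)
    # Prefix sums give the squared-error cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def seg_cost(a, b):
        # Residual sum of squares of y[a:b] around its own mean.
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: optimal penalised cost of y[:t]
    best[0] = -penalty      # cancels the penalty charged for the first segment
    last = [0] * (n + 1)    # last[t]: start index of the final segment in that optimum
    for t in range(1, n + 1):
        best[t], last[t] = min(
            (best[s] + seg_cost(s, t) + penalty, s) for s in range(t)
        )

    # Backtrack the segment starts; every start other than 0 is a change point.
    change_points = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            change_points.append(s)
        t = s
    return sorted(change_points)
```

A call such as `l0_change_points(signal, penalty)` returns the indices at which a new constant segment starts. Larger values of `penalty` yield fewer change points, which is the role the tuning parameter plays in the comparison described above.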

Spatial Clustering Method Via Generalized Lasso (Generalized Lasso를 이용한 공간 군집 기법)

  • Song, Eunjung;Choi, Hosik;Hwang, Seungsik;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.561-575 / 2014
  • In this paper, we propose a penalized likelihood method to detect local spatial clusters associated with disease. The key computational algorithm is based on genlasso by Tibshirani and Taylor (2011). The proposed method has two main advantages over Kulldorff's method, which is popular for detecting local spatial clusters. First, there is no need to specify a proper cluster size a priori. Second, any type of covariate can be incorporated, so it is possible to find local spatial clusters adjusted for demographic variables. We illustrate the proposed method using tuberculosis data from Seoul.

Genomic Selection for Adjacent Genetic Markers of Yorkshire Pigs Using Regularized Regression Approaches

  • Park, Minsu;Kim, Tae-Hun;Cho, Eun-Seok;Kim, Heebal;Oh, Hee-Seok
    • Asian-Australasian Journal of Animal Sciences / v.27 no.12 / pp.1678-1683 / 2014
  • This study considers the problem of genomic selection (GS) for adjacent genetic markers of Yorkshire pigs, which are typically correlated. GS has been widely used to efficiently estimate target variables such as molecular breeding values using markers across the entire genome. Recently, GS has been applied to animals as well as plants, especially to pigs. For efficient selection of variables with specific traits in pig breeding, a variable selection method should have the following properties: i) it produces a simple model by identifying insignificant variables; ii) it improves the accuracy of predictions on future data; and iii) it can handle high-dimensional data in which the number of variables is larger than the number of observations. In this paper, we applied several variable selection methods, including the least absolute shrinkage and selection operator (LASSO), fused LASSO, and elastic net, to data with 47K single nucleotide polymorphisms and litter size for 519 observed sows. In our experiments, the fused LASSO outperformed the other approaches.

Investigating spatial clusters of single-person households and low-income elderly single-person using penalized likelihood (벌칙가능도함수를 이용한 1인가구와 저소득 독거노인의 공간군집 탐색)

  • Song, Eunjung;Lee, Woojoo
    • Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1257-1260 / 2017
  • Single-person households have recently been increasing rapidly, and one reason may be the increase in elderly single-person households. Since changes in living patterns are relevant to the direction of government policy, it is important to understand how single-person households are clustered and which factors influence them. In this study, we tried to detect spatial clusters of single-person households and low-income elderly single-person households after adjusting for the deprivation index. A recently developed fused lasso for Poisson data was used for the data analysis, and we provide details on how to use it in R. From the analysis results, we observed the effect of socioeconomic level on the clusters and explained why spatial clusters appear after adjusting for the deprivation index.

Improvement of inspection system for common crossings by track side monitoring and prognostics

  • Sysyn, Mykola;Nabochenko, Olga;Kovalchuk, Vitalii;Gruen, Dimitri;Pentsak, Andriy
    • Structural Monitoring and Maintenance / v.6 no.3 / pp.219-235 / 2019
  • Scheduled inspections of common crossings are one of the main cost drivers of railway maintenance. The prognostics and health management (PHM) approach and modern monitoring means offer many possibilities for optimizing inspections and maintenance. The present paper deals with data-driven prognosis of the common crossing remaining useful life (RUL) based on an inertial monitoring system. The problem of the scheduled inspection system for common crossings is outlined and analysed. The proposed analysis of inertial signals with the maximal overlap discrete wavelet packet transform (MODWPT) and Shannon entropy (SE) estimates makes it possible to extract the spectral features. The relevant features for the acceleration components are selected by applying Lasso (least absolute shrinkage and selection operator) regularization. The features are fused with time-domain information about the longitudinal position of wheel impacts and train velocities by multivariate regression. The fused structural health (SH) indicator has a significant correlation with the lifetime of the crossing. The RUL prognosis is performed with a linear degradation stochastic model with recursive Bayesian updating. Prognosis testing metrics show promising results for improving common crossing inspection scheduling.

Efficient Compression Algorithm with Limited Resource for Continuous Surveillance

  • Yin, Ling;Liu, Chuanren;Lu, Xinjiang;Chen, Jiafeng;Liu, Caixing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.11 / pp.5476-5496 / 2016
  • Energy efficiency of resource-constrained wireless sensor networks is critical in applications such as real-time monitoring/surveillance. To improve energy efficiency and reduce energy consumption, time series data can be compressed before transmission. However, most compression algorithms for time series data were developed only for univariate scenarios, while in practice there are often multiple sensor nodes in one application and the collected data are in fact multivariate time series. In this paper, we propose to compress time series data by the Lasso (least absolute shrinkage and selection operator) approximation. We show that our approach can be naturally extended to compress multivariate time series data. Our extension is novel in that it constructs an optimal projection of the original multivariate data in which the best energy efficiency can be realized. The two algorithms are named ULasso (Univariate Lasso) and MLasso (Multivariate Lasso), and we also provide practical guidance for parameter selection. Finally, an empirical evaluation is carried out on several publicly available real-world data sets from different application domains. We quantify algorithm performance by measuring the approximation error, compression ratio, and computational complexity. The results show that the compression performance of ULasso and MLasso is superior or at least equivalent to that of LTC and PLAMlis. In particular, MLasso can significantly reduce smooth multivariate time series data without breaking the major trends and important changes of the sensor network system.

Permutation test for a post selection inference of the FLSA (순열검정을 이용한 FLSA의 사후추론)

  • Choi, Jieun;Son, Won
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.863-874 / 2021
  • In this paper, we propose a post-selection inference procedure for the fused lasso signal approximator (FLSA). The FLSA finds an underlying sparse piecewise constant mean structure by applying the total variation (TV) semi-norm as a penalty term. However, it is widely known that this convex relaxation can cause asymptotic inconsistency in change point detection. As a result, false change points can remain even when we try to find the best subset of change points via a tuning procedure. To remove these false change points, we propose a post-selection inference for the FLSA. The proposed procedure applies a permutation test based on the CUSUM statistic. Our post-selection inference procedure extends the permutation test of Antoch and Hušková (2001), which deals with single change point problems, to multiple change point detection problems in combination with the FLSA. Numerical results show that the proposed procedure performs better than naïve z-tests and tests based on the limiting distribution of CUSUM statistics.
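
The single change point building block mentioned in this abstract, a permutation test on a CUSUM statistic, can be sketched as follows. This is only an illustration under stated assumptions: the function names `max_cusum` and `cusum_permutation_pvalue`, the weighting of the statistic, and the defaults `n_perm` and `seed` are choices made here, and the sketch does not reproduce the paper's multiple change point procedure or its combination with the FLSA.

```python
import math
import random


def max_cusum(y):
    """Return (statistic, k): the largest weighted CUSUM value and the split index attaining it."""
    n = len(y)
    total = sum(y)
    best_stat, best_k = 0.0, 0
    partial = 0.0
    for k in range(1, n):
        partial += y[k - 1]
        # Gap between the sum of the first k points and its expected value under
        # "no change", scaled so that every split index k is comparable.
        stat = abs(partial - k * total / n) / math.sqrt(k * (n - k) / n)
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


def cusum_permutation_pvalue(y, n_perm=999, seed=0):
    """Permutation p-value for 'no change in mean' against a single change point."""
    rng = random.Random(seed)
    observed, _ = max_cusum(y)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling destroys any ordering, which mimics the no-change hypothesis.
        rng.shuffle(shuffled)
        if max_cusum(shuffled)[0] >= observed:
            exceed += 1
    # The add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

The statistic is not divided by a standard deviation estimate because the sample variance is the same for every permutation, so that scaling would not change the p-value. In a post-selection setting, a test of this kind would be applied to candidate change points returned by the FLSA, and candidates with large p-values would be treated as false detections.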

An empirical evidence of inconsistency of the ℓ1 trend filtering in change point detection (ℓ1 추세필터의 변화점 식별에 있어서의 비일치성)

  • Yu, Donghyeon;Lim, Johan;Son, Won
    • The Korean Journal of Applied Statistics / v.35 no.3 / pp.371-384 / 2022
  • The fused LASSO signal approximator (FLSA) can be applied to find change points in data with a piecewise constant mean structure. It is well known that the FLSA is inconsistent in change point detection. This inconsistency is due to the total-variation denoising penalty of the FLSA. The ℓ1 trend filter, one of the popular tools for finding an underlying trend in data, can be used to identify change points of piecewise linear trends. Since the ℓ1 trend filter penalizes the sum of absolute values of slope differences, it can be inconsistent for change point recovery, like the FLSA. However, there are few studies on the inconsistency of ℓ1 trend filtering. In this paper, we demonstrate the inconsistency of ℓ1 trend filtering with a numerical study.
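
The parallel between the two penalties can be made explicit. The abstract states no formulas, so the notation here is assumed: observations $y_1, \dots, y_n$, fitted values $\theta_i$, and tuning parameter $\lambda$, with the FLSA's additional sparsity term $\sum_i |\theta_i|$ omitted for simplicity. In their usual forms the two estimators are

$$
\hat{\theta}^{\mathrm{FLSA}} = \arg\min_{\theta} \; \frac{1}{2} \sum_{i=1}^{n} (y_i - \theta_i)^2 + \lambda \sum_{i=2}^{n} |\theta_i - \theta_{i-1}|
$$

$$
\hat{\theta}^{\mathrm{TF}} = \arg\min_{\theta} \; \frac{1}{2} \sum_{i=1}^{n} (y_i - \theta_i)^2 + \lambda \sum_{i=2}^{n-1} |\theta_{i-1} - 2\theta_i + \theta_{i+1}|
$$

The first penalty is on first differences and favours piecewise constant fits. The second is on second differences, that is, on changes in slope, and favours piecewise linear fits. Both are ${\ell}_1$ relaxations of a count of change points, which is why the inconsistency known for the FLSA is a natural concern for ℓ1 trend filtering as well.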