• Title/Summary/Keyword: linear probability model

High-dimensional linear discriminant analysis with moderately clipped LASSO

  • Chang, Jaeho;Moon, Haeseong;Kwon, Sunghoon
    • Communications for Statistical Applications and Methods / v.28 no.1 / pp.21-37 / 2021
  • Linear discriminant analysis (LDA) is directly connected to linear regression: the LDA direction vector can be obtained by least squares estimation. This connection motivates penalized LDA for high-dimensional models, where the number of predictor variables exceeds the sample size. In this paper, we study penalized LDA for a class of penalties called the moderately clipped LASSO (MCL), which interpolates between the least absolute shrinkage and selection operator (LASSO) and the minimax concave penalty. We prove that the MCL-penalized LDA correctly identifies the sparsity of the Bayes direction vector with probability tending to one. Numerical studies support this result, showing better finite-sample performance than the LASSO.
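
The link between LDA and least squares mentioned in this abstract can be checked numerically. The following is a minimal sketch for the unpenalized, low-dimensional two-class case only; it does not implement the MCL penalty, and the simulated data, sample sizes, and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-class data: n samples per class, p predictors.
n, p = 200, 5
X0 = rng.normal(size=(n, p))
X1 = rng.normal(size=(n, p)) + np.array([1.0, 0.5, 0.0, 0.0, 0.0])
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 0/1 class labels

# LDA direction: inverse pooled within-class covariance times mean difference.
mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
C0 = X0 - X0.mean(axis=0)
C1 = X1 - X1.mean(axis=0)
S_within = (C0.T @ C0 + C1.T @ C1) / (2 * n - 2)
lda_direction = np.linalg.solve(S_within, mean_diff)

# Least squares direction: regress the labels on the predictors (with intercept).
design = np.column_stack([np.ones(2 * n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
ols_direction = coef[1:]

# The two directions should agree up to a positive scale factor.
cosine = lda_direction @ ols_direction / (
    np.linalg.norm(lda_direction) * np.linalg.norm(ols_direction)
)
print(f"cosine similarity: {cosine:.10f}")
```

A cosine similarity of one (up to rounding) means the two vectors point in the same direction, which is what makes a penalized regression a natural route to a sparse LDA direction.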

Linear prediction and z-transform based CDF-mapping simulation algorithm of multivariate non-Gaussian fluctuating wind pressure

  • Jiang, Lei;Li, Chunxiang;Li, Jinhua
    • Wind and Structures / v.31 no.6 / pp.549-560 / 2020
  • Methods for stochastic simulation of non-Gaussian wind pressure have increasingly focused on efficiency and accuracy in order to describe extreme values accurately for long-span and high-rise structures. This paper presents a linear prediction and z-transform (LPZ) based cumulative distribution function (CDF) mapping algorithm for simulating multivariate non-Gaussian fluctuating wind pressure. The new algorithm generates realizations of non-Gaussian processes with a prescribed marginal probability distribution function (PDF) and a prescribed spectral density function (PSD). The inverse linear prediction and z-transform function (ILPZ) is derived. LPZ is improved and applied to non-Gaussian wind pressure simulation for the first time. In two examples, for transverse softening and longitudinal hardening non-Gaussian wind pressures, the new algorithm is shown to be efficient, flexible, and more accurate than the FFT-based method and the Hermite polynomial model method.
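
The CDF-mapping step named in this abstract can be illustrated in isolation. The sketch below covers only the generic mapping x = F⁻¹(Φ(z)) from a standard Gaussian value to a target marginal; it is not the LPZ algorithm itself. The Gumbel target marginal, the independent Gaussian samples, and all function names are assumptions for illustration (a real simulation would start from a Gaussian process with the prescribed spectrum).

```python
import math
import random
from statistics import NormalDist


def gumbel_inverse_cdf(u, loc=0.0, scale=1.0):
    """Inverse CDF of a Gumbel distribution (assumed target marginal)."""
    return loc - scale * math.log(-math.log(u))


def cdf_map(z, inverse_cdf):
    """Map a standard Gaussian value z to the target marginal distribution."""
    u = NormalDist().cdf(z)  # Gaussian CDF gives a uniform value in (0, 1)
    return inverse_cdf(u)    # target inverse CDF gives the non-Gaussian value


random.seed(1)
gaussian_samples = [random.gauss(0.0, 1.0) for _ in range(1000)]
non_gaussian_samples = [cdf_map(z, gumbel_inverse_cdf) for z in gaussian_samples]
```

Because the mapping is monotone, it preserves the ordering of the underlying Gaussian samples while changing the marginal distribution.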

Comparing the efficiency of dispersion parameter estimators in gamma generalized linear models (감마 일반화 선형 모형에서의 산포 모수 추정량에 대한 효율성 연구)

  • Jo, Seongil;Lee, Woojoo
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.95-102 / 2017
  • Gamma generalized linear models have received less attention than Poisson and binomial generalized linear models. As a result, many long-established statistical techniques are still used for gamma generalized linear models. In particular, the existing literature and textbooks still use approximate estimates of the dispersion parameter. In this paper, we study the efficiency of various dispersion parameter estimators in gamma generalized linear models and perform numerical simulations. The numerical studies show that the maximum likelihood estimator and the Cox-Reid adjusted maximum likelihood estimator are recommended and that approximate estimates should be avoided in practice.
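
To make the contrast between an approximate and a likelihood-based dispersion estimate concrete, here is a minimal sketch for the simplest case: an intercept-only gamma model, where the dispersion equals one over the shape parameter. The Pearson-type moment estimator stands in for the "approximate" estimators, which is an assumption; the Cox-Reid adjustment is not shown. The simulated data and all function names are hypothetical.

```python
import math
import random


def digamma(x):
    """Digamma function via recurrence plus an asymptotic series (x > 0)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def dispersion_moment(y):
    """Pearson-type moment estimate: squared relative residuals over (n - 1)."""
    n = len(y)
    mean = sum(y) / n
    return sum(((v - mean) / mean) ** 2 for v in y) / (n - 1)


def dispersion_mle(y):
    """Maximum likelihood estimate: solve log(a) - digamma(a) = s for shape a."""
    n = len(y)
    mean = sum(y) / n
    s = math.log(mean) - sum(math.log(v) for v in y) / n
    lo, hi = 1e-3, 1e3
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisection on a log scale
        if math.log(mid) - digamma(mid) > s:
            lo = mid  # left side decreases in the shape, so the root is larger
        else:
            hi = mid
    return 1.0 / math.sqrt(lo * hi)


rng = random.Random(0)
sample = [rng.gammavariate(2.0, 1.5) for _ in range(5000)]  # true dispersion 0.5
print(dispersion_moment(sample), dispersion_mle(sample))
```

Both estimators target the same quantity; the abstract's point concerns their efficiency, which would show up as different sampling variability over repeated simulations.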

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

  • Jeon, Hyung-Bae;Lee, Soo-Young
    • ETRI Journal / v.38 no.3 / pp.487-493 / 2016
  • Two new methods are proposed for unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. In the training phase, training documents are clustered using latent Dirichlet allocation (LDA), and a domain-specific LM is then trained for each cluster. In the test phase, an adapted LM is formed as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize the trained LDA model to estimate the weight values assigned to the trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are reliable. In continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
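
The "linear mixture" of domain-specific LMs described here can be sketched with toy unigram models. The vocabularies, probabilities, and topic weights below are hypothetical; in the paper the weights would come from the trained LDA model, and the LMs would be n-gram models rather than unigram tables.

```python
def mixture_probability(word, topic_weights, domain_lms):
    """Adapted LM probability as a weighted sum of domain-specific LM probabilities."""
    return sum(w * lm.get(word, 0.0) for w, lm in zip(topic_weights, domain_lms))


# Hypothetical unigram LMs for two clusters (for example, finance and sports).
domain_lms = [
    {"stock": 0.6, "goal": 0.1, "the": 0.3},
    {"stock": 0.1, "goal": 0.6, "the": 0.3},
]

# Hypothetical topic probabilities for one input sentence; they sum to one.
topic_weights = [0.8, 0.2]

for word in ("stock", "goal", "the"):
    print(word, mixture_probability(word, topic_weights, domain_lms))
```

Since the weights sum to one and each component is a probability distribution, the mixture is again a valid distribution over the vocabulary.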

Studies on the Stochastic Generation of Synthetic Streamflow Sequences(I) -On the Simulation Models of Streamflow- (하천유량의 추계학적 모의발생에 관한 연구(I) -하천유량의 Simulation 모델에 대하여-)

  • 이순탁
    • Water for future / v.7 no.1 / pp.71-77 / 1974
  • This paper reviews several single-site generation models as a basis for developing a model that generates synthetic streamflow sequences for continuous streams, such as the main streams in Korea. First, the historical time series is examined with a time series technique, the correlogram, to determine whether a lag-one Markov model satisfactorily represents the historical data. The single-site models examined include an empirical model using the historical probability distribution of the random component; the linear autoregressive model (Markov model, or Thomas-Fiering model) using both logarithms of the data and Matalas's log-normal transformation equations; and a gamma distribution model.
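
A lag-one Markov generator of the kind reviewed here can be sketched as follows. This is a simplified version with constant mean, standard deviation, and lag-one correlation and with Gaussian noise; the seasonal parameters of the Thomas-Fiering form and the logarithmic transformations mentioned in the abstract are not included. The parameter values and function names are assumptions for illustration.

```python
import math
import random


def generate_lag_one_markov(n, mean, std, rho, seed=0):
    """Generate n synthetic flows from a stationary lag-one Markov (AR(1)) model."""
    rng = random.Random(seed)
    flows = [mean + std * rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        noise = rng.gauss(0.0, 1.0)
        # Next flow = mean + persistence term + scaled random component.
        flows.append(
            mean + rho * (flows[-1] - mean) + std * math.sqrt(1.0 - rho**2) * noise
        )
    return flows


def lag_one_correlation(series):
    """Sample lag-one autocorrelation, i.e. the first point of the correlogram."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


synthetic = generate_lag_one_markov(20000, mean=100.0, std=20.0, rho=0.6, seed=1)
print(lag_one_correlation(synthetic))
```

Note that an untransformed Gaussian model can produce negative flows when the standard deviation is large relative to the mean, which is one reason log-based variants are used.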

Evaluation of seismic fragility models for cut-and-cover railway tunnels (개착식 철도 터널 구조물의 기존 지진취약도 모델 적합성 평가)

  • Yang, Seunghoon;Kwak, Dongyoup
    • Journal of Korean Tunnelling and Underground Space Association / v.24 no.1 / pp.1-13 / 2022
  • A weighted linear combination of seismic fragility models previously developed for cut-and-cover railway tunnels is presented, and the appropriateness of the combined model is evaluated. The seismic fragility function is expressed as the cumulative probability function of a lognormal distribution in terms of peak ground acceleration. Combining independently developed models can reduce model uncertainty. Equal weights are applied to the four models. A new seismic fragility function is developed for each damage level by determining its median and standard deviation, which are the model metrics. Compared with fragility curves developed for bored tunnels, cut-and-cover tunnels for high-speed railway systems show a similar level of fragility. We postulate that this is due to the high seismic design standard for high-speed railway tunnels.
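
The equal-weight combination of lognormal fragility functions can be sketched as below. The four (median, standard deviation) pairs are hypothetical placeholders, not values from the paper, and the sketch only evaluates the weighted combination; it does not refit a single lognormal curve to the result.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def lognormal_fragility(pga, median, beta):
    """Probability of reaching a damage level at a given PGA (lognormal CDF)."""
    return _STD_NORMAL.cdf(math.log(pga / median) / beta)


def combined_fragility(pga, models, weights=None):
    """Weighted linear combination of fragility models; equal weights by default."""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    return sum(
        w * lognormal_fragility(pga, median, beta)
        for w, (median, beta) in zip(weights, models)
    )


# Hypothetical (median PGA in g, log-standard deviation) for four models.
hypothetical_models = [(0.40, 0.5), (0.50, 0.6), (0.45, 0.4), (0.60, 0.7)]
probability = combined_fragility(0.5, hypothetical_models)
print(probability)
```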

Compiler Analysis Framework Using SVM-Based Genetic Algorithm : Feature and Model Selection Sensitivity (SVM 기반 유전 알고리즘을 이용한 컴파일러 분석 프레임워크 : 특징 및 모델 선택 민감성)

  • Hwang, Cheol-Hun;Shin, Gun-Yoon;Kim, Dong-Wook;Han, Myung-Mook
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.4 / pp.537-544 / 2020
  • Techniques for evading detection, such as mutation and obfuscation, are advancing along with malware technology. In malware detection, detecting unknown malware is important, and Malware Authorship Attribution, which detects unknown malicious code by identifying the author of distributed malware, is being studied. In this paper, we extract the compiler information that affects binary-based author identification and investigate how sensitive classification efficiency is to feature selection, probability and non-probability models, and optimization across studies. In the experiments, feature selection by information gain and the support vector machine, which is a non-probability model, showed high efficiency. Among the optimization studies, the proposed framework achieved high classification accuracy through feature selection and model optimization, resulting in a 48% feature reduction and 53 faster execution speed. This study confirms the sensitivity of classification efficiency to feature selection, model, and optimization methods.

Development of Vehicular Load Model using Heavy Truck Weight Distribution (I) - Data Collection and Estimation of Single Truck Weight (중차량중량분포를 이용한 차량하중모형 개발(I) - 자료수집 및 단일차량 최대중량 예측)

  • Hwang, Eui-Seung
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.3A / pp.189-197 / 2009
  • In this study, truck weight data and the load effects of a single truck on bridges are analyzed to develop a new vehicular load model for a reliability-based bridge design code. A rational load model and the statistical properties of loads are important for developing a reliability-based design code. Truck weight data collected at four locations are used, together with data from four locations in other studies. The data are collected with WIM or BWIM systems, which are known to give reliable data. Typical truck types, dimensions, and axle weight distributions are determined. The probability distribution of the upper 20% of total truck weights is assumed to be Extreme Type I, and 100-year maximum truck weights are estimated by linear regression on probability paper. The load effects of trucks with the estimated maximum weights are analyzed for span lengths from 10 m to 200 m.
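
Linear regression on Extreme Type I (Gumbel) probability paper can be sketched as follows: the sorted weights are plotted against the reduced variate −ln(−ln p), a straight line is fitted, and the line is extrapolated to a small exceedance probability. The plotting position i/(n+1), the synthetic weights, and the number of trucks assumed for the extrapolation are all illustrative assumptions, not values from the study.

```python
import math


def gumbel_reduced_variate(p):
    """Reduced variate of the Extreme Type I distribution for probability p."""
    return -math.log(-math.log(p))


def gumbel_paper_fit(weights):
    """Least squares line through (reduced variate, sorted weight) points."""
    ordered = sorted(weights)
    n = len(ordered)
    reduced = [gumbel_reduced_variate(i / (n + 1)) for i in range(1, n + 1)]
    mean_x = sum(reduced) / n
    mean_y = sum(ordered) / n
    sxx = sum((x - mean_x) ** 2 for x in reduced)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reduced, ordered))
    scale = sxy / sxx               # slope of the fitted line
    loc = mean_y - scale * mean_x   # intercept of the fitted line
    return loc, scale


def extrapolate_maximum(loc, scale, n_events):
    """Weight exceeded on average once in n_events trucks (assumed count)."""
    return loc + scale * gumbel_reduced_variate(1.0 - 1.0 / n_events)


# Synthetic weights (tonnes) lying exactly on a Gumbel line: loc 40, scale 5.
synthetic_weights = [40.0 + 5.0 * gumbel_reduced_variate(i / 51) for i in range(1, 51)]
loc, scale = gumbel_paper_fit(synthetic_weights)
print(loc, scale, extrapolate_maximum(loc, scale, n_events=1_000_000))
```

The extrapolated value depends strongly on the assumed number of trucks over the design period, so that count should come from measured traffic volumes.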

Autocovariance based estimation in the linear regression model (선형회귀 모형에서 자기공분산 기반 추정)

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.839-847 / 2011
  • In this study, we derive an autocovariance-based estimator of the regression coefficient vector in the multiple linear regression model. The method was suggested by Park (2009); although it does not seem intuitively attractive, the estimator is unbiased for the regression coefficient vector. When the vectors of explanatory variables satisfy some regularity conditions, and under mild conditions that hold when the errors follow autoregressive and moving average models, the estimator has asymptotically the same distribution as the least squares estimator and converges in probability to the regression coefficient vector. Finally, we provide a simulation study showing that the aforementioned theoretical results hold in small samples.

General picture of co-nonsolvency for linear and ring polymers

  • Park, Gyehyun;Lee, Eunsang;Jung, YounJoon
    • Proceeding of EDISON Challenge / 2016.03a / pp.147-154 / 2016
  • Co-nonsolvency is a puzzling phenomenon in which a polymer swells in each good solvent individually but collapses in a mixture of good solvents. This structural transition under a changing solvent environment has drawn attention because of its practical applications in stimuli-responsive polymers. The aim of this work is to describe the physical origin of co-nonsolvency. We present Monte Carlo simulations of polymer solutions using a simple and general model. We simulate linear and ring polymers to compare their co-nonsolvency behaviors. Flory exponents and bridging fractions give a good description of the polymer structures. While the polymer structure behaves non-monotonically as the cosolvent fraction increases, the chemical potential decreases monotonically. This indicates that the coil-to-globule transition of polymers is controlled purely by free energy and can be regarded as a thermodynamic transition. We also show that ring polymers have a higher looping probability than linear polymers, so their bridging fraction remains higher at high cosolvent fractions. Our study provides a new perspective for understanding polymer structure when the polymer "dissolves well" in any solvent.
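
A Flory exponent is commonly estimated as the slope of a log-log fit of polymer size against chain length, since size scales as N^ν. The sketch below shows only that fitting step on synthetic data; the chain lengths, prefactor, and exponent used to generate the data are assumptions, and no Monte Carlo simulation is performed.

```python
import math


def fit_flory_exponent(chain_lengths, sizes):
    """Slope of log(size) versus log(chain length), i.e. the Flory exponent."""
    xs = [math.log(n) for n in chain_lengths]
    ys = [math.log(r) for r in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic sizes following R = 0.5 * N**0.588 (good-solvent-like scaling).
chain_lengths = [32, 64, 128, 256, 512]
sizes = [0.5 * n**0.588 for n in chain_lengths]
print(fit_flory_exponent(chain_lengths, sizes))
```

An exponent near 1/3 would indicate a collapsed globule, while a value near 0.588 indicates a swollen coil, which is how the exponent tracks the coil-to-globule transition.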
