• Title/Summary/Keyword: Sample selection model

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.771-789 / 2019
  • Text detection has been a popular research topic in the field of computer vision. Prevalent text detection algorithms find it difficult to avoid dependence on datasets. To overcome this problem, we propose a novel unsupervised text detection algorithm inspired by bootstrap learning. First, text candidates in a novel superpixel form are proposed to improve the text recall rate through image segmentation. Second, we propose a unique text sample selection model (TSSM) to extract text samples from the current image and eliminate database dependency. Specifically, to improve the precision of samples, we combine maximally stable extremal regions (MSERs) and the saliency map to generate sample reference maps with a double threshold scheme. Finally, a multiple kernel boosting method is developed to generate a strong text classifier by combining multiple single-kernel SVMs based on the samples selected by the TSSM. Experimental results demonstrate that our text detection method is robust to complex backgrounds and multilingual text and shows stable performance across different standard datasets.

Bayesian information criterion accounting for the number of covariance parameters in mixed effects models

  • Heo, Junoh;Lee, Jung Yeon;Kim, Wonkuk
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.301-311 / 2020
  • Schwarz's Bayesian information criterion (BIC), one of the most popular criteria for model selection, was derived under the assumption of independent and identically distributed data. For correlated data in longitudinal studies, Jones (Statistics in Medicine, 30, 3050-3056, 2011) modified the BIC to select the best linear mixed effects model based on the effective sample size, but did not consider the number of parameters in the covariance structure. In this paper, we propose an extension of Jones' modified BIC that accounts for covariance parameters. We conducted simulation studies under a variety of parameter configurations for linear mixed effects models. The simulations indicate that the proposed BIC performs better in model selection than Schwarz's BIC and Jones' modified BIC in most scenarios. We also illustrate the method with an example of smoking data from a longitudinal cohort of cancer patients.
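
    For orientation, Schwarz's BIC mentioned in the abstract above has the standard form BIC = -2 log L + k log n, where L is the maximized likelihood, k the number of parameters, and n the sample size. The sketch below computes it for two Gaussian models on made-up toy data. It covers only the classical criterion; neither Jones' modification nor the extension proposed in the paper is reproduced here, and the function names and data are illustrative assumptions.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, with the variance estimated by ML."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def schwarz_bic(log_likelihood, n_params, n_obs):
    """Classical Schwarz BIC: -2 log L + k log n (smaller is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Model A: mean only. Parameters: mean and variance (k = 2).
res_mean = [yi - y_bar for yi in y]
bic_mean = schwarz_bic(gaussian_log_likelihood(res_mean), 2, n)

# Model B: simple linear regression. Parameters: intercept, slope, variance (k = 3).
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
res_linear = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
bic_linear = schwarz_bic(gaussian_log_likelihood(res_linear), 3, n)

print(f"BIC (mean only): {bic_mean:.3f}")
print(f"BIC (linear):    {bic_linear:.3f}")
```

    On this toy data the linear model has the smaller BIC despite its extra parameter, because the gain in likelihood outweighs the log n penalty.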

NEW SELECTION APPROACH FOR RESOLUTION AND BASIS FUNCTIONS IN WAVELET REGRESSION

  • Park, Chun Gun
    • Korean Journal of Mathematics / v.22 no.2 / pp.289-305 / 2014
  • In this paper we propose a new approach to the variable selection problem for a primary resolution and wavelet basis functions in wavelet regression. Most wavelet shrinkage methods focus on thresholding the wavelet coefficients, given a primary resolution which is usually determined by the sample size. However, both a primary resolution and the basis functions are affected by the shape of an unknown function rather than the sample size. Unlike existing methods, our method does not depend on the sample size and also takes into account the shape of the unknown function.

Forecasting the Baltic Dry Index Using Bayesian Variable Selection (베이지안 변수선택 기법을 이용한 발틱건화물운임지수(BDI) 예측)

  • Xiang-Yu Han;Young Min Kim
    • Korea Trade Review / v.47 no.5 / pp.21-37 / 2022
  • The Baltic Dry Index (BDI) is difficult to forecast because of its high volatility and complexity. To improve BDI forecasting ability, this study applies a Bayesian variable selection method with a large number of predictors. Our estimation results, based on the BDI and all predictors from January 2000 to September 2021, indicate that the out-of-sample prediction ability of the ADL model with variable selection is superior to that of the AR model in terms of point and density forecasting. We also find that the critical predictors for the BDI change across forecast horizons. The lagged BDI is selected as a key predictor at all forecast horizons, but commodity prices, the ClarkSea Index, and interest rates carry additional information for predicting the BDI at the mid-term horizon. This implies that time variation in predictors should be considered when predicting the BDI.

An Application of Heckman Two-step Procedure to Management Accounting and Firm Effectiveness: An Empirical Study from Vietnam

  • HUYNH, Quang Linh
    • The Journal of Asian Finance, Economics and Business / v.9 no.2 / pp.347-353 / 2022
  • Using the Heckman two-step procedure, this study investigates the relationship between management accounting implementation and firm effectiveness. The research data were acquired from 450 publicly traded companies in Vietnam; however, the final sample includes only 304 responses containing useful information. Reliability analysis was used to evaluate the acquired data and to examine the qualities of the constructs and the dimensions that make them up. Then, the Heckman two-step technique was used to analyze the causal connection from the acceptance of management accounting to firm effectiveness, allowing for the effect of environmental uncertainty and organizational characteristics on the likelihood of adopting management accounting. The empirical findings show that management accounting acceptance determines firm effectiveness; however, the research model on the relationship between management accounting adoption and firm effectiveness has a sample selection bias. The main conclusion of this study is that the estimated effect of management accounting adoption on firm effectiveness differs when sample selection bias is not taken into consideration. When potential sample selection bias is taken into account by integrating environmental uncertainty and organizational characteristics into the research model, the effect of adopting management accounting on firm effectiveness becomes minor.
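
    In the textbook Heckman two-step procedure, the first step fits a probit model for selection, and the second step adds the inverse Mills ratio, λ(z) = φ(z) / Φ(z), evaluated at the fitted probit index, as an extra regressor in the outcome equation. The sketch below shows only that correction term, using the Python standard library. The exact specification used in the study above is not given in the abstract, so the function names and the example index values are illustrative assumptions.

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills_ratio(z):
    """lambda(z) = phi(z) / Phi(z), the selection-correction regressor.

    Note: for extremely negative z, Phi(z) underflows to 0.0 in floating
    point, so this simple version is only meant for moderate index values.
    """
    return normal_pdf(z) / normal_cdf(z)

# Hypothetical fitted probit index values for three selected observations.
fitted_index = [-1.0, 0.0, 1.0]
correction_terms = [inverse_mills_ratio(z) for z in fitted_index]
for z, lam in zip(fitted_index, correction_terms):
    print(f"index = {z:+.1f}  inverse Mills ratio = {lam:.4f}")
```

    The ratio shrinks as the index grows: observations that were very likely to be selected need little correction, while unlikely ones get a large correction term.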

Who Are Domestic Travel Agency Users and Who Buys Full Package Trips? A Study of Korean Outbound Travelers

  • AHN, Young-Joo;LEE, Seul Ki;AHN, Yoon-Young
    • The Journal of Asian Finance, Economics and Business / v.6 no.4 / pp.147-158 / 2019
  • The purpose of this study is to identify differences based on demographic and travel-related characteristics: first, whether travelers used a domestic travel agency, and second, whether travelers purchased a full-package travel program. The study investigates how tourists make decisions based on these two travel-pattern choices and explores the characteristics of outbound travelers from South Korea. The data are drawn from a nationwide survey in South Korea, and a total of 859 surveys were used for analysis. Due to the interdependent nature of the choices, a sample selection probit model was used to estimate simultaneously outbound tourists' use of a domestic travel agency and purchase of a full travel package. Significant determinants of domestic travel agency use are age, gender, marital status, party size, children, length of travel, and travel distance, while those of full travel package purchase are age, marital status, and travel purpose. The estimated results reflect the differing travel needs of outbound travelers. The results of this study demonstrate differences between travel-agency users and full-package travel-program consumers and identify determinants that affect the purchase of full-package travel.

Analyzing the Determinants of Online Seafood Purchasing Using Heckman's Ordered Probit Sample-Selection Model (Heckman 순서형 프로빗 모형을 이용한 소비자의 온라인 수산물 구매 결정요인 분석)

  • Heon-Dong Lee
    • The Journal of Fisheries Business Administration / v.55 no.1 / pp.37-53 / 2024
  • In the post-COVID-19 era, the food industry is rapidly reshaping its market structure toward online distribution. Rapid delivery systems driven by large distribution platforms have ushered in an era of online distribution of fresh seafood, which was previously limited. This study surveyed 1,000 consumers nationwide to determine their online seafood purchasing behaviors. The research methodology used factor analysis of consumer lifestyle and Heckman's ordered probit sample-selection model. The main results of the analysis are as follows. First, quality, freshness, selling price, product reviews from other buyers, and convenience are particularly important considerations when consumers purchase seafood online. Second, online retailers and the government must prepare measures to expand seafood consumption by considering household characteristics and consumer lifestyles. Third, the analysis indicates that consumers trust the quality and safety of seafood distributed through online platforms. It is not possible to provide purchase incentives to consumers who consider value consumption important, so improvement measures are needed. The results of this study are expected to provide implications on consumer preferences for online platforms, seafood companies, and producers, and can be used to establish future marketing strategies.

A Bayesian Test for Simple Tree Ordered Alternative using Intrinsic Priors

  • Kim, Seong W.
    • Journal of the Korean Statistical Society / v.28 no.1 / pp.73-92 / 1999
  • In Bayesian model selection or testing problems, one cannot utilize standard or default noninformative priors, since these priors are typically improper and are defined only up to arbitrary constants. The resulting Bayes factors are not well defined. A recently proposed model selection criterion, the intrinsic Bayes factor, overcomes such problems by using a part of the sample as a training sample to obtain a proper posterior and then using that posterior as the prior for the remaining observations to compute the Bayes factor. Surprisingly, such a Bayes factor can also be computed directly from the full sample using certain proper priors, namely intrinsic priors. The present paper explains how to derive intrinsic priors for simple tree ordered exponential means. Some numerical results are also provided to support the theoretical results and to compare with classical methods.

On the Model Selection Criteria in Normal Distributions

  • Chung, Han-Yeong;Lee, Kee-Won
    • Journal of the Korean Statistical Society / v.21 no.2 / pp.93-110 / 1992
  • A model selection approach is used to find out whether the mean and the variance of a unique sample are different from the pre-specified values. The normal distribution is selected as an approximating model. The Kullback-Leibler discrepancy comes out as a natural measure of discrepancy between the operating model and the approximating model. Several estimates of the selection criterion are computed, including AIC and TIC, and a couple of bootstrap estimators of the selection criterion are considered, according to the method of resampling. It is shown that a closed-form expression is available for the parametric bootstrap estimated criterion. A Monte Carlo study is provided to give a formal comparison when the operating family itself is normally distributed.
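
    When both the operating and the approximating model are normal, the Kullback-Leibler discrepancy has a well-known closed form: KL(p || q) = log(σ_q / σ_p) + (σ_p² + (μ_p − μ_q)²) / (2 σ_q²) − 1/2. The sketch below evaluates it. Treating p as the operating model and q as the approximating model with pre-specified values is an assumption made for illustration; the abstract does not state the paper's exact convention, and the function name is hypothetical.

```python
import math

def kl_normal(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for p = N(mu_p, sigma_p^2) and q = N(mu_q, sigma_q^2).

    Here p plays the role of the operating model and q the approximating
    model (an assumed convention, labeled as such).
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )

# Identical models: zero discrepancy.
print(kl_normal(0.0, 1.0, 0.0, 1.0))
# Mean shifted by one standard deviation, same variance.
print(kl_normal(1.0, 1.0, 0.0, 1.0))
# Same mean, operating model with twice the standard deviation.
print(kl_normal(0.0, 2.0, 0.0, 1.0))
```

    The discrepancy is zero only when the two models coincide, and it is not symmetric in p and q, which is why the choice of which model is "operating" matters.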

Least absolute deviation estimator based consistent model selection in regression

  • Shende, K.S.;Kashid, D.N.
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.273-293 / 2019
  • We consider the problem of model selection in multiple linear regression with outliers and non-normal error distributions. In this article, a robust model selection criterion is proposed based on the least absolute deviation (LAD) estimation method. The proposed criterion is shown to be consistent. We suggest algorithms based on the proposed criterion that are suitable for a large number of predictors in the model. These algorithms select only relevant predictor variables with probability one for large sample sizes. An exhaustive simulation study shows that the criterion performs well. The proposed criterion is also applied to a real data set to examine its applicability. The simulation results show the proficiency of the algorithms in the presence of outliers, non-normal distributions, and multicollinearity.
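
    To see why LAD estimation is robust to outliers, consider the simplest special case: an intercept-only model. There the LAD estimate, which minimizes the sum of absolute deviations, is the sample median, while least squares gives the sample mean. The sketch below contrasts the two on made-up data with one outlier. It illustrates the LAD loss only and is not the model selection criterion proposed in the paper; the data and function name are illustrative assumptions.

```python
import statistics

def lad_loss(center, data):
    """Sum of absolute deviations of the data from a candidate center."""
    return sum(abs(y - center) for y in data)

# Hypothetical sample with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

lad_estimate = statistics.median(data)  # minimizes the LAD loss
ols_estimate = statistics.mean(data)    # minimizes the squared-error loss

print(f"LAD estimate (median): {lad_estimate}")
print(f"OLS estimate (mean):   {ols_estimate}")
print(f"LAD loss at median:    {lad_loss(lad_estimate, data)}")
print(f"LAD loss at mean:      {lad_loss(ols_estimate, data)}")
```

    The single outlier drags the mean far from the bulk of the data, while the median stays put. The same insensitivity carries over to LAD regression with predictors.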