• Title/Summary/Keyword: Sparse data


A Nonparametric Goodness-of-Fit Test for Sparse Multinomial Data

  • Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society / v.14 no.2 / pp.303-311 / 2003
  • We consider the problem of testing cell probabilities in sparse multinomial data. Aerts et al. (2000) presented $T_1=\sum\limits_{i=1}^k(\hat{p}_i-p_i)^2$ as a test statistic, where $\hat{p}_i$ is the local polynomial estimator, and derived its asymptotic distribution. When the cell probabilities differ considerably in size, however, letting the difference between the estimator and the hypothetical probability contribute equally at every cell, as their statistic does, is not an appropriate measure of overall goodness of fit. We instead consider a Pearson-type goodness-of-fit statistic, $T=\sum\limits_{i=1}^k(\hat{p}_i-p_i)^2/p_i$, and show that it follows an asymptotic normal distribution.
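The Pearson-type statistic above is straightforward to compute once cell-probability estimates are available. A minimal sketch, plugging in plain probability estimates rather than the paper's local polynomial smoother:

```python
import numpy as np

def pearson_type_statistic(p_hat, p0):
    """Pearson-type goodness-of-fit statistic T = sum_i (p_hat_i - p0_i)^2 / p0_i.

    p_hat : estimated cell probabilities (the paper uses a local polynomial estimator)
    p0    : hypothesized cell probabilities
    """
    p_hat = np.asarray(p_hat, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    return float(np.sum((p_hat - p0) ** 2 / p0))

# Toy example: 5 cells, hypothesized uniform probabilities
p0 = np.full(5, 0.2)
p_hat = np.array([0.18, 0.22, 0.19, 0.21, 0.20])
T = pearson_type_statistic(p_hat, p0)
```

Dividing each squared deviation by p_i is exactly what lets cells with small hypothesized probabilities contribute on an equal footing with large ones.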

Sparse-View CT Image Recovery Using Two-Step Iterative Shrinkage-Thresholding Algorithm

  • Chae, Byung Gyu;Lee, Sooyeul
    • ETRI Journal / v.37 no.6 / pp.1251-1258 / 2015
  • We investigate an image recovery method for sparse-view computed tomography (CT) using an iterative shrinkage algorithm based on a second-order approach. The two-step iterative shrinkage-thresholding (TwIST) algorithm, which includes a total variation regularization technique, is shown to be more robust than other first-order methods; it enables perfect restoration of an original image even when given only a few projection views of a parallel-beam geometry. We find that the incoherency of a projection system matrix in CT geometry sufficiently satisfies the exact reconstruction principle even when the matrix itself has a large condition number. Image reconstruction from fan-beam CT can be carried out well, but the retrieval performance is much lower than for a parallel-beam geometry. We attribute this to the matrix complexity of the projection geometry. We also evaluate the image retrieval performance of the TwIST algorithm using measured projection data.
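The two-step update at the heart of TwIST can be sketched compactly. This is a simplified illustration, not the paper's CT pipeline: soft-thresholding stands in for the total-variation denoiser, the measurement matrix is assumed to have spectral norm at most 1, and the relaxation parameters are chosen for the toy problem rather than tuned as in the paper:

```python
import numpy as np

def soft_threshold(x, lam):
    """Soft-thresholding operator, the proximal map of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def twist(A, y, lam=0.01, alpha=1.9, beta=1.0, n_iter=200):
    """Two-step IST: x_{t+1} = (1-a)x_{t-1} + (a-b)x_t + b*denoise(gradient step).
    Soft-thresholding stands in for the paper's total-variation denoiser."""
    x_prev = np.zeros(A.shape[1])
    x = soft_threshold(A.T @ y, lam)
    for _ in range(n_iter):
        grad_step = x + A.T @ (y - A @ x)          # assumes ||A||_2 <= 1
        x_next = (1 - alpha) * x_prev + (alpha - beta) * x \
                 + beta * soft_threshold(grad_step, lam)
        x_prev, x = x, x_next
    return x

# Toy demo: recover a sparse signal from orthonormal measurements
rng = np.random.default_rng(0)
A = np.linalg.qr(rng.standard_normal((20, 20)))[0]
x_true = np.zeros(20)
x_true[[2, 7, 11]] = [1.0, -0.5, 2.0]
x_rec = twist(A, A @ x_true)
```

The two-step recursion uses both the current and previous iterates, which is what gives the method its second-order (heavy-ball-like) acceleration over plain IST.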

Pertussis Toxin Inhibits Colchicine-Induced DNA Synthesis in Human Fibroblast

  • Jang, Won-Hee;Rhee, In-Ja
    • Archives of Pharmacal Research / v.17 no.3 / pp.199-203 / 1994
  • Several lines of evidence indicate that microtubule depolymerization initiates DNA synthesis or enhances the effects of serum or purified growth factors in many types of fibroblasts. Yet little is known about the intracellular events responsible for the mitogenic effect of microtubule-disrupting agents. The effects of antitubulin agents on DNA synthesis in sparse and dense cultures, in the presence or absence of serum, and the possible involvement of G-proteins in their mitotic action were examined. In these studies, colchicine by itself appeared to be mitogenic only for confluent quiescent human lung fibroblasts. In sparse culture, however, colchicine inhibited serum-stimulated DNA synthesis. Colcemid, another antitubulin agent, showed similar growth-inhibiting and growth-stimulating effects in sparse and confluent cultures, while lumicolchicine, an inactive form of colchicine, did not. The mitogenic effect of the two antitubulin agents, colchicine and colcemid, was partially inhibited by pertussis toxin. These data suggest that microtubular integrity is associated with the expression of either negative or positive control on DNA synthesis, and that the mitogenic effect of antitubulin agents may be partially mediated by a pertussis toxin-sensitive G protein.

A Sparse Data Preprocessing Using Support Vector Regression (Support Vector Regression을 이용한 희소 데이터의 전처리)

  • Jun, Sung-Hae;Park, Jung-Eun;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.6 / pp.789-792 / 2004
  • In various fields such as web mining, bioinformatics, and statistical data analysis, missing values arise in many different forms and make training data sparse. Most commonly, missing values are replaced by predicted values based on the mean or mode. More advanced imputation methods are also available, such as the conditional mean, tree-based methods, and the Markov chain Monte Carlo algorithm. However, the predictive accuracy of general imputation models decreases as the ratio of missing values in the training data increases, and the number of usable imputation methods shrinks as the missing ratio grows. To address this problem, we propose a preprocessing method for missing values based on statistical learning theory, namely Vapnik's support vector regression. The proposed method can be applied to sparse training data. We verified the performance of our model using data sets from the UCI machine learning repository.
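The regression-based imputation idea can be sketched as follows: regress the incomplete column on the complete ones, then predict the missing entries. To keep the sketch dependency-free, kernel ridge regression (quadratic loss) stands in for Vapnik's SVR (epsilon-insensitive loss), and the remaining columns are assumed complete:

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gaussian RBF kernel matrix between two sets of rows."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def impute_column(X, col, lam=1e-2, gamma=1.0):
    """Fill NaNs in column `col` by kernel regression on the remaining columns.

    Kernel ridge regression is used here as a stand-in for SVR; the other
    columns of X are assumed to have no missing values."""
    X = X.copy()
    miss = np.isnan(X[:, col])
    feats = np.delete(X, col, axis=1)
    # Fit on complete rows, predict on incomplete ones
    K = rbf_kernel(feats[~miss], feats[~miss], gamma)
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), X[~miss, col])
    X[miss, col] = rbf_kernel(feats[miss], feats[~miss], gamma) @ alpha
    return X

# Toy demo: y = sin(x) with two entries knocked out
x = np.linspace(0.0, 3.0, 30)
data = np.column_stack([x, np.sin(x)])
data[[5, 20], 1] = np.nan
filled = impute_column(data, col=1)
```

Unlike mean or mode replacement, this uses the relationship between columns, which is what lets the method degrade more gracefully as the missing ratio grows.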

Feature selection for text data via sparse principal component analysis (희소주성분분석을 이용한 텍스트데이터의 단어선택)

  • Won Son
    • The Korean Journal of Applied Statistics / v.36 no.6 / pp.501-514 / 2023
  • When analyzing high dimensional data such as text data, if we input all the variables as explanatory variables, statistical learning procedures may suffer from over-fitting problems. Furthermore, computational efficiency can deteriorate with a large number of variables. Dimensionality reduction techniques such as feature selection or feature extraction are useful for dealing with these problems. The sparse principal component analysis (SPCA) is one of the regularized least squares methods which employs an elastic net-type objective function. The SPCA can be used to remove insignificant principal components and identify important variables from noisy observations. In this study, we propose a dimension reduction procedure for text data based on the SPCA. Applying the proposed procedure to real data, we find that the reduced feature set maintains sufficient information in text data while the size of the feature set is reduced by removing redundant variables. As a result, the proposed procedure can improve classification accuracy and computational efficiency, especially for some classifiers such as the k-nearest neighbors algorithm.
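The word-selection step can be illustrated with a first sparse principal component: words whose loadings are driven to zero are dropped. This sketch uses soft-thresholded power iteration as a lightweight stand-in for the elastic-net SPCA formulation the paper builds on; the document-term matrix and threshold level are toy assumptions:

```python
import numpy as np

def sparse_pc(X, lam=0.15, n_iter=200):
    """First sparse principal component via soft-thresholded power iteration.
    Returns the loading vector; zero entries mark words to drop."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc)                        # word covariance matrix
    v = np.ones(C.shape[0]) / np.sqrt(C.shape[0])  # uniform starting vector
    for _ in range(n_iter):
        v = C @ v
        # Threshold small loadings relative to the largest one, then renormalize
        v = np.sign(v) * np.maximum(np.abs(v) - lam * np.abs(v).max(), 0.0)
        norm = np.linalg.norm(v)
        if norm == 0.0:
            break
        v /= norm
    return v

# Toy demo: 6 "words", only words 0 and 1 share a strong common signal
rng = np.random.default_rng(1)
signal = 3.0 * rng.standard_normal(200)
X = rng.standard_normal((200, 6))
X[:, 0] += signal
X[:, 1] += signal
selected = np.flatnonzero(sparse_pc(X))
```

The nonzero-loading words form the reduced feature set; the rest are treated as redundant, mirroring the paper's use of sparsity to discard uninformative terms.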

On Adaptation to Sparse Design in Bivariate Local Linear Regression

  • Hall, Peter;Seifert, Burkhardt;Turlach, Berwin A.
    • Journal of the Korean Statistical Society / v.30 no.2 / pp.231-246 / 2001
  • Local linear smoothing enjoys several excellent theoretical and numerical properties, and in a range of applications it is the method most frequently chosen for fitting curves to noisy data. Nevertheless, it suffers numerical problems in places where the distribution of design points (often called predictors, or explanatory variables) is sparse. In the case of univariate design, several remedies have been proposed for overcoming this problem, one of which involves adding additional "pseudo" design points in places where the original design points are too widely separated. This approach is particularly well suited to treating sparse bivariate design problems, and in fact attractive, elegant geometric analogues of univariate imputation and interpolation rules are appropriate for that case. In the present paper we introduce and develop pseudo-data rules for bivariate design, and apply them to real data.
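The univariate version of the pseudo-data idea can be sketched directly: interpolate extra design points into wide gaps so the local linear system stays well-conditioned there. The gap rule and bandwidth below are toy choices, not the paper's (bivariate, geometric) construction:

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear estimate at x0 with a Gaussian kernel of bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]                     # intercept = fitted value at x0

def add_pseudo_points(x, y, gap):
    """Insert linearly interpolated pseudo design points into gaps wider
    than `gap` (the univariate rule the paper extends to bivariate design)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    new_x, new_y = list(xs), list(ys)
    for a, b, ya, yb in zip(xs[:-1], xs[1:], ys[:-1], ys[1:]):
        k = int((b - a) // gap)        # number of pseudo points for this gap
        for j in range(1, k + 1):
            t = j / (k + 1)
            new_x.append(a + t * (b - a))
            new_y.append(ya + t * (yb - ya))
    idx = np.argsort(new_x)
    return np.array(new_x)[idx], np.array(new_y)[idx]

# Toy demo: a wide gap between x = 0.2 and x = 2.0, exactly linear response
x = np.array([0.0, 0.1, 0.2, 2.0, 2.1])
y = x.copy()
px, py = add_pseudo_points(x, y, gap=0.5)
est = local_linear(1.1, px, py, h=0.3)
```

Without the pseudo points, the weighted design matrix at x0 = 1.1 is nearly singular because almost no mass falls inside the kernel window; with them, the estimate is stable (and exact here, since local linear smoothing reproduces linear functions).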

On Speaker Adaptations with Sparse Training Data for Improved Speaker Verification

  • Ahn, Sung-Joo;Kang, Sun-Mee;Ko, Han-Seok
    • Speech Sciences / v.7 no.1 / pp.31-37 / 2000
  • This paper concerns effective speaker adaptation methods for solving the over-training problem in speaker verification, which frequently occurs when modeling a speaker with sparse training data. While various speaker adaptation techniques have already been applied to speech recognition, they have not yet been formally considered in speaker verification. This paper proposes speaker adaptation methods that combine MAP and MLLR adaptation, both used successfully in speech recognition, and applies them to speaker verification. Experimental results show that a speaker verification system using weighted MAP and MLLR adaptation outperforms conventional speaker models without adaptation by up to a factor of five. These results show that speaker adaptation achieves significantly better performance even when only sparse training data are available for speaker verification.
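The MAP half of the combination can be sketched with the relevance-factor mean update common in GMM-based speaker modeling. This is only an illustration of the MAP step: the MLLR transform and the paper's specific weighting between the two adaptations are omitted, and the relevance factor tau is an assumed value:

```python
import numpy as np

def map_adapt_means(means, frames, responsibilities, tau=16.0):
    """MAP adaptation of Gaussian mixture means (relevance-factor form).

    means            : (M, D) prior (background-model) means
    frames           : (N, D) adaptation feature vectors
    responsibilities : (N, M) posterior component occupancies
    tau              : relevance factor; larger tau trusts the prior more
    """
    n_m = responsibilities.sum(axis=0)                    # soft counts per component
    ex = responsibilities.T @ frames / np.maximum(n_m, 1e-12)[:, None]
    w = n_m / (n_m + tau)                                 # data weight grows with counts
    return w[:, None] * ex + (1 - w)[:, None] * means

# Toy demo: one component, 16 identical adaptation frames
ubm_means = np.array([[0.0, 0.0]])
frames = np.ones((16, 2))
resp = np.ones((16, 1))
adapted = map_adapt_means(ubm_means, frames, resp, tau=16.0)
```

The interpolation weight w tends to 0 when a component sees almost no adaptation data, which is exactly what protects sparse-data speaker models from over-training.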

Sparse Kernel Regression using IRWLS Procedure

  • Park, Hye-Jung
    • Journal of the Korean Data and Information Science Society / v.18 no.3 / pp.735-744 / 2007
  • The support vector machine (SVM) is capable of providing a complete description of the linear and nonlinear relationships among random variables. In this paper we propose a sparse kernel regression (SKR) to overcome a weak point of the SVM, namely the steep growth in the number of support vectors as the number of training data increases. The iterative reweighted least squares (IRWLS) procedure is used to solve the optimization problem of SKR with a Laplacian prior. Furthermore, the generalized cross-validation (GCV) function is introduced to select the hyperparameters that affect the performance of SKR. Experimental results are then presented which illustrate the performance of the proposed procedure.
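The IRWLS idea for a Laplacian (L1) prior can be sketched as follows: at each step the L1 penalty is approximated by a quadratic reweighted by the current coefficient magnitudes, so each iteration is a ridge-type solve. This is a generic sketch of that scheme, not the paper's exact formulation, and the penalty level is fixed rather than selected by GCV as the paper does:

```python
import numpy as np

def sparse_kernel_regression(K, y, lam=0.1, n_iter=50, eps=1e-8):
    """IRWLS for kernel regression with a Laplacian (L1) prior on coefficients:
    minimize ||y - K b||^2 + lam * sum_i |b_i|.
    Each step solves a ridge system reweighted by 1/|b_i| from the last step."""
    beta = np.linalg.solve(K.T @ K + lam * np.eye(len(y)), K.T @ y)
    for _ in range(n_iter):
        W = np.diag(lam / (np.abs(beta) + eps))    # large weight -> coefficient pinned to 0
        beta = np.linalg.solve(K.T @ K + W, K.T @ y)
    beta[np.abs(beta) < 1e-6] = 0.0                # prune numerically zero coefficients
    return beta

# Toy demo: Gaussian kernel regression of sin(x)
x = np.linspace(0.0, np.pi, 30)
y = np.sin(x)
K = np.exp(-(x[:, None] - x[None, :]) ** 2)
beta = sparse_kernel_regression(K, y, lam=0.1)
rmse = float(np.sqrt(np.mean((K @ beta - y) ** 2)))
```

Coefficients driven toward zero receive ever larger weights and collapse, which is how the Laplacian prior keeps the number of effective "support vectors" from growing with the training set.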

Geostatistical Integration of Different Sources of Elevation and its Effect on Landslide Hazard Mapping

  • Park, No-Wook;Kyriakidis, Phaedon C.
    • Korean Journal of Remote Sensing / v.24 no.5 / pp.453-462 / 2008
  • The objective of this paper is to compare the prediction performances of different landslide hazard maps based on topographic data stemming from different sources of elevation. The geostatistical framework of kriging, which can properly integrate spatial data with different accuracy, is applied for generating more reliable elevation estimates from both sparse elevation spot heights and exhaustive ASTER-based elevation values. A case study from Boeun, Korea illustrates that the integration of elevation and slope maps derived from different data yielded different prediction performances for landslide hazard mapping. The landslide hazard map constructed by using the elevation and the associated slope maps based on geostatistical integration of spot heights and ASTER-based elevation resulted in the best prediction performance. Landslide hazard mapping using elevation and slope maps derived from the interpolation of only sparse spot heights showed the worst prediction performance.
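The kriging estimation step for sparse point data such as elevation spot heights can be sketched in one dimension. This is only the single-source ordinary kriging core, not the paper's multi-source integration of spot heights with ASTER elevation (which requires a multivariate geostatistical formulation), and the exponential variogram is an assumed model:

```python
import numpy as np

def ordinary_kriging(x_obs, z_obs, x_new, variogram=lambda h: 1.0 - np.exp(-h)):
    """Ordinary kriging of sparse 1-D point data with an assumed variogram.

    Solves, for each prediction site, the system
        sum_j w_j * gamma(x_i, x_j) + mu = gamma(x_i, x0),   sum_j w_j = 1,
    and returns the weighted averages w @ z_obs."""
    n = len(x_obs)
    H = np.abs(x_obs[:, None] - x_obs[None, :])
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(H)           # variogram between observation pairs
    A[n, n] = 0.0                      # Lagrange-multiplier corner
    preds = []
    for x0 in np.atleast_1d(x_new):
        b = np.append(variogram(np.abs(x_obs - x0)), 1.0)
        w = np.linalg.solve(A, b)[:n]  # kriging weights (sum to 1)
        preds.append(w @ z_obs)
    return np.array(preds)

# Toy demo: four elevation spot heights, predict at two locations
x_obs = np.array([0.0, 1.0, 2.5, 4.0])
z_obs = np.array([10.0, 12.0, 9.0, 11.0])
z_hat = ordinary_kriging(x_obs, z_obs, [1.0, 2.0])
```

Because the variogram is zero at lag zero, the predictor honors the spot heights exactly at their locations; the weights, not a fixed stencil, carry the spatial-accuracy information that the paper exploits when combining data sources of different quality.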