Search | Korea Science

Kim, Sung-Chul;Park, Ji-Yeon
- The Korean Journal of Applied Statistics, v.22 no.3, pp.443-461, 2009
To overcome the lack of Korean credit rating migration data, we consider an empirical Bayes procedure for estimating credit rating migration matrices. We derive the posterior probabilities of Korean credit rating transitions by using Moody's rating migration data as prior information and the credit rating assignments from a Korean rating agency as the likelihood. Metrics based on the average transition probability are developed to characterize the migration matrices and to compare our Bayesian migration matrices with some given matrices. Time series of the metrics show that our Bayesian matrices are stable, while the matrices based on Korean data vary widely over time. The bootstrap tests demonstrate that the results from the three estimation methods are significantly different and that the Bayesian matrices are more affected by the Korean data than by the Moody's data. Finally, Monte Carlo simulations that compute the values of a portfolio and its credit VaRs are performed to compare these migration matrices.
https://doi.org/10.5351/KJAS.2009.22.3.443
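
The abstract above does not state the exact form of the prior, so the following is only a minimal sketch of one common way to combine prior transition probabilities with observed transition counts: a conjugate Dirichlet-multinomial update applied row by row. The function names, the `prior_strength` parameter, and the toy numbers are all hypothetical illustrations, not the authors' procedure or data.

```python
def posterior_row(prior_probs, prior_strength, counts):
    """Posterior mean of one migration-matrix row.

    Assumes a Dirichlet prior whose parameters are
    prior_strength * prior_probs (hypothetical choice) and a
    multinomial likelihood for the observed transition counts.
    """
    alpha = [prior_strength * p for p in prior_probs]
    total = sum(alpha) + sum(counts)
    return [(a + c) / total for a, c in zip(alpha, counts)]


def posterior_matrix(prior_matrix, prior_strength, count_matrix):
    """Apply the row-wise update to every rating class."""
    return [
        posterior_row(p_row, prior_strength, c_row)
        for p_row, c_row in zip(prior_matrix, count_matrix)
    ]


# Toy two-state example (made-up numbers, not real rating data).
prior = [[0.9, 0.1], [0.2, 0.8]]   # stands in for an external prior matrix
counts = [[8, 2], [1, 9]]          # stands in for locally observed transitions
posterior = posterior_matrix(prior, prior_strength=10.0, count_matrix=counts)
```

Under this assumption, a larger `prior_strength` pulls each row toward the prior matrix, while more observed transitions pull it toward the local data.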

Kim, Joungyoun;Park, Min-Jeong
- The Korean Journal of Applied Statistics, v.32 no.1, pp.83-97, 2019
As society develops, the dissemination of microdata has increased in response to the diverse analytical needs of users. Analysis of microdata for policy making, academic purposes, and similar uses is highly desirable in terms of value creation. However, providing microdata that remain useful carries a risk of exposing personal information. Several methods have been considered to protect personal information while preserving the usefulness of the data. One such method is to generate and use synthetic data. This paper aims to explain synthetic data by exploring the methodologies and precautions related to it. To this end, we first explain multiple imputation, the Bayesian predictive model, and the Bayesian bootstrap, which are the basic foundations of synthetic data. We then link these concepts to the construction of fully and partially synthetic data. To illustrate the creation of synthetic data, we review a real longitudinal synthetic data example based on sequential regression multivariate imputation.
https://doi.org/10.5351/KJAS.2019.32.1.083

김동욱;노영화
- The Korean Journal of Applied Statistics, v.16 no.2, pp.407-426, 2003
We consider the missing covariates problem in generalized estimating equations (GEE) models. If a covariate is partially missing, the GEE cannot be calculated. In this paper, we study the performance of 7 imputation methods for handling missing covariates in GEE models, and we investigate the properties of the GEE estimators after the missing covariates are imputed for ordinal repeated-measures data. The 7 imputation methods are i) Naive Deletion, ii) Sample Average Imputation, iii) Row Average Imputation, iv) Cross-wave Regression Imputation, v) Carry-over Imputation, vi) Bayesian Bootstrap, and vii) Approximate Bayesian Bootstrap. A Monte Carlo simulation is used to compare the performance of these methods. For the mechanism generating the missing data, we assume ignorable nonresponse. Furthermore, we generate missing covariates with or without considering wave nonresponse patterns.
https://doi.org/10.5351/KJAS.2003.16.2.407
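
For orientation, methods vi) and vii) in the abstract above can be sketched for a single variable as follows. This is a minimal illustration of the textbook Bayesian bootstrap and approximate Bayesian bootstrap for imputation, not the paper's implementation. The function names and toy values are hypothetical, and covariates, waves, and the GEE fit itself are left out.

```python
import random


def bayesian_bootstrap_impute(observed, n_missing, rng):
    """Impute by drawing donors with Dirichlet(1, ..., 1) weights.

    The Dirichlet weights are obtained by normalizing independent
    Gamma(1, 1) draws, one per observed value.
    """
    gammas = [rng.gammavariate(1.0, 1.0) for _ in observed]
    total = sum(gammas)
    weights = [g / total for g in gammas]
    return rng.choices(observed, weights=weights, k=n_missing)


def approximate_bayesian_bootstrap_impute(observed, n_missing, rng):
    """Impute by resampling twice with replacement.

    First resample the observed values to form a donor pool of the
    same size, then draw the imputed values from that pool.
    """
    donors = rng.choices(observed, k=len(observed))
    return rng.choices(donors, k=n_missing)


# Toy usage with made-up ordinal scores (hypothetical data).
rng = random.Random(0)
observed_scores = [1, 2, 2, 3, 4]
bb_values = bayesian_bootstrap_impute(observed_scores, 3, rng)
abb_values = approximate_bayesian_bootstrap_impute(observed_scores, 3, rng)
```

Both functions return only values that were actually observed. The extra random step in each (the Dirichlet weights or the first resample) is what carries the uncertainty about the distribution of the observed data into the imputations.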

Choe, Jun-Hyeok;Jeon, Seong-Hae;Lee, Jeong-Hyeon
- The Transactions of the Korea Information Processing Society, v.7 no.7, pp.2108-2115, 2000
Although conventional Boolean retrieval systems based on the vector space model can return retrieval results quickly, they cannot exactly reflect the user's retrieval purpose, including semantic information. Consequently, the retrieval results are very different from what users expect, and users are forced to spend considerable time finding the expected documents among the retrieved ones. In this paper, we design a Bayesian SOM (Self-Organizing feature Map) that combines a Bayesian statistical method with the Kohonen network, a kind of unsupervised learning, and use it to classify documents in real time according to their semantic similarity to the user query. If it is difficult to observe statistical characteristics because there are fewer than 30 documents for clustering, the number of documents must be increased to at least 50. In addition, to give a high rank to the documents that are semantically most similar to the user query among the generalized clusters, we compute the similarity by means of the Kohonen centroid of each document class and adjust the secondary rank according to that similarity.