Search | Korea Science

Kang, Hyunseok;Kim, Honggie
- The Korean Journal of Applied Statistics, v.34 no.6, pp.889-904, 2021
This study follows up on Kang and Kim (2020). We derive the sample influence functions of the t-statistic, which previous research did not derive directly. Using these results, we examine mathematically the relationship between the empirical influence function and the sample influence function, and consider a method for approximating the sample influence function by the empirical influence function. We also verify the relationship between the approximated sample influence function and the empirical influence function by simulation, using a random sample of size 300 from a normal distribution. The simulation confirmed both the relationship between the sample influence function derived from the t-statistic and the empirical influence function, and the method of approximating the sample influence function through the empirical influence function. The significance of this research lies in proposing both a method that reduces errors in the approximation of the empirical influence function and an effective, practical method that, through a constant correction, improves on previous research approximating the sample influence function directly by the empirical influence function.
https://doi.org/10.5351/KJAS.2021.34.6.889
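
The comparison described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the t-statistic is treated as the plug-in functional (mean - mu0) / sd without the sqrt(n) factor, the sample influence function is defined by adding one observation, and the empirical influence function is the standard influence function formula evaluated at the sample. All function names are hypothetical, and the paper's own definitions and constant correction may differ.

```python
import random
from statistics import fmean, pstdev


def t_functional(xs, mu0=0.0):
    """Plug-in t-type functional (mean - mu0) / sd, with divisor-n sd (assumed form)."""
    return (fmean(xs) - mu0) / pstdev(xs)


def empirical_influence(x_new, sample, mu0=0.0):
    """Influence function of (mu - mu0) / sigma, evaluated at the sample moments."""
    m, s = fmean(sample), pstdev(sample)
    d = x_new - m
    return d / s - (m - mu0) * (d * d - s * s) / (2 * s ** 3)


def sample_influence(x_new, sample, mu0=0.0):
    """Add-one sample influence: (n + 1) * (T(sample + [x_new]) - T(sample))."""
    n = len(sample)
    return (n + 1) * (t_functional(sample + [x_new], mu0) - t_functional(sample, mu0))


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(300)]  # simulated N(0, 1) sample, n = 300

grid = [-2.0 + 0.5 * i for i in range(9)]  # evaluation points from -2 to 2
eif = [empirical_influence(x, sample) for x in grid]
sif = [sample_influence(x, sample) for x in grid]

# Largest absolute difference between the two curves on the grid.
max_gap = max(abs(e - s) for e, s in zip(eif, sif))
print(f"max |EIF - SIF| on grid: {max_gap:.4f}")
```

Under these assumptions the two curves differ by a term of order 1/n, which is why the empirical influence function can serve as an approximation to the sample influence function for a sample of this size.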

Kim, Byeong-Su;Lee, Seon-Ho;Kim, In-Yeong;Kim, Sang-Cheol;Ra, Seon-Yeong;Jeong, Hyeon-Cheol
- Proceedings of the Korean Statistical Society Conference, 2003.05a, pp.295-297, 2003
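In this discussion, we use Hotelling's T-squared statistic, a multivariate analysis method, to identify significant gene sets in cDNA microarray analysis, and we evaluate how much more efficient this is than an approach based on the univariate t-statistic when the identified gene sets are used to classify test data into two groups.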
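
For reference, the standard two-sample form of Hotelling's T-squared statistic is given below. The abstract does not say which form the authors used, so the two-sample pooled-covariance version and the notation ($n_1$, $n_2$ for group sizes, $\bar{x}_1$, $\bar{x}_2$ for the mean expression vectors of a gene set, $S_1$, $S_2$ for the group covariance matrices) are assumptions made for illustration.

$$
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{x}_1 - \bar{x}_2)^{\top} S_p^{-1} (\bar{x}_1 - \bar{x}_2),
\qquad
S_p = \frac{(n_1 - 1) S_1 + (n_2 - 1) S_2}{n_1 + n_2 - 2}
$$

When the gene set contains a single gene, $T^2$ reduces to the square of the pooled two-sample t-statistic, which is the univariate baseline the abstract compares against.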

Yang Kyung-Sook;Kim HeeYoung
- The Korean Journal of Applied Statistics, v.18 no.3, pp.583-595, 2005
Contingency tables are used to compare counts of n-grams to determine whether an n-gram is a true collocation, meaning that the words that make up the n-gram are highly associated in the text. Several statistical measures are used to identify collocations, including the Kulczynski coefficient, Ochiai coefficient, Fager and McGowan coefficient, Yule coefficient, mutual information, and chi-square. The main problem is that these measures are based on the assumption of a normal or approximately normal distribution of the variables being sampled. While this assumption is valid in most instances, it is not valid when comparing the rates of occurrence of rare events, and texts are composed mostly of rare events. In this paper we briefly review some statistics for testing the association of two words. We propose some randomization tests to evaluate the significance level when analyzing collocation in large corpora. A related graph can be used to compare different test statistics that can be used to analyze the same contingency table.
https://doi.org/10.5351/KJAS.2005.18.3.583
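
One way a randomization test of this kind might look is sketched below. This is not the authors' procedure: it assumes a simple permutation scheme in which the second words of all bigrams are shuffled against the first words, with the 2x2 chi-square statistic as the test statistic. The function names and the toy bigram list are invented purely for illustration.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def bigram_table(firsts, seconds, w1, w2):
    """Count bigrams by whether the first word is w1 and the second word is w2."""
    a = b = c = d = 0
    for f, s in zip(firsts, seconds):
        if f == w1 and s == w2:
            a += 1
        elif f == w1:
            b += 1
        elif s == w2:
            c += 1
        else:
            d += 1
    return a, b, c, d


def randomization_p_value(bigrams, w1, w2, n_perm=2000, seed=0):
    """Permutation p-value: shuffle second words, keeping both word margins fixed."""
    rng = random.Random(seed)
    firsts = [f for f, _ in bigrams]
    seconds = [s for _, s in bigrams]
    observed = chi_square_2x2(*bigram_table(firsts, seconds, w1, w2))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(seconds)
        if chi_square_2x2(*bigram_table(firsts, seconds, w1, w2)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy corpus of 40 bigrams (illustrative only, not real data).
bigrams = (
    [("strong", "tea")] * 8
    + [("strong", "wind")] * 2
    + [("hot", "tea")] * 2
    + [("cold", "wind")] * 28
)
p = randomization_p_value(bigrams, "strong", "tea")
print(f"randomization p-value: {p:.4f}")
```

Because the reference distribution is built by reshuffling the observed words, the p-value does not depend on a normal approximation, which is the concern the abstract raises for rare events.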

Choi, Soo-Ho
- The Journal of the Korea Contents Association, v.18 no.11, pp.285-297, 2018
The purpose of this study is to examine the trends in Korea's imports and exports by continent and to find ways to increase Korea's exports in the future. The continents selected were Asia, Europe, North America, Central and South America, and the Middle East. The analysis period was 220 months, from January 2000 to April 2018, and data were collected from the KCS. Regression analysis showed that the coefficient was higher in the order of Asia, Europe, North America, the Middle East, and Latin America. The markets of each continent moved independently of each other, and the results were statistically significant by t-statistic and p-value ($\leq 0.01$). The results show that Asia and North America have been Korea's major export markets, while Europe, the Middle East, and Central and South America are emerging as new markets for Korea. To increase Korea's exports in the future, continued attention to Asian markets, including China and Southeast Asia, is needed.
https://doi.org/10.5392/JKCA.2018.18.11.285

Kim, Dong-Il;Park, Cheong-Sool;Baek, Jun-Geol;Kim, Sung-Shick
- Journal of the Korea Society for Simulation, v.18 no.4, pp.137-148, 2009
The purpose of this study is to implement a variable selection algorithm that helps construct a reliable linear regression model. If all candidate variables are used to construct a linear regression model, the significance of the model decreases and the 'Curse of Dimensionality' arises. Moreover, if the number of observations is smaller than the number of variables (the dimension), the regression model cannot be constructed. Because of these problems, we treat variable selection as a combinatorial optimization problem and apply a GA (Genetic Algorithm) to it. Typical measures of statistical significance are $R^2$, the F-value of the regression model, the t-values of the regression coefficients, and the standard error of the estimates. We design the GA to handle multiple objective functions, because the statistical significance of a model cannot be assessed by a single measure. We perform experiments using simulation data designed to cover various kinds of situations. The results show better performance than LARS (Least Angle Regression), an algorithm for solving variable selection problems. We also modify the algorithm to solve the portfolio selection problem, in which a portfolio is constructed by selecting stocks. We conclude that the algorithm is able to solve real problems.
https://doi.org/10.9709/JKSS.2009.18.4.137