References
- Bak, J. Y. and Oh, A. (2015). Five centuries of monarchy in Korea: mining the text of the annals of the Joseon dynasty. In Proceedings of the 9th SIGHUM workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities at the 53rd Annual Meeting of the Association for Computational Linguistics, 10-14.
- Clauset, A., Shalizi, C. R., and Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51, 661-703. https://doi.org/10.1137/070710111
- Clauset, A. and Woodard, R. (2013). Estimating the historical and future probabilities of large terrorist events. Annals of Applied Statistics, 7, 1838-1865. https://doi.org/10.1214/12-AOAS614
- Fan, J., Han, F., and Liu, H. (2014). Challenges of big data analysis. National Science Review, 1, 293-314. https://doi.org/10.1093/nsr/nwt032
- Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360. https://doi.org/10.1198/016214501753382273
- Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space (with discussion). Journal of the Royal Statistical Society Series B, 70, 849-911. https://doi.org/10.1111/j.1467-9868.2008.00674.x
- Franke, B., Plante, J.-F., Roscher, R., Lee, E.-S. A., Smyth, C., Hatefi, A., Chen, F., Gil, E., Schwing, A., Selvitella, A., Hoffman, M. M., Grosse, R., Hendricks, D., and Reid, N. (2016). Statistical inference, learning and models in big data. International Statistical Review, To appear.
- Kim, D., Son, S.-W., and Jeong, H. (2014). Large-scale quantitative analysis of painting arts. Scientific Reports, 4, 7370. https://doi.org/10.1038/srep07370
- Ko, S. and Won, J.-H. (2016). Processing large-scale data with Apache Spark. The Korean Journal of Applied Statistics, To appear.
- Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics: Theory and Methods, 26, 1481-1496. https://doi.org/10.1080/03610929708831995
- Lee, B., Kim, D., Kim, D., and Jeong, H. (2016). N-gram web service and stylometric analysis of Korean historical documents. New Physics: Sae Mulli, 66, 502-510. https://doi.org/10.3938/NPSM.66.502
- Park, J.-H., Lee, S.-Y., Kang, D. H., and Won, J.-H. (2013). Hadoop and Mapreduce. Journal of the Korean Data & Information Science Society, 24, 1013-1027. https://doi.org/10.7465/jkdi.2013.24.5.1013
- Sawchik, T. (2015). Big Data Baseball: Math, Miracles, and the End of a 20-Year Losing Streak. Flatiron Books.
- Sievert, C. (2015). Tools for harnessing 'MLBAM', 'Gameday' data and visualizing 'pitchfx'. http://cpsievert.github.com/pitchRx
- Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267-288.
- van de Geer, S., Bühlmann, P., Ritov, Y. A., and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Annals of Statistics, 42, 1166-1202. https://doi.org/10.1214/14-AOS1221
- Witten, R. and Candes, E. (2013). Randomized algorithms for low-rank matrix factorizations: sharp performance bounds. Algorithmica, 63, 355-363.
- Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 38, 894-942. https://doi.org/10.1214/09-AOS729