http://dx.doi.org/10.5351/KJAS.2016.29.6.1061

A study on high dimensional large-scale data visualization  

Lee, Eun-Kyung (Department of Statistics, Ewha Womans University)
Hwang, Nayoung (Department of Statistics, Ewha Womans University)
Lee, Yoondong (Sogang Business School, Sogang University)
Publication Information
The Korean Journal of Applied Statistics / v.29, no.6, 2016, pp. 1061-1075
Abstract
In this paper, we discuss methods for visualizing high-dimensional, large-scale data and review issues that arise when visualizing data of this type. High-dimensional data can be displayed in a two-dimensional space using a few selected important variables. Additional variables can be shown by mapping them to aesthetic attributes of the graphic, or the projection pursuit method can be used to find an interesting low-dimensional view. For large-scale data, we discuss jittering and alpha blending, which alleviate the problem of overlapping points. We also review the R packages tabplot and scagnostics, as well as other R packages for building interactive web applications with visualization.
Keywords
high dimensional large-scale data; data visualization; projection pursuit; alpha blending; aesthetic mapping
References
1 Tennekes, M., Gombin, J., Jeworutzki, S., Russell, K., and Zijdeman, R. (2016). tmap: Thematic Maps in R; R package version 1.4-1, URL https://cran.r-project.org/web/packages/tmap/index.html.
2 Vaidyanathan, R., Xie, Y., Allaire, J., Cheng, J., and Russell, K. (2016). htmlwidgets: HTML Widgets for R; R package version 0.7, URL https://cran.r-project.org/web/packages/htmlwidgets/index.html.
3 Wickham, H. (2009). ggplot2: elegant graphics for data analysis, Springer Science and Business Media.
4 Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, 19, 3-28.
5 Wickham, H., Cook, D., Hofmann, H., and Buja, A. (2011). tourr: An R package for exploring multivariate data with projections. Journal of Statistical Software, 40, 1-18.
6 Wilkinson, L. (2006). The grammar of graphics, Springer Science and Business Media.
7 Wilkinson, L., Anand, A., and Grossman, R. (2006). High-dimensional visual analytics: Interactive exploration guided by pairwise views of point distributions. IEEE Transactions on Visualization and Computer Graphics, 12, 1363-1372.
8 Allaire, J., Cheng, J., Xie, Y., McPherson, J., Chang, W., Allen, J., Wickham, H., Atkins, A., and Hyndman, R. (2016a). rmarkdown: Dynamic Documents for R; R package version 1.0, URL https://cran.r-project.org/web/packages/rmarkdown/index.html.
9 Allaire, J., Keen, I., Almsaeed, A., Mosbech, J., Bossart, N., Verou, L., Baranovskiy, D., Labs, S., Djuricic, B., Sardyha, T., Lewis, B., Sievert, C., Kunst, J., Hafen, R., Rudis, B., and Cheng, J. (2016b). flexdashboard: R Markdown Format for Flexible Dashboards; R package version 0.3, URL https://cran.r-project.org/web/packages/flexdashboard/index.html.
10 Chang, W., Cheng, J., Allaire, J., Xie, Y., and McPherson, J. (2016). shiny: Web Application Framework for R; R package version 0.13.2, URL https://cran.r-project.org/web/packages/shiny/index.html.
11 Cook, D., Buja, A., Cabrera, J., and Hurley, C. (1995). Grand tour and projection pursuit. Journal of Computational and Graphical Statistics, 4, 155-172.
12 Chang, W. and Wickham, H. (2016). ggvis: Interactive Grammar of Graphics; R package version 0.4.3, URL https://cran.r-project.org/web/packages/ggvis/index.html.
13 Cheng, J., Xie, Y., Wickham, H., Agafonkin, V., et al. (2016). leaflet: Create Interactive Web Maps with the JavaScript 'Leaflet' Library; R package version 1.0.1, URL https://cran.r-project.org/web/packages/leaflet/index.html.
14 Cook, D., Buja, A., and Cabrera, J. (1993). Projection pursuit indexes based on orthogonal function expansions. Journal of Computational and Graphical Statistics, 2, 225-250.
15 Cook, D. and Swayne, D. F. (2007). Interactive and dynamic graphics for data analysis: with R and GGobi, Springer Science and Business Media.
16 Friedman, J. H. (1987). Exploratory projection pursuit. Journal of the American Statistical Association, 82, 249-266.
17 Gebhardt, A. (2015). ash: David Scott's ASH Routines; R package version 1.0-15, URL https://cran.r-project.org/web/packages/ash/index.html.
18 Gesmann, M., de Castillo, D., and Gesmann, M. M. (2016). googleVis: R Interface to Google Charts; R package version 0.6.0, URL https://cran.r-project.org/web/packages/googleVis/index.html.
19 Graul, C. and Graul, M. C. (2016). leafletR: Interactive Web-Maps Based on the Leaflet JavaScript Library; R package version 0.4-0, URL https://cran.r-project.org/web/packages/leafletR/index.html.
20 Hall, P. (1989). On polynomial-based projection indices for exploratory projection pursuit. The Annals of Statistics, 17, 589-605.
21 Jones, M. C. and Sibson, R. (1987). What is projection pursuit? (with discussion). Journal of the Royal Statistical Society, Series A, 150, 1-36.
22 Huber, P. J. (1985). Projection pursuit (with discussion). The Annals of Statistics, 13, 435-525.
23 Hyndman, R. J. (2016). dygraphs: Interface to 'Dygraphs' Interactive Time Series Charting Library; R package version 1.1.1-1, URL https://cran.r-project.org/web/packages/dygraphs/index.html.
24 Ihaka, R. and Gentleman, R. (1996). R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5, 299-314.
25 Lamstein, A. and Johnson, B. P. C. (2016). choroplethr: Simplify the Creation of Choropleth Maps in R; R package version 3.5.2, URL https://cran.r-project.org/web/packages/choroplethr/index.html.
26 Lawrence, M., Wickham, H., Cook, D., Hofmann, H., and Swayne, D. F. (2009). Extending the GGobi pipeline from R. Computational Statistics, 24, 195-205.
27 Lee, E. K. and Cook, D. (2010). A projection pursuit index for large p small n data. Statistics and Computing, 20, 381-392.
28 Lee, E. K., Cook, D., Klinke, S., and Lumley, T. (2005). Projection pursuit for exploratory supervised classification. Journal of Computational and Graphical Statistics, 14, 831-846.
29 Scott, D. W. (1985). Averaged shifted histograms: effective nonparametric density estimators in several dimensions. The Annals of Statistics, 13, 1024-1040.
30 Sievert, C., Parmer, C., Hocking, T., Chamberlain, S., Ram, K., Corvellec, M., and Despouy, P. (2016). plotly: Create Interactive Web Graphics via 'plotly.js'; R package version 3.6.0, URL https://cran.r-project.org/web/packages/plotly/index.html.
31 Tennekes, M. and de Jonge, E. (2012). tabplot: Tableplot, a Visualization of Large Datasets; R package version 1.3, URL https://cran.r-project.org/web/packages/tabplot/index.html.
32 Swayne, D. F., Lang, D. T., Buja, A., and Cook, D. (2003). GGobi: evolving from XGobi into an extensible framework for interactive data visualization. Computational Statistics and Data Analysis, 43, 423-444.