http://dx.doi.org/10.5351/KJAS.2020.33.4.511

One-step spectral clustering of weighted variables on single-cell RNA-sequencing data  

Park, Min Young (Department of Statistics, Sungkyunkwan University)
Park, Seyoung (Department of Statistics, Sungkyunkwan University)
Publication Information
The Korean Journal of Applied Statistics / v.33, no.4, 2020, pp. 511-526
Abstract
Single-cell RNA-sequencing (scRNA-seq) data consist of RNA expression measurements for individual cells drawn from large cell populations. A main purpose of scRNA-seq analysis is to identify inter-cellular heterogeneity. However, scRNA-seq data pose statistical challenges for traditional clustering methods because they contain many missing values and a high level of noise arising from technical and sampling issues. In this paper, motivated by the analysis of scRNA-seq data, we propose a novel spectral clustering method that imposes different weights on genes when computing the similarity between cells. In the proposed framework, the gene weights are assigned and the cells are clustered simultaneously. We solve the resulting non-convex optimization problem with an iterative algorithm. Both a real-data application and a simulation study suggest that the proposed method identifies the underlying clusters better than existing clustering methods.
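The general idea of weighting genes inside a spectral clustering loop can be illustrated with a minimal Python sketch. It is not the optimization problem or the algorithm of the paper, which the abstract does not specify. It assumes a Gaussian similarity on weighted squared distances, the normalized spectral embedding of Ng et al. (2002), and a hypothetical weight update that scores each gene by its between-cluster share of variance. All function names, the median-based kernel scale, and the toy data are assumptions made for illustration.

```python
import numpy as np

def weighted_similarity(X, w):
    """Gaussian similarity between cells (rows) with per-gene weights w."""
    Xw = X * np.sqrt(w)                          # scale each gene by sqrt(weight)
    sq = np.sum(Xw ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xw @ Xw.T, 0.0)
    scale = np.median(d2) + 1e-12                # assumed kernel bandwidth choice
    return np.exp(-d2 / scale)

def spectral_embedding(S, k):
    """Top-k eigenvectors of D^{-1/2} S D^{-1/2}, with rows normalized."""
    d = S.sum(axis=1)
    L = S / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)                  # eigenvalues in ascending order
    U = vecs[:, -k:]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def update_weights(X, labels, k):
    """Hypothetical update: weight = between-cluster share of each gene's variance."""
    total = X.var(axis=0) + 1e-12
    fitted = np.zeros_like(X)
    for j in range(k):
        mask = labels == j
        if np.any(mask):
            fitted[mask] = X[mask].mean(axis=0)
    within = np.mean((X - fitted) ** 2, axis=0)
    score = np.maximum(1.0 - within / total, 0.0)
    if score.sum() == 0.0:
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return score / score.sum()

def weighted_spectral_clustering(X, k, n_rounds=3):
    """Alternate between clustering the cells and re-weighting the genes."""
    w = np.full(X.shape[1], 1.0 / X.shape[1])    # start from equal weights
    for _ in range(n_rounds):
        U = spectral_embedding(weighted_similarity(X, w), k)
        labels = kmeans(U, k)
        w = update_weights(X, labels, k)
    return labels, w

# Toy data (an assumption): 40 cells, 5 informative genes followed by 20 noise genes.
rng = np.random.default_rng(0)
informative = np.vstack([rng.normal(0.0, 0.5, (20, 5)), rng.normal(5.0, 0.5, (20, 5))])
noise = rng.normal(0.0, 1.0, (40, 20))
X = np.hstack([informative, noise])
labels, w = weighted_spectral_clustering(X, k=2)
```

On data like this, the informative genes should end up with larger weights than the noise genes, which is the behaviour a weighted-variable method aims for. An alternating heuristic of this kind carries no convergence guarantee.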
Keywords
single-cell RNA-sequencing data; spectral clustering; variable weight; optimization