The rapid advancement of single-cell transcriptome sequencing (scRNA-seq) technologies has led to exponential growth in the number of large-scale scRNA-seq datasets. Integrating multiple scRNA-seq datasets from different studies has great potential to facilitate the identification of both common and rare cell types. Batch effect correction is one of the biggest challenges when integrating multiple scRNA-seq datasets. Batch effect is the perturbation in measured gene expression, often introduced by factors such as library preparation, sequencing technologies, and sample origins (donors). It is therefore likely to be confounded with true biological signals, so that cells are grouped by experiment rather than by their true biological identities. Batch effect removal has thus become common practice prior to data integration, which introduces additional computational challenges. Most existing batch effect removal procedures assume that the biological effect is orthogonal to the batch effect, which is unlikely to hold in real data. Moreover, as the scale of the datasets increases, integrating multiple large-scale scRNA-seq datasets can impose a heavy, sometimes prohibitive, computational and memory burden.

Most existing scRNA-seq integration methods require explicit batch removal steps. One of the most commonly used approaches is mutual nearest neighbors (MNN), which uses paired cells (the MNNs) to project the data onto a shared subspace. However, this approach requires large runtime memory and long computation time to search for MNNs in the high-dimensional space of gene expression. Although some derivatives of the MNN method have attempted to improve memory efficiency by performing dimension reduction in the gene expression space, memory usage remains demanding when the number of single cells is large.
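To make the MNN pairing step more concrete, here is a minimal sketch of a brute-force mutual nearest neighbor search between two batches. It is an illustration under stated assumptions, not the implementation of any published MNN method: the function name `mutual_nearest_neighbors`, the parameter `k`, and the use of plain Euclidean distance on a cells-by-genes matrix are all choices made for this example.

```python
import numpy as np

def mutual_nearest_neighbors(batch_a, batch_b, k=3):
    """Hypothetical helper: return sorted (i, j) pairs where cell i of batch_a
    and cell j of batch_b are each among the other's k nearest neighbors.

    batch_a, batch_b: arrays of shape (n_cells, n_genes).
    """
    batch_a = np.asarray(batch_a, dtype=float)
    batch_b = np.asarray(batch_b, dtype=float)

    # Dense squared Euclidean distance matrix, shape (n_a, n_b).
    # This matrix is what makes the brute-force search memory hungry.
    sq_a = (batch_a ** 2).sum(axis=1)[:, None]
    sq_b = (batch_b ** 2).sum(axis=1)[None, :]
    dist = sq_a + sq_b - 2.0 * batch_a @ batch_b.T

    # Never ask for more neighbors than the other batch has cells.
    k_ab = min(k, batch_b.shape[0])
    k_ba = min(k, batch_a.shape[0])

    # For each cell in A, its k nearest cells in B, and vice versa.
    nn_ab = np.argsort(dist, axis=1)[:, :k_ab]
    nn_ba = np.argsort(dist.T, axis=1)[:, :k_ba]

    a_to_b = {(i, int(j)) for i in range(nn_ab.shape[0]) for j in nn_ab[i]}
    b_to_a = {(int(i), j) for j in range(nn_ba.shape[0]) for i in nn_ba[j]}

    # A pair is "mutual" only if it appears in both directions.
    return sorted(a_to_b & b_to_a)
```

Even in this toy form, the distance matrix has one entry per pair of cells across the two batches, and each entry costs work proportional to the number of genes. This is the memory and runtime pressure described above for large datasets in high-dimensional gene expression space.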