Enhancing Statistical Rigor in Single-Cell and Spatial Omics Data Analysis Using Synthetic Negative Controls

March 25, 2024

12:00 pm - 12:00 pm

The rapid development of single-cell and spatial omics technologies has driven fast advances in computational algorithms. However, statistical rigor in data analysis has often been overlooked. Motivated by the mandatory use of negative controls in experiments, I propose to enhance the reliability of single-cell and spatial omics data analysis by using synthetic negative controls generated from real data under specific null hypotheses. I will demonstrate this strategy using two statistical methods my group developed. First, using permutation to generate a synthetic negative control in which cell-cell relationships are disrupted, we developed a statistical method, scDEED (https://doi.org/10.1101/2023.04.21.537839), to detect dubious two-dimensional cell embeddings, which are crucial for single-cell data visualization, and to optimize the hyperparameters of embedding methods such as t-SNE and UMAP. Second, using our simulator scDesign3 (https://www.nature.com/articles/s41587-023-01772-1) to generate synthetic null controls, we developed a statistical method, ClusterDE (https://doi.org/10.1101/2023.07.21.550107), to identify potential cell-type markers (or spatial domain markers) from differential expression (DE) analysis applied to potential cell types (or spatial domains) identified through clustering analysis. Overall, leveraging synthetic negative controls is an effective strategy to increase the statistical rigor of complex data analysis and thus improve the reliability of analysis results.
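
As a rough illustration of the permutation idea mentioned in the abstract, the sketch below builds a synthetic negative control by shuffling each gene's values across cells independently. This is one simple way to disrupt cell-cell relationships while keeping every gene's marginal distribution intact; it is not taken from the scDEED code. The function name `permuted_negative_control`, the cells-by-genes matrix layout, and the toy Poisson data are all assumptions made for illustration.

```python
import numpy as np


def permuted_negative_control(expr, seed=0):
    """Shuffle each gene (column) across cells independently.

    expr is assumed to be a cells x genes array. Hypothetical helper,
    not part of any published package.
    """
    rng = np.random.default_rng(seed)
    control = np.empty_like(expr)
    for j in range(expr.shape[1]):
        # Permuting one column at a time keeps that gene's values but
        # breaks its association with every other gene in the same cell.
        control[:, j] = rng.permutation(expr[:, j])
    return control


# Toy data (assumed): 200 cells, 10 genes, Poisson counts.
toy_expr = np.random.default_rng(1).poisson(lam=5.0, size=(200, 10))
toy_control = permuted_negative_control(toy_expr, seed=0)
```

An embedding or clustering method can then be run on both the original matrix and the control; structure that also appears in the control cannot be attributed to real cell-cell relationships.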