The key steps are:
Data Preprocessing: involves quality control steps such as removing low-quality cells, normalizing the data, and annotating cell types based on gene expression patterns.
Pseudo-bulk Generation: single-cell data from a group of cells from the same condition or individual are aggregated into a single expression profile, using methods such as summing or averaging.
Differential Expression Analysis: applies methods from traditional bulk RNA-seq analysis, such as edgeR or DESeq2, to the pseudo-bulk samples.
Result Interpretation: differential expression results are then interpreted in the context of the experiment. Typically, the results are lists of genes that are upregulated or downregulated.
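The pseudo-bulk generation step can be illustrated with a minimal sketch that sums raw counts across the cells of each sample. The function name `pseudobulk_sum`, the dictionary layout, and the toy counts are assumptions made for illustration only; they are not part of the pipeline described here.

```python
def pseudobulk_sum(counts, cell_to_sample):
    """Sum per-cell count vectors into one expression profile per sample.

    counts: dict mapping cell id -> list of raw counts (one entry per gene)
    cell_to_sample: dict mapping cell id -> sample (or condition) label
    Both structures are hypothetical and chosen only for this example.
    """
    profiles = {}
    for cell, vec in counts.items():
        sample = cell_to_sample[cell]
        if sample not in profiles:
            profiles[sample] = [0] * len(vec)
        # Add this cell's counts gene by gene to its sample's profile
        profiles[sample] = [a + b for a, b in zip(profiles[sample], vec)]
    return profiles


# Toy data (made up): 4 cells, 3 genes, 2 samples
counts = {
    "cell1": [1, 0, 3],
    "cell2": [2, 1, 0],
    "cell3": [0, 4, 1],
    "cell4": [5, 0, 2],
}
cell_to_sample = {
    "cell1": "ctrl",
    "cell2": "ctrl",
    "cell3": "treated",
    "cell4": "treated",
}

pb = pseudobulk_sum(counts, cell_to_sample)
print(pb)
```

Summing keeps the profiles as integer counts, which is the kind of input count-based tools such as edgeR or DESeq2 expect; averaging would instead produce non-integer values.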
Principal Component Analysis (PCA) is a method used to simplify complex data by finding its most important patterns. It transforms correlated variables into new, uncorrelated ones called “principal components.” The components are ordered by the amount of variance they explain, so the first few capture the largest variations in the data. By using PCA, we can reduce the data’s dimensionality while keeping essential information, making it easier to visualize and analyze.
PCA plot comparing the two groups.
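To make the idea of a principal component concrete, the following is a minimal sketch that finds the first component of a small matrix by power iteration on the covariance matrix. It is an illustration under stated assumptions, not the method used to produce the plot: the function name `first_principal_component` and the toy values are hypothetical, and real analyses would normally rely on an established PCA implementation applied to transformed counts.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, explained_variance_ratio) for the first PC.

    rows: list of samples, each a list of feature values (e.g. genes).
    Power iteration is used for illustration; it assumes the starting
    vector is not orthogonal to the first component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(n_iter):
        # Multiply by the covariance matrix, then renormalize
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v (Rayleigh quotient) relative to total variance
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_var = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_var


# Toy data (made up): 4 samples, 2 perfectly correlated features
samples = [[1, 2], [2, 4], [3, 6], [4, 8]]
direction, ratio = first_principal_component(samples)
print(direction, ratio)
```

Because the two toy features are perfectly correlated, a single component accounts for all of the variance, which is the situation where dimensionality reduction loses no information.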
Hierarchical clustering groups data points based on their similarity. It creates a tree-like structure (dendrogram) in which similar items are joined at different levels. It helps identify clusters and relationships within the data without the need to specify the number of clusters beforehand. Like PCA, hierarchical clustering is a complementary method for identifying strong patterns in a dataset and potential outliers.
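The way a dendrogram is built can be sketched with a small agglomerative procedure that repeatedly joins the two closest clusters. This is a minimal illustration assuming Euclidean distance and average linkage, which may differ from the settings behind the report's figure; the function names, sample labels, and coordinates are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, labels):
    """Agglomerative clustering; returns the merges in the order they occur.

    Each merge is (members_a, members_b, distance), i.e. one join
    in the dendrogram.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two clusters
                d = sum(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append(([labels[k] for k in clusters[i]],
                       [labels[k] for k in clusters[j]], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Toy data (made up): two control-like and two treated-like samples
labels = ["ctrl1", "ctrl2", "trt1", "trt2"]
points = [[0, 0], [0, 1], [10, 10], [10, 12]]
merges = average_linkage(points, labels)
for left, right, dist in merges:
    print(left, right, round(dist, 2))
```

Samples from the same group join at small distances and the two groups join last at a much larger distance; a sample that only joins the tree at a large distance on its own is a candidate outlier.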
This plot is a good check to make sure that we are interpreting our fold change values correctly.
| Analysis | Software | Version |
|---|---|---|
| Pseudo-bulk Differential Expression Analysis | DESeq2 | 1.34.0 |
| Gene Ontology Analysis | gprofiler2 | 0.2.2 |