scBulkDE performs differential expression (DE) testing on pseudobulked single-cell data. It aggregates cells into pseudobulk samples, infers a full-rank design matrix, and performs differential gene expression analysis while accounting for categorical and continuous covariates. PyDESeq2 and ANOVA backends are currently supported for DE testing.
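As a rough illustration of the aggregation step, the sketch below sums per-cell counts into one vector per (donor, group) pair. It is not scBulkDE's implementation: the function name `pseudobulk`, the `min_cells` threshold, and the toy data are all assumptions made for this example, and the real package operates on AnnData objects rather than plain lists.

```python
from collections import defaultdict


def pseudobulk(counts, donors, groups, min_cells=1):
    """Sum per-cell count vectors into one vector per (donor, group) pair.

    counts    : list of per-cell gene-count lists (cells x genes)
    donors    : per-cell replicate labels
    groups    : per-cell group labels
    min_cells : hypothetical QC threshold; smaller samples are dropped
    """
    if not (len(counts) == len(donors) == len(groups)):
        raise ValueError("counts, donors and groups must have the same length")

    sums = {}
    n_cells = defaultdict(int)
    for row, donor, group in zip(counts, donors, groups):
        key = (donor, group)
        if key not in sums:
            sums[key] = [0] * len(row)
        # Element-wise addition of this cell's counts to the running total.
        sums[key] = [a + b for a, b in zip(sums[key], row)]
        n_cells[key] += 1

    # Keep only pseudobulk samples built from enough cells.
    return {k: v for k, v in sums.items() if n_cells[k] >= min_cells}


# Toy data: 5 cells, 3 genes, 2 donors.
counts = [[1, 0, 3], [2, 1, 0], [0, 5, 1], [4, 0, 0], [1, 1, 1]]
donors = ["d1", "d1", "d1", "d2", "d2"]
groups = ["B cells", "B cells", "rest", "B cells", "rest"]

pb = pseudobulk(counts, donors, groups)
for key, vec in sorted(pb.items()):
    print(key, vec)
```

Each resulting vector stands in for one biological replicate, which is what lets the downstream test treat donors, rather than individual cells, as the unit of replication.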
- Pseudobulk aggregation with quality control
- Covariate-aware design — categorical and continuous covariates with automated resolution of confounding factors
- Multiple DE engines — ANOVA F-test and PyDESeq2
- Fallback strategies — pseudoreplicate generation or single-cell testing when biological replicates are insufficient
- Scanpy drop-in —
  tl.rank_genes_groups stores results in adata.uns for seamless integration with Scanpy.
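To make the ANOVA F-test named in the feature list concrete, here is a minimal, covariate-free sketch of a one-way F statistic for a single gene across pseudobulk samples. The function name `one_way_f` and the example values are hypothetical, and scBulkDE's ANOVA engine may differ (for instance in normalization and covariate handling).

```python
import math


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n = sum(len(g) for g in groups)
    if n <= k:
        raise ValueError("need more samples than groups")

    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    if ss_within == 0:
        # No within-group variance: the ratio is unbounded.
        return math.inf, df_between, df_within
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normalized expression of one gene in three pseudobulk
# samples per condition.
query = [5.0, 6.0, 7.0]
reference = [1.0, 2.0, 3.0]

f_stat, df_between, df_within = one_way_f([query, reference])
print(f"F = {f_stat:.1f}, df = ({df_between}, {df_within})")  # F = 24.0, df = (1, 4)
```

A p-value would follow from the upper tail of the F distribution with these degrees of freedom, for example via `scipy.stats.f.sf(f_stat, df_between, df_within)`; it is left out here to keep the sketch dependency-free.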
You need to have Python 3.11 or newer installed on your system.
Install from PyPI:

    pip install scbulkde

Or install the latest development version:

    pip install git+https://github.com/quadbio/scBulkDE.git@main

A single call can run pseudobulking and DE testing together:

    import scbulkde as scb

    # One-step: pseudobulk + DE
    de_result = scb.tl.de(
        adata,
        group_key="cell_type",
        query="B cells",
        reference="rest",
        replicate_key="donor",
        engine="anova",
    )
    de_result.results.head()

Or separate pseudobulking from testing for more control:

    # Step 1: pseudobulk
    pb_result = scb.pp.pseudobulk(
        adata,
        group_key="cell_type",
        query="B cells",
        reference="rest",
        replicate_key="donor",
    )

    # Step 2: DE
    de_result = scb.tl.de(pb_result, engine="anova")

The Scanpy-style interface can be used as a drop-in for sc.tl.rank_genes_groups:

    import scanpy as sc
    import scbulkde as scb

    scb.tl.rank_genes_groups(
        adata,
        groupby="cell_type",
        de_kwargs=dict(replicate_key="donor", engine="anova"),
    )
    sc.pl.rank_genes_groups(adata, n_genes=20)

Please refer to the documentation, in particular the API documentation.
See the changelog.
For questions and help requests, you can reach out in the scverse discourse. If you found a bug, please use the issue tracker.
t.b.a