snapatac2.tl.aggregate_X
- snapatac2.tl.aggregate_X(adata, groupby=None, normalize=None, file=None)
Aggregate .X values across cells or cell groups.
Use this function to create pseudobulk count profiles, optionally normalized as RPM or RPKM, from all cells or from groups defined by groupby.
Anti-Patterns
Do NOT use normalize="RPKM" unless adata.var_names are genomic regions in chrom:start-end format.
Do NOT expect a raw NumPy array return; the function always returns an AnnData object containing the aggregated matrix.
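The first anti-pattern can be guarded against with a quick check before calling the function. The sketch below is only an illustration: the helper name looks_like_regions and the regular expression are assumptions made here, not part of snapatac2, and the pattern may need adjusting if your region names follow a different convention.

```python
import re

# Assumed pattern for "chrom:start-end" names (hypothetical, not from snapatac2).
REGION_PATTERN = re.compile(r"^[^:\s]+:(\d+)-(\d+)$")


def looks_like_regions(var_names):
    """Return True if every name parses as chrom:start-end with start < end."""
    for name in var_names:
        match = REGION_PATTERN.match(name)
        if match is None:
            return False
        start, end = int(match.group(1)), int(match.group(2))
        if start >= end:
            return False
    return True


print(looks_like_regions(["chr1:0-500", "chr1:500-1000"]))  # True
print(looks_like_regions(["CD3E", "MS4A1"]))                # False
```

With such a check in place, one option is to fall back to RPM when the names are not regions, for example normalize="RPKM" if looks_like_regions(adata.var_names) else "RPM".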
- param adata:
Annotated data object with cells in observations and features in variables.
- type adata:
AnnData | AnnDataSet
- param groupby:
Grouping key in adata.obs, one group label per cell, or None to aggregate all cells together.
- type groupby:
- param normalize:
Optional normalization applied to each aggregated profile.
- type normalize:
- param file:
Output h5ad path for backed results. If None, return an in-memory AnnData.
- type file:
- returns:
AnnData object with aggregated profiles in .X, original feature names in .var_names, and group names in .obs_names when groupby is set.
- rtype:
AnnData
Examples
>>> import snapatac2 as snap
>>> # pbmc5k() returns a file path, so open it with snap.read first
>>> adata = snap.read(snap.datasets.pbmc5k(type="annotated_h5ad"), backed="r")
>>> pseudobulk = snap.tl.aggregate_X(adata, groupby="cell_type", normalize="RPM")
>>> pseudobulk.n_vars == adata.n_vars
True
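To make the RPM and RPKM options concrete, here is a minimal NumPy sketch of group-wise aggregation on a toy matrix. It uses the usual definitions (reads per million, and reads per kilobase per million). Whether snapatac2 computes its values in exactly this way is an assumption, and the function name pseudobulk and its region_lengths argument are hypothetical.

```python
import numpy as np


def pseudobulk(counts, groups, region_lengths=None):
    """Sum counts per group, then scale to RPM (or RPKM if lengths are given)."""
    counts = np.asarray(counts, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)  # sorted, unique group labels
    # One row per group: sum the rows of all cells carrying that label.
    summed = np.vstack([counts[groups == g].sum(axis=0) for g in labels])
    # RPM: divide by each group's total count, expressed in millions.
    # (A group with zero total counts would divide by zero; not handled here.)
    totals = summed.sum(axis=1, keepdims=True)
    profile = summed / (totals / 1e6)
    if region_lengths is not None:
        # RPKM: additionally divide by each region's length in kilobases.
        profile = profile / (np.asarray(region_lengths, dtype=float) / 1e3)
    return labels, profile


# Toy input: 3 cells x 3 features, two groups.
counts = [[1, 0, 3], [2, 1, 1], [0, 4, 0]]
groups = ["B", "T", "B"]

labels, rpm = pseudobulk(counts, groups)
print(labels)            # group labels in sorted order
print(rpm.sum(axis=1))   # each RPM row sums to one million

_, rpkm = pseudobulk(counts, groups, region_lengths=[500, 1000, 2000])
print(rpkm)
```

The sketch shows why RPKM needs region coordinates: without a start and end for each feature there is no length to divide by, whereas RPM only needs the per-group totals.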