snapatac2.tl.aggregate_cells

snapatac2.tl.aggregate_cells(adata, use_rep='X_spectral', target_num_cells=None, min_cluster_size=50, random_state=0, key_added='pseudo_cell', inplace=True)

Assign cells to pseudo-cell groups by iterative clustering.

Use this function to coarsen a cell embedding into pseudo-cell labels. Clusters are split repeatedly with Leiden clustering, so each pseudo-cell groups cells that lie close together in the embedding.

Anti-Patterns

Do NOT pass use_rep as an .obs key; it must name an embedding in adata.obsm.
Do NOT expect exactly target_num_cells groups; iterative splitting stops when clusters cannot be split reliably.
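
Both cautions above can be guarded against in calling code. The following is a minimal sketch under stated assumptions, not part of SnapATAC2: `check_embedding_key` is a hypothetical helper, `SimpleNamespace` stands in for an AnnData object so the snippet runs without the library, and the label array is made up for illustration.

```python
from types import SimpleNamespace

import numpy as np


def check_embedding_key(adata, use_rep):
    """Hypothetical helper: fail early if use_rep is not an embedding in adata.obsm."""
    if use_rep in adata.obsm:
        return
    if use_rep in adata.obs:
        raise KeyError(f"{use_rep!r} is an .obs column, not an .obsm embedding")
    raise KeyError(f"{use_rep!r} not found in adata.obsm")


# Stand-in for an AnnData object (assumption: only .obs and .obsm are consulted).
fake = SimpleNamespace(
    obs={"leiden": np.zeros(6, dtype=int)},
    obsm={"X_spectral": np.zeros((6, 3))},
)

check_embedding_key(fake, "X_spectral")  # passes silently

# Count the groups actually produced instead of assuming target_num_cells.
labels = np.array(["0", "0", "1", "1", "1", "2"])  # illustrative labels only
n_groups = np.unique(labels).size
```

Calling `check_embedding_key(fake, "leiden")` raises `KeyError`, which surfaces the mistake before any clustering runs.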

param adata:: Annotated data object containing adata.obsm[use_rep], or the embedding itself as a numeric matrix with cells as rows.
type adata:: AnnData | AnnDataSet | ndarray
param use_rep:: Key in adata.obsm containing the input embedding.
type use_rep:: str
param target_num_cells:: Target number of pseudo-cell groups. If None, use adata.n_obs // min_cluster_size.
type target_num_cells:: int | None
param min_cluster_size:: Minimum cluster size used during iterative splitting.
type min_cluster_size:: int
param random_state:: Seed passed to Leiden clustering.
type random_state:: int
param key_added:: Key in adata.obs used to store pseudo-cell labels.
type key_added:: str
param inplace:: If True, store labels in adata.obs[key_added]; if False, return them.
type inplace:: bool
returns:: If inplace=True, stores categorical labels in adata.obs[key_added] and returns None. If inplace=False, returns the labels as an array with one entry per cell.
rtype:: ndarray | None
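
The default for target_num_cells described above is plain integer division. As a small sketch of that arithmetic only (`default_target_num_cells` is a hypothetical name, not a SnapATAC2 function):

```python
def default_target_num_cells(n_obs, min_cluster_size=50):
    """Mirror the documented default: n_obs // min_cluster_size."""
    return n_obs // min_cluster_size


large = default_target_num_cells(10_000)                    # 200
small = default_target_num_cells(100, min_cluster_size=10)  # 10
uneven = default_target_num_cells(120)                      # 2, remainder is dropped
```

Because the division floors, datasets with fewer cells than min_cluster_size get a default target of 0, so passing target_num_cells explicitly may be the safer choice for small inputs.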

Examples

>>> import numpy as np
>>> import snapatac2 as snap
>>> X = np.random.default_rng(0).normal(size=(100, 5))
>>> labels = snap.tl.aggregate_cells(X, min_cluster_size=10, inplace=False)
>>> labels.shape
(100,)
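
Once labels are available, a common follow-up is to sum per-cell counts within each pseudo-cell. The documentation above describes label output only, so the summing step below is assumed to be a separate, user-side operation. It is a NumPy-only sketch: `counts` and `labels` are synthetic stand-ins, not output from aggregate_cells.

```python
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(8, 4))  # synthetic cell-by-feature counts
labels = np.array(["0", "1", "0", "2", "1", "0", "2", "1"])  # illustrative labels

# Map each label to a row index in the aggregated matrix.
groups, inverse = np.unique(labels, return_inverse=True)

# Sum the rows of `counts` that share a label (unbuffered, so repeats accumulate).
pseudo = np.zeros((groups.size, counts.shape[1]), dtype=counts.dtype)
np.add.at(pseudo, inverse, counts)
```

Row `i` of `pseudo` then holds the summed counts for the group named `groups[i]`, and the grand total of `pseudo` matches that of `counts`.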