snapatac2.tl.aggregate_cells

snapatac2.tl.aggregate_cells(adata, use_rep='X_spectral', target_num_cells=None, min_cluster_size=50, random_state=0, key_added='pseudo_cell', inplace=True)

Assign cells to pseudo-cell groups by iterative clustering.

Use this function to coarsen a cell embedding into pseudo-cell labels. Clusters are split repeatedly with Leiden clustering, so each pseudo-cell groups cells that are neighbors in the embedding and local graph structure is preserved.

Anti-Patterns

  • Do NOT pass an adata.obs key as use_rep; use_rep must name an embedding in adata.obsm.

  • Do NOT expect exactly target_num_cells groups; the value is a target, not a guarantee, and iterative splitting stops when clusters cannot be split reliably.
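The second point can be made concrete with a small sketch. It assumes the documented default for target_num_cells (the number of cells floor-divided by min_cluster_size). The helper name default_target_num_cells and the toy labels are hypothetical and are not part of snapatac2.

```python
def default_target_num_cells(n_obs: int, min_cluster_size: int = 50) -> int:
    # Hypothetical helper: mirrors the documented default, n_obs // min_cluster_size.
    return n_obs // min_cluster_size


# Toy labels standing in for the function's output (12 cells, 3 groups).
# Real output depends on the data and on the clustering.
labels = ["0"] * 5 + ["1"] * 4 + ["2"] * 3

target = default_target_num_cells(n_obs=len(labels), min_cluster_size=3)
n_groups = len(set(labels))

# Compare the achieved number of groups with the target
# instead of assuming the two are equal.
reached_target = n_groups >= target
print(target, n_groups, reached_target)
```

In practice, count the distinct labels after the call and treat target_num_cells as a goal rather than a fixed output size.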

param adata:

Annotated data object containing adata.obsm[use_rep], or a numeric matrix with cells as rows.

type adata:

AnnData | AnnDataSet | ndarray

param use_rep:

Key in adata.obsm containing the input embedding.

type use_rep:

str

param target_num_cells:

Target number of pseudo-cell groups. If None, use adata.n_obs // min_cluster_size.

type target_num_cells:

int | None

param min_cluster_size:

Minimum cluster size used during iterative splitting.

type min_cluster_size:

int

param random_state:

Seed passed to Leiden clustering.

type random_state:

int

param key_added:

Key in adata.obs used to store pseudo-cell labels.

type key_added:

str

param inplace:

If True, store labels in adata.obs[key_added]; if False, return them.

type inplace:

bool

returns:

If inplace=True, stores categorical labels in adata.obs[key_added] and returns None. If inplace=False, returns the labels as an array with one entry per cell.

rtype:

ndarray | None

Examples

>>> import numpy as np
>>> import snapatac2 as snap
>>> X = np.random.default_rng(0).normal(size=(100, 5))
>>> labels = snap.tl.aggregate_cells(X, min_cluster_size=10, inplace=False)
>>> labels.shape
(100,)
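Once labels are available, a common next step is to pool per-cell measurements within each pseudo-cell. The following is a minimal sketch of that step using only NumPy. The helper sum_by_label, the toy counts matrix, and the toy labels are assumptions made for illustration; they are not part of the snapatac2 API, and the labels here are written by hand rather than produced by aggregate_cells.

```python
import numpy as np


def sum_by_label(counts: np.ndarray, labels: np.ndarray):
    """Sum the rows of `counts` that share a label (hypothetical helper)."""
    labels = np.asarray(labels)
    # `groups` holds the sorted distinct labels; `inverse` maps each cell
    # to the row index of its group.
    groups, inverse = np.unique(labels, return_inverse=True)
    out = np.zeros((groups.size, counts.shape[1]), dtype=counts.dtype)
    # Unbuffered in-place add, so repeated group indices accumulate correctly.
    np.add.at(out, inverse, counts)
    return groups, out


# Toy data: 4 cells x 2 features, with hand-written pseudo-cell labels.
counts = np.array([[1, 0], [2, 1], [0, 3], [4, 4]])
labels = np.array(["0", "1", "0", "1"])

groups, pseudo = sum_by_label(counts, labels)
print(groups)
print(pseudo)
```

Summing is only one choice; averaging within each group would be an equally reasonable variant depending on the downstream analysis.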