snapatac2.pp.call_cells

snapatac2.pp.call_cells(data, use_rep, inplace=True, n_jobs=8)

Call valid cells from feature counts using the OrdMag algorithm.

Use this function to remove empty or low-signal barcodes after importing fragments and computing a per-barcode count metric. The implementation uses the order-of-magnitude (OrdMag) strategy from Cell Ranger’s cell-calling workflow; EmptyDrops is not implemented.
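The order-of-magnitude idea can be illustrated with a minimal, standard-library sketch. This is not the snapatac2 implementation. The function name ordmag_call, the expected_cells argument, and the quantile and ratio defaults are assumptions made for this example, loosely modelled on the classic Cell Ranger heuristic: take a robust maximum among the top expected barcodes and keep every barcode within one order of magnitude of it.

```python
def ordmag_call(counts, expected_cells, quantile=0.99, ratio=10.0):
    """Return indices of barcodes that pass a simplified OrdMag cutoff.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not part of the snapatac2 API.
    """
    if expected_cells < 1:
        raise ValueError("expected_cells must be at least 1")

    # Rank the non-zero counts from largest to smallest.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if not ranked:
        return []

    # Robust maximum: the count at the (1 - quantile) rank among the
    # expected cells, clamped to the number of available barcodes.
    baseline_rank = min(int(round(expected_cells * (1.0 - quantile))),
                        len(ranked) - 1)
    baseline = ranked[baseline_rank]

    # Keep barcodes within one order of magnitude (ratio) of the baseline.
    cutoff = max(1.0, baseline / ratio)
    return [i for i, c in enumerate(counts) if c >= cutoff]


# Three high-count barcodes followed by background-level barcodes.
selected = ordmag_call([1000, 950, 900, 20, 15, 3, 1, 0], expected_cells=3)
```

Here the baseline is 1000, the cutoff is 100, and only the first three barcodes are kept. Production implementations typically also estimate the expected number of cells rather than requiring it as input.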

Anti-Patterns

  • Do NOT pass a string as use_rep unless data.obs contains a column with that name.

  • Do NOT use this function as a replacement for QC thresholding by TSS enrichment or fragment count; use filter_cells when explicit QC thresholds are required.

  • Do NOT expect a return value when inplace=True; the object is subset in place and the function returns None.
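The first anti-pattern can be caught before calling the function with a small pre-flight check. The sketch below is a hypothetical helper, not part of snapatac2: validate_use_rep and its arguments are names invented for this example, and it assumes the caller can supply the column names of data.obs and the number of barcodes.

```python
def validate_use_rep(obs_columns, n_barcodes, use_rep):
    """Raise early if use_rep cannot supply one count per barcode.

    Hypothetical helper; obs_columns is assumed to be the collection of
    column names in data.obs and n_barcodes the number of barcodes.
    """
    if isinstance(use_rep, str):
        # A string must name an existing per-barcode column.
        if use_rep not in obs_columns:
            raise KeyError(f"{use_rep!r} is not a column of data.obs")
        return
    # An array-like must hold exactly one count per barcode.
    if len(use_rep) != n_barcodes:
        raise ValueError(
            f"use_rep has {len(use_rep)} values, expected {n_barcodes}"
        )
```

For an in-memory AnnData object, a call such as validate_use_rep(data.obs.columns, data.n_obs, "n_fragment") is one way to use it; other backends may expose column names differently.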

type data:

AnnData | list[AnnData]

param data:

AnnData object, or list of AnnData objects, to subset to called cells.

type use_rep:

str | ndarray[float]

param use_rep:

Count representation used for cell calling. If a string, read counts from data.obs[use_rep]. If an array, use it directly as one count per barcode.

type inplace:

bool

param inplace:

If True, subset data to called cells and return None. If False, return integer indices of called cells without modifying data.

type n_jobs:

int

param n_jobs:

Number of parallel jobs to use when data is a list.

returns:

If inplace=False, returns integer indices of barcodes called as cells. If data is a list, returns one index array per object. If inplace=True, returns None and subsets data in place.

rtype:

ndarray | list[ndarray] | None

See also

filter_cells

Apply explicit QC thresholds to cells.

Examples

>>> import snapatac2 as snap
>>> fragments = snap.datasets.pbmc500(downsample=True)
>>> data = snap.pp.import_fragments(
...     fragments,
...     chrom_sizes=snap.genome.hg38,
...     sorted_by_barcode=False,
... )
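>>> # inplace=False leaves data unmodified and returns the indices of called cells.
>>> selected = snap.pp.call_cells(data, use_rep="n_fragment", inplace=False)
>>> # Subset to the called cells explicitly.
>>> data = data[selected, :]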