snapatac2.pp.filter_cells
- snapatac2.pp.filter_cells(data, min_counts=1000, min_tsse=5.0, max_counts=None, max_tsse=None, inplace=True, n_jobs=8)
Filter cells by fragment-count and TSS-enrichment QC thresholds.
Use this function after computing per-cell QC metrics to remove unreliable observations. By default, cells must have at least 1000 fragments and a TSS enrichment score of at least 5.0.
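The thresholding described above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not SnapATAC2's implementation: passing_indices is a hypothetical helper, the metrics are held in plain lists rather than in data.obs, and treating the upper bounds as inclusive is an assumption (the source only states that the lower bounds are "at least").

```python
from typing import List, Optional, Sequence


def passing_indices(
    n_fragment: Sequence[float],
    tsse: Sequence[float],
    min_counts: Optional[float] = 1000,
    min_tsse: Optional[float] = 5.0,
    max_counts: Optional[float] = None,
    max_tsse: Optional[float] = None,
) -> List[int]:
    """Hypothetical helper: indices of cells passing every enabled bound."""

    def within(value: float, lower: Optional[float], upper: Optional[float]) -> bool:
        # A bound set to None is disabled and never rejects a cell.
        if lower is not None and value < lower:
            return False
        # Assumption: upper bounds are inclusive.
        if upper is not None and value > upper:
            return False
        return True

    return [
        i
        for i, (count, score) in enumerate(zip(n_fragment, tsse))
        if within(count, min_counts, max_counts) and within(score, min_tsse, max_tsse)
    ]


# Toy per-cell metrics (made-up values for illustration only).
counts = [500, 1000, 2500, 40000]
scores = [8.0, 5.0, 3.0, 12.0]

default_pass = passing_indices(counts, scores)                   # default thresholds
capped_pass = passing_indices(counts, scores, max_counts=20000)  # adds an upper count bound
```

With these toy values, the first cell fails on fragment count and the third fails on TSS enrichment; adding max_counts additionally removes the cell with 40000 fragments.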
Anti-Patterns
- Do NOT call this function before data.obs["n_fragment"] and, when TSS filtering is enabled, data.obs["tsse"] are available.
- Do NOT leave min_tsse enabled when TSS enrichment was not computed; pass min_tsse=None to filter only by fragment counts.
- Do NOT expect a return value when inplace=True; the object is subset in place and the function returns None.
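A small pre-flight check can make the first two points above explicit. The sketch below is a hypothetical helper (missing_qc_columns is not part of SnapATAC2) that takes the names of the available per-cell columns and reports which required ones are absent. It assumes, following the text above, that n_fragment is always required and that tsse is required only when a TSS bound is enabled.

```python
from typing import Iterable, List, Optional


def missing_qc_columns(
    obs_columns: Iterable[str],
    min_tsse: Optional[float] = 5.0,
    max_tsse: Optional[float] = None,
) -> List[str]:
    """Hypothetical helper: required QC columns that are not yet present."""
    # Assumption from the text above: fragment counts are always needed.
    required = ["n_fragment"]
    # TSS enrichment is only needed when at least one TSS bound is enabled.
    if min_tsse is not None or max_tsse is not None:
        required.append("tsse")
    available = set(obs_columns)
    return [name for name in required if name not in available]


# TSS enrichment has not been computed yet, so the default call would be unsafe.
missing_default = missing_qc_columns(["n_fragment"])
# Disabling the TSS bounds leaves nothing missing.
missing_counts_only = missing_qc_columns(["n_fragment"], min_tsse=None)
```

If the returned list is non-empty, either compute the missing metric first or disable the corresponding bound before filtering.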
- type data:
AnnData | list[AnnData]
- param data:
AnnData object, or list of AnnData objects, to filter.
- type min_counts:
- param min_counts:
Minimum data.obs["n_fragment"] value required for a cell to pass filtering. Use None to disable the lower fragment-count bound.
- type min_tsse:
- param min_tsse:
Minimum data.obs["tsse"] value required for a cell to pass filtering. Use None to disable the lower TSS-enrichment bound.
- type max_counts:
- param max_counts:
Maximum data.obs["n_fragment"] value allowed for a cell to pass filtering. Use None to disable the upper fragment-count bound.
- type max_tsse:
- param max_tsse:
Maximum data.obs["tsse"] value allowed for a cell to pass filtering. Use None to disable the upper TSS-enrichment bound.
- type inplace:
- param inplace:
If True, subset data in place and return None. If False, return integer indices of cells passing all enabled thresholds.
- type n_jobs:
- param n_jobs:
Number of parallel jobs to use when data is a list.
- returns:
If inplace=False, returns integer indices of cells that pass all enabled thresholds. If data is a list, returns one index array per object. If inplace=True, returns None and subsets data in place.
- rtype:
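Because inplace=False returns indices without modifying the object, it can be used to preview how many cells a set of thresholds would keep. The sketch below is a hypothetical reporting helper (retention_summary is not a SnapATAC2 function); it only assumes that the returned indices support len().

```python
from typing import Sequence


def retention_summary(n_cells: int, selected: Sequence[int]) -> str:
    """Hypothetical helper: describe how many cells pass the thresholds."""
    kept = len(selected)
    # Guard against an empty object to avoid dividing by zero.
    fraction = kept / n_cells if n_cells else 0.0
    return f"{kept}/{n_cells} cells pass ({fraction:.1%})"


# Single object: one set of indices.
summary = retention_summary(4, [1, 3])

# List input: one set of indices per object, summarized one by one.
per_object = [[0, 2], [1]]
sizes = [3, 5]
summaries = [retention_summary(n, idx) for n, idx in zip(sizes, per_object)]
```

Here the index lists are made-up stand-ins for what a call with inplace=False would return.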
See also
call_cells: Call cell-containing barcodes from count distributions.
Examples
>>> import snapatac2 as snap
>>> fragments = snap.datasets.pbmc500(downsample=True)
>>> data = snap.pp.import_fragments(
...     fragments,
...     chrom_sizes=snap.genome.hg38,
...     sorted_by_barcode=False,
... )
>>> snap.metrics.tsse(data, snap.genome.hg38)
>>> selected = snap.pp.filter_cells(
...     data,
...     min_counts=1000,
...     min_tsse=5.0,
...     inplace=False,
... )
>>> data = data[selected, :]