snapatac2.metrics.frag_size_distr
- snapatac2.metrics.frag_size_distr(adata, *, max_recorded_size=1000, add_key='frag_size_distr', inplace=True, n_jobs=8)
Compute the dataset-level fragment size distribution.
Run this metric after import_fragments has attached fragment metadata to the AnnData object. The result is a vector where index i counts fragments of length i, except that index 0 counts fragments longer than max_recorded_size. This metric summarizes the whole dataset rather than individual cells.
Anti-Patterns
Do NOT interpret the returned vector as cell-level values; it is one distribution per AnnData object.
Do NOT call this function on an AnnData object that lacks imported fragments.
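The bin layout described above can be made concrete with a small NumPy sketch. It does not call SnapATAC2 and is not the library's implementation. The helper name sketch_frag_size_distr and the input list of fragment lengths are hypothetical, and the sketch assumes all lengths are positive integers.

```python
import numpy as np

def sketch_frag_size_distr(fragment_lengths, max_recorded_size=1000):
    """Hypothetical helper that mimics the documented bin layout only."""
    lengths = np.asarray(fragment_lengths, dtype=np.int64)
    # Fragments longer than max_recorded_size go to the overflow bin at index 0.
    bins = np.where(lengths > max_recorded_size, 0, lengths)
    # minlength guarantees max_recorded_size + 1 entries even when the
    # largest bins are empty.
    return np.bincount(bins, minlength=max_recorded_size + 1)

distr = sketch_frag_size_distr([50, 50, 147, 1000, 1001, 2500])
print(distr.shape[0])                                 # 1001
print(distr[50], distr[147], distr[1000], distr[0])   # 2 1 1 2
```

Under these assumptions the vector has max_recorded_size + 1 entries, which matches the length of 1001 for the default max_recorded_size=1000.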
- param adata:
AnnData object, or a list of AnnData objects, with imported fragments. When a list is provided, compute one distribution for each object in parallel.
- type adata:
AnnData | list[AnnData]
- param max_recorded_size:
Largest fragment length with its own output bin. Fragments longer than this value are counted at index 0.
- type max_recorded_size:
- param add_key:
Key used to store the distribution in adata.uns when inplace=True.
- type add_key:
- param inplace:
If True, store the distribution in adata.uns[add_key]. If False, return the distribution.
- type inplace:
- param n_jobs:
Number of jobs to run when adata is a list. If n_jobs=-1, use all available CPUs.
- type n_jobs:
- returns:
If inplace=True, returns None after storing the distribution in adata.uns[add_key]. If inplace=False, returns the distribution, or a list of distributions when adata is a list.
- rtype:
Examples
>>> import snapatac2 as snap
>>> data = snap.pp.import_fragments(
...     snap.datasets.pbmc500(downsample=True),
...     chrom_sizes=snap.genome.hg38,
...     sorted_by_barcode=False,
... )
>>> snap.metrics.frag_size_distr(data)
>>> data.uns["frag_size_distr"].shape[0]
1001
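Once the distribution is stored, it is easy to misread index 0 as "length 0" when summarizing it. The following sketch shows one way to separate the overflow bin from the per-length bins. It uses a synthetic array as a stand-in for data.uns["frag_size_distr"]; the counts and the 100 bp threshold are made up for illustration and are not recommendations.

```python
import numpy as np

# Synthetic stand-in for data.uns["frag_size_distr"] with max_recorded_size=1000.
distr = np.zeros(1001, dtype=np.int64)
distr[0] = 10     # overflow bin: fragments longer than 1000
distr[80] = 600   # fragments of length 80
distr[200] = 300  # fragments of length 200
distr[400] = 90   # fragments of length 400

total = distr.sum()
# Index 0 is the overflow bin, so report it separately.
overflow_fraction = distr[0] / total

# Indices 1..max_recorded_size correspond to actual fragment lengths.
sizes = np.arange(1, distr.shape[0])
counts = distr[1:]
short_fraction = counts[sizes < 100].sum() / total  # illustrative threshold
mode_length = int(sizes[np.argmax(counts)])

print(mode_length)  # 80
```

Slicing off index 0 before computing length-based summaries keeps the overflow count from being treated as a fragment length.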