snapatac2.metrics.frip

snapatac2.metrics.frip(adata, regions, *, normalized=True, count_as_insertion=False, inplace=True, n_jobs=8)

Compute fraction of reads or insertions in selected regions.

Run this metric only after import_fragments has attached fragment data to the AnnData object. The keys of regions become the output column names: with inplace=True, each metric is written to adata.obs under its key.

Anti-Patterns

  • Do NOT call this function on an AnnData object that lacks imported fragments.

  • Do NOT reuse the same regions dictionary across calls if you need to preserve original path values; this function converts path values to region lists in place.
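One way to follow the second point is to pass a copy of the dictionary, so the caller's original keeps its path values. The sketch below does not call the library: fake_frip is a hypothetical stand-in that imitates the in-place conversion described above, and peaks.bed is a made-up file name. It assumes the conversion replaces dictionary values, in which case a shallow copy is enough.

```python
from pathlib import Path


def fake_frip(regions):
    """Hypothetical stand-in that mimics the in-place conversion of path values."""
    for key, value in regions.items():
        if isinstance(value, (str, Path)):
            # Pretend the file was read and replaced by the intervals it contains.
            regions[key] = ["chr1:100-200"]


regions = {"peaks_frac": Path("peaks.bed")}  # hypothetical file name

# Passing a shallow copy leaves the caller's dictionary untouched.
fake_frip(dict(regions))
original_preserved = isinstance(regions["peaks_frac"], Path)

# Passing the dictionary itself overwrites the path value.
fake_frip(regions)
overwritten = isinstance(regions["peaks_frac"], list)
```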

param adata:

AnnData object, or a list of AnnData objects, with imported fragments. When a list is provided, compute FRiP for each object in parallel.

type adata:

AnnData | list[AnnData]

param regions:

Mapping from output metric name to a BED file path or a list of genomic intervals such as "chr1:100-200".

type regions:

dict[str, Path | list[str]]
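To make the interval string format concrete, here is a minimal sketch that splits "chrom:start-end" strings into their parts. parse_region is a hypothetical helper written for illustration, and the key promoter_frac is an invented metric name; how the library parses these strings internally is not stated here.

```python
def parse_region(region: str) -> tuple[str, int, int]:
    """Split a 'chrom:start-end' string into (chrom, start, end)."""
    # rpartition keeps any ':' inside the chromosome name with the chromosome.
    chrom, _, span = region.rpartition(":")
    start, _, end = span.partition("-")
    return chrom, int(start), int(end)


# A regions mapping that uses interval lists instead of a BED file path.
regions = {"promoter_frac": ["chr1:100-200", "chr2:5000-5600"]}

parsed = {
    name: [parse_region(r) for r in intervals]
    for name, intervals in regions.items()
}
```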

param normalized:

If True, return fractions normalized by the total number of fragments or insertions. If False, return raw counts overlapping each region set.

type normalized:

bool
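The difference between the two settings of normalized can be illustrated for a single cell with plain Python. This is a sketch under assumptions, not the library's implementation: frip_one_cell and overlaps are hypothetical helpers, intervals are treated as half-open, and a fragment counts as a hit if it shares at least one base with any region.

```python
def overlaps(frag, region):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return frag[0] == region[0] and frag[1] < region[2] and region[1] < frag[2]


def frip_one_cell(fragments, regions, normalized=True):
    """Count fragments that hit any region; optionally divide by the total."""
    hits = sum(any(overlaps(f, r) for r in regions) for f in fragments)
    if not normalized:
        return hits
    return hits / len(fragments) if fragments else 0.0


fragments = [
    ("chr1", 90, 150),   # overlaps the peak
    ("chr1", 300, 400),  # outside the peak
    ("chr2", 10, 60),    # different chromosome
    ("chr1", 180, 260),  # overlaps the peak
]
peaks = [("chr1", 100, 200)]

raw = frip_one_cell(fragments, peaks, normalized=False)  # 2 fragments
frac = frip_one_cell(fragments, peaks, normalized=True)  # 2 / 4 = 0.5
```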

param count_as_insertion:

If True, count transposition insertions at fragment ends instead of whole fragments.

type count_as_insertion:

bool
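Counting insertions rather than fragments can change the result, because each fragment contributes two cut sites and only some of them may fall inside a region. The sketch below is illustrative only: the helper names are hypothetical, and placing the cut sites at start and end - 1 of a half-open fragment is an assumption rather than the library's documented convention.

```python
def insertion_sites(fragment):
    """Return the two cut sites of a half-open (chrom, start, end) fragment."""
    chrom, start, end = fragment
    # Assumption: the insertions sit at the first and last base of the fragment.
    return [(chrom, start), (chrom, end - 1)]


def in_region(site, region):
    """True if a (chrom, pos) site lies inside a half-open region."""
    return site[0] == region[0] and region[1] <= site[1] < region[2]


def insertion_fraction(fragments, regions):
    """Fraction of all insertion sites that fall inside any region."""
    sites = [s for f in fragments for s in insertion_sites(f)]
    hits = sum(any(in_region(s, r) for r in regions) for s in sites)
    return hits / len(sites) if sites else 0.0


fragments = [("chr1", 90, 150), ("chr1", 180, 260)]
peaks = [("chr1", 100, 200)]

# Both fragments overlap the peak, but only one end of each lies inside it.
value = insertion_fraction(fragments, peaks)  # 2 of 4 sites = 0.5
```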

param inplace:

If True, store each result vector in adata.obs using the corresponding regions key. If False, return the result dictionary.

type inplace:

bool

param n_jobs:

Number of parallel jobs to use when adata is a list. If n_jobs=-1, use all available CPUs.

type n_jobs:

int

returns:

If inplace=True, add the results to adata.obs and return None. Otherwise, return a dictionary containing the results, or a list of such dictionaries when adata is a list.

rtype:

dict[str, list[float]] | list[dict[str, list[float]]] | None

Examples

>>> import snapatac2 as snap
>>> data = snap.pp.import_fragments(snap.datasets.pbmc500(downsample=True), chrom_sizes=snap.genome.hg38, sorted_by_barcode=False)
>>> snap.metrics.frip(data, {"peaks_frac": snap.datasets.cre_HEA()})
>>> print(data.obs['peaks_frac'].head())
AAACTGCAGACTCGGA-1    0.715930
AAAGATGCACCTATTT-1    0.697364
AAAGATGCAGATACAA-1    0.713615
AAAGGGCTCGCTCTAC-1    0.678428
AAATGAGAGTCCCGCA-1    0.724910
Name: peaks_frac, dtype: float64