snapatac2.tl.macs3

snapatac2.tl.macs3(adata, *, groupby=None, qvalue=0.05, call_broad_peaks=False, broad_cutoff=0.1, replicate=None, replicate_qvalue=None, max_frag_size=None, selections=None, nolambda=False, shift=-100, extsize=200, min_len=None, blacklist=None, key_added='macs3', tempdir=None, inplace=True, n_jobs=8)

Call open chromatin peaks with MACS3.

Use this function to call peaks for all cells, for each cell group, or for reproducible group-by-replicate pseudobulk profiles.

Anti-Patterns

  • Do NOT set replicate without groupby; reproducible peak calling is defined within groups.

  • Do NOT use call_broad_peaks=True only to relax peak stringency; instead, raise qvalue when broad/nested peak structure is not needed.

  • Do NOT pass a blacklist built for a different genome assembly; it must be a BED file whose coordinates match the genome used to generate the fragments.
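The argument rules above can be checked before the call. The following is a minimal sketch, assuming a hypothetical helper `check_macs3_args` that is not part of snapatac2. It only inspects argument combinations; the file-extension test is an illustrative heuristic and cannot confirm that a blacklist matches the genome assembly.

```python
from pathlib import Path
from typing import Optional, Union


def check_macs3_args(
    groupby: Optional[Union[str, list]] = None,
    replicate: Optional[Union[str, list]] = None,
    blacklist: Optional[Path] = None,
) -> list:
    """Return a list of problems found (hypothetical helper, not part of snapatac2)."""
    problems = []
    # Reproducible peak calling is defined within groups, so replicate needs groupby.
    if replicate is not None and groupby is None:
        problems.append("replicate is set but groupby is None")
    # Assumption: a BED blacklist carries ".bed" somewhere in its suffixes (e.g. ".bed.gz").
    if blacklist is not None:
        suffixes = [s.lower() for s in Path(blacklist).suffixes]
        if ".bed" not in suffixes:
            problems.append(f"blacklist {blacklist} does not look like a BED file")
    return problems


print(check_macs3_args(replicate="donor"))
```

An empty list means none of these checks fired; it does not guarantee that the call itself will succeed.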

param adata:

Annotated fragment/count object with reference sequence metadata in adata.uns["reference_sequences"].

type adata:

AnnData | AnnDataSet

param groupby:

Key in adata.obs holding the group labels, or a list with one group label per cell. If None, call peaks on all cells together.

type groupby:

str | list[str] | None

param qvalue:

MACS3 q-value cutoff for peak calling.

type qvalue:

float

param call_broad_peaks:

If True, call broad peaks. Broad peak calling uses two cutoffs: broad_cutoff identifies broader, weaker regions and qvalue identifies narrower, stronger peaks. The narrow peaks are then nested within the broad regions.

type call_broad_peaks:

bool

param broad_cutoff:

MACS3 q-value cutoff for broad peaks.

type broad_cutoff:

float

param replicate:

Key in adata.obs holding the replicate labels, or a list with one replicate label per cell. If None, replicates are not used.

type replicate:

str | list[str] | None

param replicate_qvalue:

MACS3 q-value cutoff for replicate-level calls. If None, reuse qvalue.

type replicate_qvalue:

float | None

param max_frag_size:

Maximum fragment size retained for peak calling. If None, use all fragments.

type max_frag_size:

int | None

param selections:

Subset of group names to call. Ignored when groupby is None.

type selections:

set[str] | None

param nolambda:

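If True, use the fixed genome-wide background lambda for every candidate peak region instead of the MACS3 local lambda, so local bias is not taken into account.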

type nolambda:

bool

param shift:

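MACS3 shift size in base pairs: the offset applied to each 5' cut end before extension. A negative value moves the end in the 3' to 5' direction.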

type shift:

int

param extsize:

MACS3 extension size in base pairs: each shifted end is extended to this length in the 5' to 3' direction.

type extsize:

int

param min_len:

Minimum peak length. If None, use extsize.

type min_len:

int | None

param blacklist:

BED file of regions to remove from called peaks.

type blacklist:

Path | None

param key_added:

Key prefix in adata.uns used to store peak tables.

type key_added:

str

param tempdir:

Directory in which to create temporary files. If None, use the system temporary directory.

type tempdir:

Path | None

param inplace:

If True, store peak tables in adata.uns; if False, return them.

type inplace:

bool

param n_jobs:

Number of worker processes for grouped peak calling.

type n_jobs:

int

returns:

If inplace=True, stores peak tables in adata.uns[key_added] for grouped calls or adata.uns[key_added + "_pseudobulk"] for bulk calls, then returns None. If inplace=False, returns peak tables keyed by group name, or a single table for bulk mode.

rtype:

dict[str, 'polars.DataFrame'] | None
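The storage rule described under returns can be summarised as a small lookup. This is a sketch of the documented behaviour only; `result_key` is a hypothetical helper for illustration, not a snapatac2 function.

```python
from typing import Optional, Union


def result_key(key_added: str = "macs3", groupby: Optional[Union[str, list]] = None) -> str:
    """Return the adata.uns key documented for inplace=True (hypothetical helper)."""
    # Grouped calls are stored under key_added itself;
    # bulk calls (groupby=None) get a "_pseudobulk" suffix.
    return key_added if groupby is not None else key_added + "_pseudobulk"


print(result_key("macs3", groupby="cell_type"))  # macs3
print(result_key("macs3"))                       # macs3_pseudobulk
```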

See also

merge_peaks

Examples

>>> import snapatac2 as snap
>>> adata = snap.read(snap.datasets.pbmc5k(type="annotated_h5ad"), backed="r")
>>> peaks = snap.tl.macs3(adata, groupby="cell_type", inplace=False, n_jobs=1)
>>> isinstance(peaks, dict)
True
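
To see what the default shift=-100 and extsize=200 do, the sketch below applies the MACS3 rule (move the 5' cut end by shift, then extend by extsize in the 5' to 3' direction) to a single cut site. `extended_interval` is a hypothetical helper written for illustration, not part of snapatac2 or MACS3, and it does not clip coordinates at chromosome ends.

```python
def extended_interval(cut_site: int, strand: str, shift: int = -100, extsize: int = 200):
    """Return the (start, end) interval covered by one shifted and extended cut end."""
    if strand == "+":
        # On the plus strand, 5'->3' runs toward increasing coordinates.
        start = cut_site + shift
        return start, start + extsize
    if strand == "-":
        # On the minus strand, 5'->3' runs toward decreasing coordinates.
        end = cut_site - shift
        return end - extsize, end
    raise ValueError("strand must be '+' or '-'")


print(extended_interval(1000, "+"))  # (900, 1100)
print(extended_interval(1000, "-"))  # (900, 1100)
```

With the defaults, shift equals -extsize/2, so each cut site sits at the centre of a 200 bp window on either strand.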