snapatac2.tl.motif_enrichment

snapatac2.tl.motif_enrichment(motifs, regions, genome_fasta, background=None, method=None)

Test transcription factor motifs for enrichment in region sets.

Use this function to compare motif occurrence in foreground region groups against either an explicit background or the union of all foreground regions.

Anti-Patterns

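  • Do NOT use method="hypergeometric" with foreground regions that are not contained in background. The hypergeometric test treats each foreground group as a sample drawn from the background, so every foreground region must also appear there.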

  • Do NOT pass region strings from a genome build different from genome_fasta.
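The containment requirement for method="hypergeometric" can be checked before calling the function. The following is a minimal sketch, not part of snapatac2: the helper name regions_missing_from_background is hypothetical, and it assumes that "contained in background" means the exact region string appears in the background list.

```python
def regions_missing_from_background(regions, background):
    """Map each group name to its foreground regions absent from `background`.

    Assumption: containment is tested by exact string equality of the
    region names, not by coordinate overlap.
    """
    background_set = set(background)
    missing = {}
    for group, members in regions.items():
        absent = [r for r in members if r not in background_set]
        if absent:
            missing[group] = absent
    return missing


regions = {"set1": ["chr1:10000-10200", "chr1:20000-20200"]}
background = ["chr1:10000-10200", "chr1:30000-30200"]

# A non-empty result means the hypergeometric test should not be used as-is.
missing = regions_missing_from_background(regions, background)
print(missing)
```

If the result is non-empty, either add the missing regions to the background or pass method="binomial".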

param motifs:

Motifs to scan in foreground and background sequences.

type motifs:

list[PyDNAMotif]

param regions:

Foreground genomic regions keyed by group name. Region strings must use chrom:start-end coordinates.

type regions:

dict[str, list[str]]
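To catch malformed region strings early, the chrom:start-end format can be validated before the call. This is an illustrative sketch only: parse_region is a hypothetical helper, not a snapatac2 function, and the start < end requirement is an assumption of the sketch rather than a documented rule.

```python
def parse_region(region):
    """Split a 'chrom:start-end' string into (chrom, start, end).

    Raises ValueError if the string does not match that shape.
    """
    # Split on the last colon so that the coordinate span is isolated.
    chrom, colon, span = region.rpartition(":")
    start_text, dash, end_text = span.partition("-")
    if not chrom or not colon or not dash:
        raise ValueError(f"expected chrom:start-end, got {region!r}")
    # int() raises ValueError on non-numeric coordinates.
    start, end = int(start_text), int(end_text)
    # Assumption of this sketch: a valid region has start < end.
    if start >= end:
        raise ValueError(f"start must be smaller than end in {region!r}")
    return chrom, start, end


regions = {"set1": ["chr1:10000-10200", "chr1:20000-20200"]}
parsed = {group: [parse_region(r) for r in members]
          for group, members in regions.items()}
print(parsed)
```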

param genome_fasta:

Genome FASTA path, or a Genome object containing a FASTA path.

type genome_fasta:

Path | Genome

param background:

Background regions. If None, use the union of all foreground regions.

type background:

list[str] | None

param method:

Statistical test. If None, use "hypergeometric" when background is None and "binomial" otherwise.

type method:

Literal['binomial', 'hypergeometric'] | None
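The default rule for method, as described above, can be restated in a few lines. This sketch only mirrors the documented behavior; resolve_method is a hypothetical name and is not taken from the library's source.

```python
def resolve_method(background, method=None):
    """Return the test that would be used under the documented default rule."""
    if method is not None:
        # An explicit choice always wins.
        return method
    # No background: foreground groups are compared against their own union,
    # so the hypergeometric containment requirement holds automatically.
    return "hypergeometric" if background is None else "binomial"


print(resolve_method(None))
print(resolve_method(["chr1:30000-30200"]))
print(resolve_method(["chr1:30000-30200"], method="hypergeometric"))
```

The third call shows the case the anti-patterns warn about: forcing "hypergeometric" with an explicit background is valid only if that background contains every foreground region.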

returns:

Enrichment tables keyed by group name. Each table contains motif id, name, family, log2 fold change, p-value, and adjusted p-value.

rtype:

dict[str, 'polars.DataFrame']

Examples

>>> import snapatac2 as snap
>>> motifs = snap.datasets.cis_bp(unique=True)
>>> regions = {"set1": ["chr1:10000-10200", "chr1:20000-20200"]}
>>> result = snap.tl.motif_enrichment(motifs[:2], regions, snap.genome.hg38)
>>> list(result)
['set1']