snapatac2.tl.motif_enrichment
- snapatac2.tl.motif_enrichment(motifs, regions, genome_fasta, background=None, method=None)
Test transcription factor motifs for enrichment in region sets.
Use this function to compare motif occurrence in foreground region groups against either an explicit background or the union of all foreground regions.
Anti-Patterns
Do NOT use method="hypergeometric" with foreground regions that are not contained in background.
Do NOT pass region strings from a genome build different from genome_fasta.
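A pre-flight check along the following lines can catch both anti-patterns before the call. This is a minimal sketch, not part of snapatac2: parse_region and check_regions are hypothetical helpers, "contained in" is interpreted here as exact string membership, and chrom_sizes is an assumed dict of chromosome lengths for the build behind genome_fasta.

```python
import re

# Hypothetical helpers for illustration; not part of snapatac2.
_REGION = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")

def parse_region(region):
    """Split a 'chrom:start-end' string into (chrom, start, end)."""
    match = _REGION.match(region)
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {region!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start >= end:
        raise ValueError(f"empty or reversed region: {region!r}")
    return chrom, start, end

def check_regions(regions, background=None, chrom_sizes=None):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    background_set = set(background) if background is not None else None
    for group, members in regions.items():
        for region in members:
            chrom, _, end = parse_region(region)
            # Assumption: containment means the exact string appears in background.
            if background_set is not None and region not in background_set:
                problems.append(f"{group}: {region} is not in background")
            if chrom_sizes is not None:
                if chrom not in chrom_sizes:
                    problems.append(f"{group}: {region} uses unknown chromosome {chrom}")
                elif end > chrom_sizes[chrom]:
                    problems.append(f"{group}: {region} extends past the end of {chrom}")
    return problems
```

An unknown chromosome name or an out-of-range end coordinate is only a hint of a build mismatch. Coordinates from another build that happen to fall within range cannot be detected this way.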
- param motifs:
Motifs to scan in foreground and background sequences.
- type motifs:
list[PyDNAMotif]
- param regions:
Foreground genomic regions keyed by group name. Region strings must use chrom:start-end coordinates.
- type regions:
dict[str, list[str]]
- param genome_fasta:
Genome FASTA path, or a Genome object containing a FASTA path.
- type genome_fasta:
Path | Genome
- param background:
Background regions. If None, use the union of all foreground regions.
- type background:
list[str] | None
- param method:
Statistical test. If None, use "hypergeometric" when background is None and "binomial" otherwise.
- type method:
Literal['binomial', 'hypergeometric'] | None
- returns:
Enrichment tables keyed by group name. Each table contains motif id, name, family, log2 fold change, p-value, and adjusted p-value.
- rtype:
dict[str, 'polars.DataFrame']
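The documented defaults for background and method interact, so it can help to see them resolved together. The sketch below mirrors the parameter descriptions only; resolve_defaults is a hypothetical helper, not snapatac2 code, and the first-seen ordering of the union is an assumption.

```python
def resolve_defaults(regions, background=None, method=None):
    """Apply the documented defaults; hypothetical helper, not snapatac2 code."""
    # Choose the test before filling in the background, because the
    # default test depends on whether the caller supplied a background.
    if method is None:
        method = "hypergeometric" if background is None else "binomial"
    if background is None:
        # Union of all foreground regions, de-duplicated.
        # Assumption: first-seen order is kept; the real order is not documented.
        background = list(
            dict.fromkeys(r for group in regions.values() for r in group)
        )
    return background, method
```

With no explicit background, every foreground region is in the background by construction, which matches the requirement stated for the hypergeometric test.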
Examples
>>> import snapatac2 as snap
>>> motifs = snap.datasets.cis_bp(unique=True)
>>> regions = {"set1": ["chr1:10000-10200", "chr1:20000-20200"]}
>>> result = snap.tl.motif_enrichment(motifs[:2], regions, snap.genome.hg38)
>>> list(result)
['set1']
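The difference between the two values of method can be illustrated with textbook upper-tail tests on motif hit counts. This is a sketch of the general statistics, not the library's implementation: the function names and the counts are made up for illustration, and it assumes each region is scored simply as containing the motif or not.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n regions are drawn without replacement from
    N background regions, K of which contain the motif."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

def binom_upper_tail(k, n, p):
    """P(X >= k) for n independent regions, each containing the motif
    with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Made-up counts: 20 foreground regions, 8 with the motif;
# 200 background regions, 30 with the motif.
p_hyper = hypergeom_upper_tail(k=8, n=20, K=30, N=200)
p_binom = binom_upper_tail(k=8, n=20, p=30 / 200)
```

The hypergeometric model treats the foreground as a draw from the background, so foreground regions that lie outside the background violate its assumptions. The binomial model only needs a background hit rate.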