snapatac2.tl.init_network_from_annotation
- snapatac2.tl.init_network_from_annotation(regions, anno_file, upstream=250000, downstream=250000, id_type='gene_name', coding_gene_only=True)
Build a region-to-gene network from gene annotations.
Use this function to connect candidate cis-regulatory elements to genes when the regions fall within an annotation-derived regulatory domain around each transcription start site.
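As a rough illustration of what a regulatory domain around a transcription start site (TSS) means, the sketch below computes such a window and tests whether a region overlaps it. It is not SnapATAC2's implementation: the helper names `regulatory_domain` and `overlaps`, the strand-aware handling of "upstream", and the use of overlap as the linking criterion are all assumptions made for illustration.

```python
def regulatory_domain(tss, strand, upstream=250_000, downstream=250_000):
    """Return the (start, end) window around a TSS, clipped at 0.

    Assumption: "upstream" follows the gene's strand, so it extends toward
    lower coordinates on "+" and toward higher coordinates on "-".
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return max(start, 0), end


def overlaps(region, domain):
    """True if two half-open (start, end) intervals share at least one base."""
    return region[0] < domain[1] and domain[0] < region[1]


# Hypothetical gene with a TSS at position 300,000 on the plus strand.
domain = regulatory_domain(300_000, "+")
print(domain)                               # (50000, 550000)
print(overlaps((10_000, 10_500), domain))   # False
print(overlaps((60_000, 60_500), domain))   # True
```

With the default 250,000-base windows, a region only 40 kb away from the domain boundary is still left unlinked, which is why the choice of `upstream` and `downstream` directly controls how dense the resulting network is.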
Anti-Patterns
Do NOT pass regions from a genome build different from anno_file.
Do NOT assume edges are functional regulatory links; they encode genomic proximity only until scores are added.
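A cheap guard against a genome-build mismatch is to confirm that every region names a chromosome present in the annotation's build and lies within that chromosome's length. The sketch below assumes you can supply a chromosome-size table for the build behind `anno_file`; `parse_region`, `check_regions`, and the toy sizes are hypothetical and not part of SnapATAC2. Passing this check does not prove that the builds match; it only catches obvious mismatches.

```python
import re

# chrom:start-end, e.g. "chr1:10000-10500"
_REGION = re.compile(r"^([^:\s]+):(\d+)-(\d+)$")


def parse_region(region):
    """Split a 'chrom:start-end' string into (chrom, start, end)."""
    match = _REGION.match(region)
    if match is None:
        raise ValueError(f"not in chrom:start-end format: {region!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start >= end:
        raise ValueError(f"start must be smaller than end: {region!r}")
    return chrom, start, end


def check_regions(regions, chrom_sizes):
    """Return the regions that cannot belong to the build described by chrom_sizes."""
    bad = []
    for region in regions:
        chrom, _start, end = parse_region(region)
        if chrom not in chrom_sizes or end > chrom_sizes[chrom]:
            bad.append(region)
    return bad


# Toy size table (hypothetical); replace with the sizes of the annotation's build.
toy_sizes = {"chr1": 1_000_000}
regions = ["chr1:10000-10500", "1:20000-20500", "chr1:999900-1000400"]
print(check_regions(regions, toy_sizes))
# ['1:20000-20500', 'chr1:999900-1000400']
```

The second region fails because its chromosome name uses a different naming convention, and the third because it extends past the end of the chromosome in the toy table.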
- param regions:
Candidate regulatory regions in chrom:start-end format.
- type regions:
- param anno_file:
GFF/GTF annotation file, or a Genome object containing the annotation.
- type anno_file:
- param upstream:
Bases upstream of each transcription start site included in the regulatory domain.
- type upstream:
- param downstream:
Bases downstream of each transcription start site included in the regulatory domain.
- type downstream:
- param id_type:
Annotation identifier stored on gene nodes.
- type id_type:
Literal['gene_name', 'gene_id', 'transcript_id']
- param coding_gene_only:
If True, retain only protein-coding genes.
- type coding_gene_only:
- returns:
Directed graph whose region nodes point to nearby gene nodes.
- rtype:
PyDiGraph
Examples
>>> import snapatac2 as snap
>>> regions = ["chr1:10000-10500", "chr1:20000-20500"]
>>> network = snap.tl.init_network_from_annotation(regions, snap.genome.hg38)
>>> network.num_nodes() >= 0
True