Datasets

These functions download public datasets and auxiliary data used by the SnapATAC2 package.

Note

You can change the data cache directory by setting the SNAP_DATA_DIR environment variable.
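A minimal sketch of redirecting the cache from Python rather than from the shell. The directory name `snapatac2_cache` is a hypothetical choice, and setting the variable before `snapatac2` is imported is a cautious assumption about when the library reads it.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache location; any writable directory works.
cache_dir = Path(tempfile.gettempdir()) / "snapatac2_cache"
cache_dir.mkdir(parents=True, exist_ok=True)

# Assumption: the variable is set before snapatac2 is imported, so it is
# already visible whenever the library resolves its cache path.
os.environ["SNAP_DATA_DIR"] = str(cache_dir)

print(os.environ["SNAP_DATA_DIR"])
```

Exporting the variable in the shell before starting Python has the same effect and avoids any dependence on import order.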

Genomes

genome.Genome(*, fasta, annotation[, ...])

A class that encapsulates information about a genome, including its FASTA sequence, its annotation, and chromosome sizes.

genome.GRCh37

A predefined Genome object for the human GRCh37 assembly.

genome.GRCh38

A predefined Genome object for the human GRCh38 assembly.

genome.GRCm38

A predefined Genome object for the mouse GRCm38 assembly.

genome.GRCm39

A predefined Genome object for the mouse GRCm39 assembly.

genome.hg19

A predefined Genome object for the human hg19 assembly (the UCSC name for GRCh37).

genome.hg38

A predefined Genome object for the human hg38 assembly (the UCSC name for GRCh38).

genome.mm10

A predefined Genome object for the mouse mm10 assembly (the UCSC name for GRCm38).

genome.mm39

A predefined Genome object for the mouse mm39 assembly (the UCSC name for GRCm39).
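The entries above can be used roughly as follows. This is a sketch, not the library's own code: `get_genome` and `build_custom_genome` are hypothetical helpers, and passing local file paths for `fasta` and `annotation` is an assumption about what those keyword arguments accept.

```python
from typing import Any

# Names of the predefined genomes listed above.
PREDEFINED_GENOMES = (
    "GRCh37", "GRCh38", "GRCm38", "GRCm39",
    "hg19", "hg38", "mm10", "mm39",
)


def get_genome(name: str) -> Any:
    """Return the predefined genome object called `name` (hypothetical helper)."""
    if name not in PREDEFINED_GENOMES:
        raise ValueError(
            f"unknown genome {name!r}; expected one of {PREDEFINED_GENOMES}"
        )
    # Imported lazily so the name check above works without the package.
    import snapatac2 as snap
    return getattr(snap.genome, name)


def build_custom_genome(fasta: str, annotation: str) -> Any:
    """Wrap user-supplied files in a Genome (hypothetical helper).

    Assumption: `fasta` and `annotation` accept paths to local files.
    Both arguments are keyword-only in the signature shown above.
    """
    import snapatac2 as snap
    return snap.genome.Genome(fasta=fasta, annotation=annotation)
```

For example, `get_genome("hg38")` returns the same object as `snapatac2.genome.hg38`, while `get_genome("hg18")` raises `ValueError` before anything is imported.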

Motifs

datasets.cis_bp([unique])

Fetch CIS-BP transcription factor motifs for motif analysis.

datasets.Meuleman_2020()

Fetch grouped transcription factor motifs from Meuleman 2020.
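A short sketch of selecting between the two motif collections. `fetch_motifs` is a hypothetical wrapper. The `unique` argument is forwarded to `cis_bp` as shown in its signature, and the value `True` used here is illustrative rather than a documented default.

```python
from typing import Any


def fetch_motifs(source: str = "cis_bp", unique: bool = True) -> Any:
    """Fetch a motif collection by name (hypothetical helper).

    `unique` is only meaningful for the CIS-BP collection; the value
    used here is an illustrative choice, not a documented default.
    """
    # Imported lazily so that importing this module needs no download.
    import snapatac2 as snap

    if source == "cis_bp":
        return snap.datasets.cis_bp(unique=unique)
    if source == "Meuleman_2020":
        return snap.datasets.Meuleman_2020()
    raise ValueError(f"unknown motif source {source!r}")
```

Calling `fetch_motifs("Meuleman_2020")` returns the grouped motifs, and any other unrecognized name raises `ValueError`.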

Raw data

datasets.pbmc500([type, downsample])

Fetch the 10x Genomics 500 PBMC scATAC-seq example dataset.

datasets.pbmc5k([type])

Fetch the 10x Genomics 5k PBMC scATAC-seq example dataset.

datasets.pbmc10k_multiome([modality, type])

Fetch the 10x Genomics 10k PBMC multiome example dataset.

datasets.colon()

Fetch five transverse colon scATAC-seq fragment datasets.

datasets.cre_HEA()

Fetch the curated human colon cis-regulatory element BED file.
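As an illustration of how one of these fetchers might be called, the sketch below downloads the 500 PBMC example with default arguments and checks the result. `fetch_pbmc500_path` is a hypothetical helper, and it assumes that `pbmc500` returns the path of a local file once the download (or cache lookup) completes.

```python
from pathlib import Path


def fetch_pbmc500_path() -> Path:
    """Fetch the 500 PBMC example and return its location (hypothetical helper).

    Assumption: with default arguments, pbmc500 returns the path of a
    local file stored in the data cache directory.
    """
    # Imported lazily so that importing this module triggers no download.
    import snapatac2 as snap

    path = Path(snap.datasets.pbmc500())
    if not path.exists():
        raise FileNotFoundError(path)
    return path
```

The optional `type` and `downsample` arguments are left at their defaults here because their accepted values are not listed in this section.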