snapatac2.datasets.pbmc500
- snapatac2.datasets.pbmc500(type='fragment', downsample=False)
Fetch the 10x Genomics 500 PBMC scATAC-seq example dataset.
Use this helper to download and cache the fragment, BAM, or FASTQ files for a small PBMC dataset suitable for tutorials and smoke tests. Set the SNAP_DATA_DIR environment variable before calling this function to control where downloaded files are cached.
Anti-Patterns
Do NOT use the default full fragment file for fast examples; pass downsample=True when a small fragment file is sufficient.
Do NOT set downsample=True with type="bam" or type="fastq"; the downsampled file is only available for type="fragment".
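The two rules above can be checked on the caller's side before any download starts. The following is a minimal sketch, not part of snapatac2: `check_pbmc500_args` and `fetch_pbmc500` are hypothetical names introduced for illustration. How the library itself responds to `downsample=True` combined with a non-fragment type is not stated in this page, so the wrapper rejects that combination explicitly.

```python
VALID_TYPES = ("fragment", "bam", "fastq")


def check_pbmc500_args(type="fragment", downsample=False):
    """Hypothetical pre-flight check; not part of snapatac2."""
    if type not in VALID_TYPES:
        raise ValueError(f"type must be one of {VALID_TYPES}, got {type!r}")
    # The downsampled file exists only for the fragment type.
    if downsample and type != "fragment":
        raise ValueError('downsample=True is only available for type="fragment"')
    return type, downsample


def fetch_pbmc500(type="fragment", downsample=False):
    """Validate the arguments, then forward to the real dataset helper."""
    check_pbmc500_args(type, downsample)
    # Imported lazily so the check above can run without snapatac2 installed.
    import snapatac2 as snap
    return snap.datasets.pbmc500(type=type, downsample=downsample)
```

For a fast example, `fetch_pbmc500(downsample=True)` forwards to the small fragment file, while `fetch_pbmc500("bam", downsample=True)` raises `ValueError` before anything is fetched.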
- param type:
File type to fetch. Use "fragment" for a fragments TSV.GZ file, "bam" for the position-sorted BAM file, or "fastq" for the FASTQ files extracted from the downloaded archive.
- type type:
Literal['fastq','bam','fragment']
- param downsample:
If True and type="fragment", fetch the smaller downsampled fragments file instead of the full fragments file.
- type downsample:
bool
- returns:
Path to the requested fragment or BAM file. For type="fastq", returns a list of paths to the extracted FASTQ files.
- rtype:
Examples
>>> import snapatac2 as snap
>>> fragment_file = snap.datasets.pbmc500(downsample=True)
>>> fragment_file.name
'atac_pbmc_500_downsample.tsv.gz'
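To control the cache location, SNAP_DATA_DIR has to be set before the dataset helper is called. The sketch below shows one way to do that from Python. `set_snap_data_dir` is a hypothetical helper, the `snap_data` directory name is an arbitrary choice, and the exact layout of files inside the cache directory is not described here, so no assumption is made about it.

```python
import os
import tempfile
from pathlib import Path


def set_snap_data_dir(directory):
    """Point SNAP_DATA_DIR at `directory`, creating it if needed (hypothetical helper)."""
    cache = Path(directory).expanduser().resolve()
    cache.mkdir(parents=True, exist_ok=True)
    os.environ["SNAP_DATA_DIR"] = str(cache)
    return cache


# Use a directory under the system temp location; any writable path works.
cache_dir = set_snap_data_dir(Path(tempfile.gettempdir()) / "snap_data")

# With the variable set, fetch the dataset as usual (requires snapatac2
# and network access, so it is left commented out here):
# import snapatac2 as snap
# fragment_file = snap.datasets.pbmc500(downsample=True)
```

Setting the variable in the shell before starting Python has the same effect and avoids any dependence on call order within the script.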