snapatac2.datasets.pbmc5k

snapatac2.datasets.pbmc5k(type='fragment')

Fetch the 10x Genomics 5k PBMC scATAC-seq example dataset.

Use this helper to download and cache a fragments file, a preprocessed h5ad file, or an annotated h5ad file for PBMC analysis examples. Set the SNAP_DATA_DIR environment variable before calling this function to control where downloaded files are cached.
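As a minimal sketch of the caching note above, the snippet below points SNAP_DATA_DIR at a directory before any dataset is fetched. The helper name configure_snap_cache and the temporary-directory location are hypothetical and chosen only for illustration; setting the variable before importing snapatac2 is a conservative assumption, since the text only requires it to be set before the call.

```python
import os
import tempfile
from pathlib import Path


def configure_snap_cache(cache_dir):
    """Point SNAP_DATA_DIR at cache_dir (hypothetical helper, not part of snapatac2)."""
    cache_path = Path(cache_dir).expanduser().resolve()
    # Create the directory up front so a typo in the path surfaces early.
    cache_path.mkdir(parents=True, exist_ok=True)
    os.environ["SNAP_DATA_DIR"] = str(cache_path)
    return cache_path


# Illustrative location; any writable directory can be used instead.
cache_path = configure_snap_cache(Path(tempfile.gettempdir()) / "snapatac2_cache")

# With the variable set, a later call is expected to cache under cache_path:
#   import snapatac2 as snap
#   fragment_file = snap.datasets.pbmc5k(type="fragment")
```

Setting the variable from Python keeps the configuration next to the analysis code; exporting it in the shell before starting Python should have the same effect.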

Anti-Patterns

  • Do NOT pass a path returned for type="h5ad" or type="annotated_h5ad" to fragment-import functions; those files are already AnnData objects on disk, so open them with snap.read(...) instead.
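One way to avoid this mistake is to branch on the requested type before deciding how to open the file. The sketch below is illustrative only: is_h5ad_type and open_pbmc5k are hypothetical helpers, not part of snapatac2, and the fragment branch simply returns the path because the appropriate fragment-import function depends on the installed version.

```python
from pathlib import Path

# Representations documented for pbmc5k that are AnnData (h5ad) files.
H5AD_TYPES = {"h5ad", "annotated_h5ad"}
VALID_TYPES = H5AD_TYPES | {"fragment"}


def is_h5ad_type(dataset_type):
    """Return True when the requested representation should be opened with snap.read."""
    if dataset_type not in VALID_TYPES:
        raise ValueError(f"unknown dataset type: {dataset_type!r}")
    return dataset_type in H5AD_TYPES


def open_pbmc5k(dataset_type="annotated_h5ad"):
    """Fetch the dataset and open it when it is an h5ad file (hypothetical helper)."""
    import snapatac2 as snap  # imported lazily so is_h5ad_type works without it

    path = snap.datasets.pbmc5k(type=dataset_type)
    if is_h5ad_type(dataset_type):
        # h5ad files are read directly, in read-only backed mode.
        return snap.read(path, backed="r")
    # Fragment files need a fragment-import step instead; return the path untouched.
    return Path(path)
```

Calling open_pbmc5k("annotated_h5ad") would download the file on first use, so it needs network access and an installed snapatac2; is_h5ad_type can be checked on its own.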

Parameters:

  • type (Literal['fragment', 'h5ad', 'annotated_h5ad']): Dataset representation to fetch. Use "fragment" for a fragments TSV.GZ file, "h5ad" for a preprocessed AnnData file, or "annotated_h5ad" for a preprocessed AnnData file with cell annotations.

Returns:

Path to the requested cached dataset file.

Return type:

Path

Examples

>>> import snapatac2 as snap
>>> h5ad_file = snap.datasets.pbmc5k(type="annotated_h5ad")
>>> data = snap.read(h5ad_file, backed="r")
>>> data.n_obs > 0
True