snapatac2.ex.export_coverage

snapatac2.ex.export_coverage(adata, groupby, selections=None, bin_size=10, blacklist=None, normalization='RPKM', include_for_norm=None, exclude_for_norm=None, min_frag_length=None, max_frag_length=2000, counting_strategy='fragment', smooth_base=None, out_dir='./', prefix='', suffix='.bw', output_format=None, compression=None, compression_level=None, tempdir=None, n_jobs=8)

Export grouped genome-wide coverage tracks.

Use this function after importing fragments to write one bedGraph or bigWig coverage track per cell group. Coverage is counted in fixed-width genomic bins, optionally filtered by fragment length, smoothed, and normalized. Disable normalization with normalization=None.

Anti-Patterns

  • Do NOT pass an AnnData object without fragment metadata created by import_fragments.

  • Do NOT use include_for_norm and exclude_for_norm as peak-calling filters; they only define which fragments contribute to normalization.

  • Do NOT rely on suffix inference for custom extensions; pass output_format and compression explicitly.
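
The last point can be sketched as follows. This is a minimal illustration, not a prescribed recipe: it assumes an AnnData object with imported fragments and a `cell_type` column in `adata.obs`, and both the `.cov` extension and the helper name `export_with_explicit_format` are hypothetical.

```python
# Keyword arguments that name the format and codec outright, so nothing
# depends on how a custom extension would be interpreted.
explicit_kwargs = dict(
    groupby="cell_type",       # assumed column in adata.obs
    suffix=".cov",             # hypothetical custom extension
    output_format="bedgraph",  # stated explicitly instead of inferred
    compression="gzip",        # stated explicitly instead of inferred
    compression_level=6,       # within the documented 1-9 range for gzip
)


def export_with_explicit_format(adata, out_dir="./"):
    """Call export_coverage without relying on suffix inference."""
    # Imported lazily so this sketch can be loaded without snapatac2 installed.
    import snapatac2 as snap

    return snap.ex.export_coverage(adata, out_dir=out_dir, **explicit_kwargs)
```

Keeping the arguments in one dictionary makes it easy to reuse the same explicit settings across several exports.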

param adata:

Annotated data object with n_obs cells and fragment metadata.

type adata:

AnnData | AnnDataSet

param groupby:

Group assignment for each cell. If a string, values are read from adata.obs[groupby]. If a list, it must contain one group label per cell in observation order.

type groupby:

str | list[str]

param selections:

Group names to export. If None, export every group found in groupby.

type selections:

list[str] | None

param bin_size:

Width, in bases, of each coverage bin.

type bin_size:

int

param blacklist:

BED file of regions to exclude from coverage output.

type blacklist:

Path | None

param normalization:

Coverage normalization method. Use None to export raw counts. RPKM divides each bin's read count by the number of mapped reads (in millions) and by the bin length (in kilobases); CPM divides it by the number of mapped reads (in millions); BPM divides it by the sum of reads over all bins (in millions).

type normalization:

Optional[Literal['RPKM', 'CPM', 'BPM']]

param include_for_norm:

Genomic intervals or BED file of intervals to include when computing the normalization denominator. If None, include all non-excluded fragments.

type include_for_norm:

list[str] | Path | None

param exclude_for_norm:

Genomic intervals or BED file of intervals to exclude when computing the normalization denominator. If a fragment overlaps both included and excluded intervals, it is excluded.

type exclude_for_norm:

list[str] | Path | None

param min_frag_length:

Minimum fragment length to count. If None, do not apply a minimum.

type min_frag_length:

int | None

param max_frag_length:

Maximum fragment length to count. If None, do not apply a maximum.

type max_frag_length:

int | None

param counting_strategy:

Counting mode. Use 'fragment' to count overlapping fragments or 'insertion' to count transposition insertion sites.

type counting_strategy:

Literal['fragment', 'insertion']

param smooth_base:

Width, in bases, of the smoothing window. If None, do not smooth.

type smooth_base:

int | None

param out_dir:

Directory where output files are written.

type out_dir:

Path

param prefix:

Text prepended to each output filename.

type prefix:

str

param suffix:

Text appended to each output filename. Used to infer output format and compression when the corresponding arguments are None.

type suffix:

str

param output_format:

Coverage-track format. If None, infer it from suffix.

type output_format:

Optional[Literal['bedgraph', 'bigwig']]

param compression:

Compression codec for compressed bedGraph output. If None, infer it from suffix.

type compression:

Optional[Literal['gzip', 'zstandard']]

param compression_level:

Compression level. Use 1-9 for gzip or 1-22 for zstandard. If None, use the backend default: 6 for gzip or 3 for zstandard.

type compression_level:

int | None

param tempdir:

Directory for temporary files created during export. If None, use the system temporary directory.

type tempdir:

Path | None

param n_jobs:

Number of worker threads. If n_jobs <= 0, use all available threads.

type n_jobs:

int

returns:

Mapping from group name to output filename.

rtype:

dict[str, str]
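
The arithmetic behind the three normalization values described for the normalization parameter can be sketched in plain Python. This is an illustration of the stated formulas only, not the library's implementation; `normalize_bins` and the toy numbers are hypothetical, and `mapped_reads` stands for whatever read total remains after include_for_norm and exclude_for_norm are applied.

```python
def normalize_bins(counts, bin_size, mapped_reads, method="RPKM"):
    """Scale raw per-bin counts using the documented formulas (illustrative only)."""
    if method is None:
        # No normalization: return the raw counts unchanged.
        return [float(c) for c in counts]
    if method == "RPKM":
        # Mapped reads in millions times bin length in kilobases.
        scale = (mapped_reads / 1e6) * (bin_size / 1e3)
    elif method == "CPM":
        # Mapped reads in millions.
        scale = mapped_reads / 1e6
    elif method == "BPM":
        # Sum of reads over all bins, in millions.
        scale = sum(counts) / 1e6
    else:
        raise ValueError(f"unknown normalization: {method!r}")
    return [c / scale for c in counts]


# Toy data: four 10-base bins from a group with two million mapped reads.
counts = [4, 0, 10, 6]
for method in ("RPKM", "CPM", "BPM", None):
    print(method, normalize_bins(counts, bin_size=10, mapped_reads=2_000_000, method=method))
```

Note that BPM depends only on the binned counts themselves, whereas RPKM and CPM depend on the mapped-read total, which is why the include and exclude intervals affect them.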

See also

export_fragments

Examples

>>> import snapatac2 as snap
>>> data = snap.read(snap.datasets.pbmc5k(type="annotated_h5ad"), backed='r')
>>> snap.ex.export_coverage(
...     data,
...     groupby='cell_type',
...     selections=['Naive B'],
...     suffix='.bedgraph.zst',
... )
{'Naive B': './Naive B.bedgraph.zst'}
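
Judging from the output above, each file name appears to be the group name wrapped in prefix and suffix and placed in out_dir. The sketch below encodes that reading so paths can be anticipated before a long export; `expected_track_path` is a hypothetical helper, not part of snapatac2, and the mapping returned by export_coverage remains the authoritative source of file names.

```python
import os


def expected_track_path(group, out_dir="./", prefix="", suffix=".bw"):
    """Guess the output path for one group (assumed naming pattern)."""
    # Assumption: file name = prefix + group name + suffix, inside out_dir.
    return os.path.join(out_dir, f"{prefix}{group}{suffix}")


print(expected_track_path("Naive B", suffix=".bedgraph.zst"))
```

Group names containing spaces, such as 'Naive B', carry over into the file name, so quote the paths when passing them to shell tools.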