snapatac2.pp.import_contacts
- snapatac2.pp.import_contacts(contact_file, chrom_sizes, *, file=None, sorted_by_barcode=True, bin_size=500000, chunk_size=200, tempdir=None, backend='hdf5')
Import chromatin contacts into an AnnData object.
Use this function to load single-cell chromatin-contact records and bin them into fixed-width genomic intervals. The result can be kept in memory or written to a backed h5ad file.
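The idea of fixed-width binning can be sketched in plain Python. This is only an illustration of the arithmetic, assuming 0-based positions and bins that tile each chromosome from its start; it is not necessarily how snapatac2 indexes or lays out its bins internally. The helper names `bin_index` and `n_bins` and the chromosome sizes are hypothetical.

```python
def bin_index(position, bin_size=500_000):
    """Return the 0-based index of the fixed-width bin containing `position`."""
    return position // bin_size


def n_bins(chrom_size, bin_size=500_000):
    """Number of bins needed to tile a chromosome (the last bin may be shorter)."""
    # Integer ceiling division avoids floating-point rounding.
    return (chrom_size + bin_size - 1) // bin_size


# Hypothetical chromosome sizes, for illustration only.
chrom_sizes = {"chr1": 1_200_000, "chr2": 400_000}
bins_per_chrom = {chrom: n_bins(size) for chrom, size in chrom_sizes.items()}
```

Under these assumptions, a smaller `bin_size` gives more bins per chromosome and therefore more columns in the resulting matrix.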
Anti-Patterns
Do NOT set sorted_by_barcode=True for unsorted contact files; set it to False so this function sorts them first.
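One way to choose the flag is to check the file before importing it. The sketch below assumes that the barcode is the first tab-separated field of each record (as in the example on this page) and uses a conservative test: the barcodes must never decrease in plain string order. The helper `is_sorted_by_barcode` is hypothetical and not part of snapatac2.

```python
import io


def is_sorted_by_barcode(lines):
    """Return True if the first tab-separated field never decreases."""
    previous = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        barcode = line.split("\t", 1)[0]
        if previous is not None and barcode < previous:
            return False
        previous = barcode
    return True


# Small in-memory stand-ins for contact files (illustrative records only).
sorted_records = io.StringIO(
    "cell1\tchr1\t10\tchr1\t40\t1\n"
    "cell2\tchr1\t5\tchr1\t90\t1\n"
)
unsorted_records = io.StringIO(
    "cell2\tchr1\t5\tchr1\t90\t1\n"
    "cell1\tchr1\t10\tchr1\t40\t1\n"
)

flag_sorted = is_sorted_by_barcode(sorted_records)
flag_unsorted = is_sorted_by_barcode(unsorted_records)
```

The returned value could then be passed as `sorted_by_barcode`. When the check fails, falling back to `False` only costs the extra sort.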
- type contact_file:
- param contact_file:
Path to the contact file.
- type file:
- param file:
File name of the output h5ad file used to store the result. If provided, the result is saved to a backed AnnData; otherwise an in-memory AnnData is used.
- type chrom_sizes:
- param chrom_sizes:
A Genome object or a dictionary containing chromosome sizes, for example, {"chr1": 2393, "chr2": 2344, ...}.
- type sorted_by_barcode:
- param sorted_by_barcode:
Whether the contact file has been sorted by cell barcodes.
- type bin_size:
- param bin_size:
The size of consecutive genomic regions used to record the counts.
- type chunk_size:
- param chunk_size:
Increasing the chunk_size speeds up I/O but uses more memory.
- type tempdir:
- param tempdir:
Location to store temporary files. If None, the system temporary directory is used.
- type backend:
Literal['hdf5']
- param backend:
The backend to use; 'hdf5' is the only accepted value.
- returns:
An annotated data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions. If file=None, an in-memory AnnData is returned; otherwise a backed AnnData is returned.
- rtype:
AnnData
Examples
>>> from pathlib import Path
>>> import tempfile
>>> import snapatac2 as snap
>>> tmp = tempfile.TemporaryDirectory()
>>> contact_file = Path(tmp.name) / "contacts.tsv"
>>> _ = contact_file.write_text("cell1\tchr1\t10\tchr1\t40\t1\n")
>>> data = snap.pp.import_contacts(contact_file, chrom_sizes={"chr1": 1000})
>>> data.n_obs
1
>>> tmp.cleanup()
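When no Genome object is at hand, the chrom_sizes dictionary can be built from a plain two-column listing of chromosome names and lengths. The sketch below assumes a whitespace-separated "name size" layout, as used by UCSC-style chrom.sizes files; the helper `read_chrom_sizes` is hypothetical and not part of snapatac2.

```python
import io


def read_chrom_sizes(handle):
    """Parse 'name<whitespace>size' lines into a {name: size} dictionary."""
    sizes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, size = line.split()[:2]
        sizes[name] = int(size)
    return sizes


# In-memory stand-in for a chrom.sizes file, reusing the sizes from the
# parameter description above.
example = io.StringIO("chr1\t2393\nchr2\t2344\n")
chrom_sizes = read_chrom_sizes(example)
```

The resulting dictionary has the same shape as the one shown for the chrom_sizes parameter and could be passed to import_contacts directly.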