Preprocessing: pp

BAM/Fragment file processing

pp.make_fragment_file(bam_file, output_file)

Convert a BAM file into a sorted fragment file.

pp.import_fragments(fragment_file, ...[, ...])

Import fragment files and compute basic QC metrics.

pp.import_values(input_dir, chrom_sizes, *)

Import base-pair values into an AnnData object.

pp.import_contacts(contact_file, chrom_sizes, *)

Import chromatin contacts into an AnnData object.

pp.call_cells(data, use_rep[, inplace, n_jobs])

Call valid cells from feature counts using the OrdMag algorithm.
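To make the cell-calling step more concrete, the following is a minimal sketch of an order-of-magnitude ("OrdMag"-style) threshold in pure Python. It is not the library's implementation: the function name `ordmag_call` and the parameters `expected_cells`, `quantile`, and `ratio` are assumptions chosen for illustration, and the expected number of cells is supplied by hand here rather than estimated from the data.

```python
def ordmag_call(counts, expected_cells, quantile=0.99, ratio=10):
    """Mark barcodes as cells with an order-of-magnitude heuristic.

    Hypothetical helper for illustration only, not part of the pp module.
    """
    ranked = sorted(counts, reverse=True)
    top = ranked[:expected_cells]
    # Robust maximum: the value at the given quantile of the top barcodes.
    # `top` is sorted in descending order, so a high quantile sits near index 0.
    idx = min(len(top) - 1, int(round((1 - quantile) * (len(top) - 1))))
    robust_max = top[idx]
    # Barcodes within one order of magnitude of the robust maximum are kept.
    threshold = robust_max / ratio
    return [c >= threshold for c in counts]


# Toy per-barcode feature counts: four clear cells, one borderline, three empty.
counts = [1000, 900, 1100, 950, 5, 8, 120, 3]
called = ordmag_call(counts, expected_cells=4)
```

The design choice is that the threshold is relative to the strongest barcodes rather than absolute, so it adapts to sequencing depth.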

Matrix operation

pp.add_tile_matrix(adata, *[, bin_size, ...])

Generate a cell-by-genomic-bin count matrix.

pp.make_peak_matrix(adata, *[, use_rep, ...])

Generate a cell-by-peak count matrix.

pp.make_gene_matrix(adata, gene_anno, *[, ...])

Generate a cell-by-gene activity matrix.

pp.filter_cells(data[, min_counts, ...])

Filter cells by fragment-count and TSS-enrichment QC thresholds.

pp.select_features(adata[, n_features, ...])

Select informative genomic features for downstream analysis.

pp.knn(adata[, n_neighbors, use_dims, ...])

Build a Euclidean k-nearest-neighbor graph for observations.
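As an illustration of what a cell-by-genomic-bin count matrix contains, here is a small pure-Python sketch that bins fragments into fixed-width tiles. It is a conceptual sketch under stated assumptions, not how pp.add_tile_matrix is implemented: the helper `tile_counts` is hypothetical, fragments are given as `(chrom, start, end, barcode)` tuples, and each fragment is counted once in the bin containing its start coordinate, which may differ from the library's counting strategy.

```python
from collections import defaultdict


def tile_counts(fragments, bin_size=500):
    """Count fragments per (barcode, chrom, bin_start).

    Hypothetical helper for illustration only.
    """
    counts = defaultdict(int)
    for chrom, start, end, barcode in fragments:
        # Snap the fragment start down to the left edge of its bin.
        bin_start = (start // bin_size) * bin_size
        counts[(barcode, chrom, bin_start)] += 1
    return dict(counts)


# Toy fragments for two barcodes on one chromosome.
fragments = [
    ("chr1", 10, 180, "AAAC"),
    ("chr1", 450, 620, "AAAC"),
    ("chr1", 510, 700, "AAAC"),
    ("chr1", 20, 200, "TTTG"),
]
tiles = tile_counts(fragments, bin_size=500)
```

Each key corresponds to one nonzero entry of a sparse matrix whose rows are barcodes and whose columns are genomic bins.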

Doublet removal

pp.scrublet(adata[, features, n_comps, ...])

Score ATAC-seq cells for doublet likelihood with Scrublet.

pp.filter_doublets(adata[, ...])

Remove cells classified as doublets.
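The idea behind Scrublet-style doublet scoring can be sketched in a few lines: simulate doublets by summing random pairs of observed profiles, then score each observed cell by the fraction of simulated doublets among its nearest neighbors. The code below is a simplified sketch only. The helpers `simulate_doublets` and `doublet_scores` are hypothetical, and the sketch omits the dimensionality reduction and the doublet-rate weighting that a full implementation would normally apply.

```python
import math
import random


def simulate_doublets(cells, n_doublets, seed=0):
    """Create synthetic doublets by summing two distinct random cells."""
    rng = random.Random(seed)
    doublets = []
    for _ in range(n_doublets):
        a, b = rng.sample(range(len(cells)), 2)
        doublets.append([x + y for x, y in zip(cells[a], cells[b])])
    return doublets


def doublet_scores(cells, doublets, k=3):
    """Fraction of simulated doublets among each cell's k nearest neighbors."""

    def norm(v):
        # Scale each profile to sum to 1 so that depth does not dominate.
        total = sum(v) or 1
        return [x / total for x in v]

    obs = [norm(c) for c in cells]
    sim = [norm(d) for d in doublets]
    labelled = [(p, 0) for p in obs] + [(p, 1) for p in sim]
    scores = []
    for i, p in enumerate(obs):
        # Distances to every other profile; the cell itself is excluded.
        dists = sorted(
            (math.dist(p, q), is_sim)
            for j, (q, is_sim) in enumerate(labelled)
            if j != i
        )
        scores.append(sum(is_sim for _, is_sim in dists[:k]) / k)
    return scores


# Two pure populations plus one profile that looks like a mixture of both.
cells = [[10, 0]] * 4 + [[0, 10]] * 4 + [[5, 5]]
scores = doublet_scores(cells, simulate_doublets(cells, n_doublets=20), k=3)
```

Cells whose score exceeds a chosen cutoff would then be the ones a filtering step removes.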

Data Integration

pp.mnc_correct(adata, *, batch[, ...])

Correct batch effects with centroid-based mutual nearest neighbors.

pp.harmony(adata, *, batch[, use_rep, ...])

Correct batch effects in an embedding with Harmony.

pp.scanorama_integrate(adata, *, batch[, ...])

Integrate batch-specific embeddings with Scanorama.
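To illustrate the mutual-nearest-neighbor idea that underlies centroid-based batch correction, the sketch below finds mutual nearest pairs between two small sets of points (which could stand in for cluster centroids) and shifts the second batch by the mean offset of those pairs. This is a deliberately reduced illustration, not the algorithm used by pp.mnc_correct: the helpers `mutual_nearest_pairs` and `shift_to_reference` are hypothetical, and a single global shift is an assumption made to keep the example short.

```python
import math


def mutual_nearest_pairs(batch_a, batch_b):
    """Return (i, j) pairs where a[i] and b[j] are each other's nearest point."""

    def nearest(p, pts):
        return min(range(len(pts)), key=lambda j: math.dist(p, pts[j]))

    a_to_b = [nearest(p, batch_b) for p in batch_a]
    b_to_a = [nearest(q, batch_a) for q in batch_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


def shift_to_reference(batch_a, batch_b):
    """Shift batch_b by the mean offset of its mutual nearest pairs with batch_a."""
    pairs = mutual_nearest_pairs(batch_a, batch_b)
    if not pairs:
        return [tuple(q) for q in batch_b]
    dims = len(batch_a[0])
    offset = [
        sum(batch_a[i][d] - batch_b[j][d] for i, j in pairs) / len(pairs)
        for d in range(dims)
    ]
    return [tuple(x + o for x, o in zip(q, offset)) for q in batch_b]


# Batch B is batch A displaced by (1, 1), plus one point with no counterpart.
batch_a = [(0, 0), (10, 0)]
batch_b = [(1, 1), (11, 1), (4.0, 8.0)]
corrected_b = shift_to_reference(batch_a, batch_b)
```

Only the mutually matched points determine the offset, so a population present in one batch but not the other does not distort the correction.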