snapatac2.pp.mnc_correct

snapatac2.pp.mnc_correct(adata, *, batch, n_neighbors=5, n_clusters=40, n_iter=1, use_rep='X_spectral', use_dims=None, groupby=None, key_added=None, inplace=True, n_jobs=8)

Correct batch effects with centroid-based mutual nearest neighbors.

Use this function after dimensionality reduction and before neighbor-graph construction to align cells across batches. The method clusters each batch, identifies mutual nearest cluster centroids, and projects cells along the resulting correction vectors.
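The idea of matching cluster centroids across batches can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the cluster assignments are given up front instead of being computed, the helper names (centroid, nearest, mutual_nearest_pairs) are hypothetical, and the correction is simplified to a single averaged shift applied to every cell of the second batch.

```python
from math import dist

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def nearest(query, candidates, k):
    """Indices of the k candidates closest to query (Euclidean distance)."""
    order = sorted(range(len(candidates)), key=lambda j: dist(query, candidates[j]))
    return set(order[:k])

def mutual_nearest_pairs(cents_a, cents_b, k=1):
    """Pairs (i, j) where a_i and b_j are each among the other's k nearest centroids."""
    a_to_b = [nearest(a, cents_b, k) for a in cents_a]
    b_to_a = [nearest(b, cents_a, k) for b in cents_b]
    return [(i, j) for i in range(len(cents_a)) for j in sorted(a_to_b[i]) if i in b_to_a[j]]

# Toy data: cluster label -> cells. Batch b is batch a shifted by (1, 1).
cells_a = {0: [(0.0, 0.5), (0.0, -0.5)], 1: [(5.0, 5.5), (5.0, 4.5)]}
cells_b = {0: [(1.0, 1.5), (1.0, 0.5)], 1: [(6.0, 6.5), (6.0, 5.5)]}

centroids_a = [centroid(cells_a[c]) for c in sorted(cells_a)]
centroids_b = [centroid(cells_b[c]) for c in sorted(cells_b)]
pairs = mutual_nearest_pairs(centroids_a, centroids_b, k=1)

# Simplification: average the per-pair offsets (a_i - b_j) into one correction vector.
n_dims = len(centroids_a[0])
shift = tuple(
    sum(centroids_a[i][d] - centroids_b[j][d] for i, j in pairs) / len(pairs)
    for d in range(n_dims)
)

# Move every cell of batch b along the correction vector.
corrected_b = {
    c: [tuple(x + s for x, s in zip(cell, shift)) for cell in cells]
    for c, cells in cells_b.items()
}
```

In this toy case each centroid in batch a pairs with its shifted counterpart in batch b, so the averaged shift removes the offset between the batches. In the sketch, k plays the role that n_neighbors plays in the function signature.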

Anti-Patterns

Do NOT run this function on raw count matrices unless distances between raw counts are the intended analysis; use a reduced representation such as X_spectral.
Do NOT pass batch as a column name when adata is a NumPy array; provide one label per observation instead.
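A small pre-flight check can catch the second mistake before the call is made. The helper below is hypothetical and not part of snapatac2; it only encodes the rule stated above, and how the library itself reacts to a mismatched batch argument is not covered here.

```python
def check_batch_labels(n_obs, batch):
    """Hypothetical check for array input: one batch label per observation."""
    if isinstance(batch, str):
        # A bare string would be a column name, which an array cannot resolve.
        raise TypeError("for array input, pass one batch label per observation, not a column name")
    labels = list(batch)
    if len(labels) != n_obs:
        raise ValueError(f"expected {n_obs} batch labels, got {len(labels)}")
    return labels

# Four observations, so four labels are expected.
rows = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [3.2, 3.0]]
labels = check_batch_labels(len(rows), ["a", "a", "b", "b"])
```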

param adata: AnnData-like object with use_rep in .obsm, AnnDataSet-like object, or a NumPy array of shape n_obs x n_components.
param batch: Column name in .obs that identifies batches, or a list of labels with one entry per observation.
param n_neighbors: Number of nearest centroids to inspect when finding mutual nearest neighbors.
param n_clusters: Maximum number of clusters to form in each batch.
param n_iter: Number of correction iterations.
param use_rep: Key in .obsm containing the input embedding.
param use_dims: Dimensions of use_rep or the input array to use. If an integer, use the first use_dims columns. If a list, use those column indices.
param groupby: Column name or labels used to split cells and run correction independently within each group.
param key_added: Key used to store the corrected embedding. If None, store it in .obsm[use_rep + "_mnn"].
param inplace: If True and adata is AnnData-like, store the corrected embedding in .obsm. Ignored for NumPy input.
param n_jobs: Number of worker processes used when groupby is specified.
returns: Corrected embedding of shape n_obs x n_selected_components when inplace=False or when adata is a NumPy array. Returns None when inplace=True and stores the result in .obsm.
rtype: np.ndarray | None
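The use_dims and key_added rules described above can be illustrated with two small helpers. Both are hypothetical and written for illustration only; they are not the library's code. The sketch assumes that use_dims=None keeps every column, which the parameter description does not state explicitly.

```python
def select_dims(rows, use_dims=None):
    """Hypothetical helper mirroring the documented use_dims semantics."""
    if use_dims is None:
        cols = range(len(rows[0]))   # assumption: None keeps all columns
    elif isinstance(use_dims, int):
        cols = range(use_dims)       # integer: the first use_dims columns
    else:
        cols = list(use_dims)        # list: exactly those column indices
    return [[row[c] for c in cols] for row in rows]

def output_key(use_rep="X_spectral", key_added=None):
    """Hypothetical helper mirroring the documented default storage key."""
    return key_added if key_added is not None else use_rep + "_mnn"

embedding = [[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]]
first_two = select_dims(embedding, 2)     # columns 0 and 1
picked = select_dims(embedding, [0, 2])   # columns 0 and 2
default_key = output_key()                # key derived from use_rep
```

For a NumPy array X, the same selections correspond to X[:, :use_dims] for an integer and X[:, use_dims] for a list of indices.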

Examples

>>> import numpy as np
>>> import snapatac2 as snap
>>> X = np.array([[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [3.2, 3.0]])
>>> batch = ["a", "a", "b", "b"]
>>> corrected = snap.pp.mnc_correct(X, batch=batch, n_clusters=2, inplace=False)
>>> corrected.shape
(4, 2)