snapatac2.pp.harmony

snapatac2.pp.harmony(adata, *, batch, use_rep='X_spectral', use_dims=None, groupby=None, key_added=None, inplace=True, n_jobs=8, **kwargs)

Correct batch effects in an embedding with Harmony.

Use this function after dimensionality reduction and before neighbor-graph construction to align cells across experiments, samples, or other batch covariates. When adata is AnnData-like, the input embedding is read from adata.obsm[use_rep]; when adata is a NumPy array, the array itself is used as the embedding. Additional keyword arguments are passed to harmonypy.run_harmony.

Anti-Patterns

  • Do NOT run Harmony on raw count matrices; provide a reduced embedding such as X_spectral.

  • Do NOT pass batch as a column name when adata is a NumPy array; provide a pandas DataFrame or Series of batch labels instead.

type adata:

AnnData | AnnDataSet | ndarray

param adata:

AnnData-like object with use_rep in .obsm, AnnDataSet-like object, or a NumPy array of shape n_obs x n_components.

type batch:

str | list[str] | DataFrame | Series

param batch:

Column name or list of column names in .obs that identify batches. When adata is a NumPy array, pass a pandas DataFrame or Series of batch labels with one row per observation instead.

type use_rep:

str

param use_rep:

Key in .obsm containing the input embedding.

type use_dims:

int | list[int] | None

param use_dims:

Dimensions of use_rep or the input array to use. If an integer, use the first use_dims columns. If a list, use those column indices.

type groupby:

str | list[str] | None

param groupby:

Column name or per-cell group labels used to split cells into groups; Harmony is run independently within each group. If None, all cells are corrected together.

type key_added:

str | None

param key_added:

Key used to store the corrected embedding. If None, store it in .obsm[use_rep + "_harmony"].

type inplace:

bool

param inplace:

If True and adata is AnnData-like, store the corrected embedding in .obsm. Ignored for NumPy input.

type n_jobs:

int

param n_jobs:

Number of worker processes used when groupby is specified.

param kwargs:

Additional arguments passed to harmonypy.run_harmony().

returns:

Corrected embedding of shape n_obs x n_selected_components when inplace=False or when adata is a NumPy array. When inplace=True and adata is AnnData-like, the result is stored in .obsm and None is returned.

rtype:

ndarray | None
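
As an illustration of the use_dims and key_added parameters described above, the following plain-NumPy sketch mirrors the documented rules. It is not the library's implementation: select_dims and resolve_key are hypothetical helper names introduced only for this example, and the behavior shown is inferred from the parameter descriptions.

```python
import numpy as np


def select_dims(mat, use_dims=None):
    """Hypothetical helper mirroring the documented use_dims rule."""
    if use_dims is None:
        return mat  # keep every component
    if isinstance(use_dims, int):
        return mat[:, :use_dims]  # integer: first `use_dims` columns
    return mat[:, list(use_dims)]  # list: those column indices


def resolve_key(use_rep="X_spectral", key_added=None):
    """Hypothetical helper mirroring the documented key_added default."""
    return use_rep + "_harmony" if key_added is None else key_added


# Toy embedding with 4 observations and 3 components.
embedding = np.arange(12, dtype=float).reshape(4, 3)

first_two = select_dims(embedding, 2)    # columns 0 and 1
picked = select_dims(embedding, [0, 2])  # columns 0 and 2
default_key = resolve_key()              # key used when key_added is None
```

Under these assumptions, the returned array has as many columns as the selected dimensions, which is what n_selected_components refers to in the return shape.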

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import snapatac2 as snap
>>> X = np.array([[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [3.2, 3.0]])
>>> batch = pd.DataFrame({"sample": ["a", "a", "b", "b"]})
>>> corrected = snap.pp.harmony(X, batch=batch, inplace=False, max_iter_harmony=1)
>>> corrected.shape
(4, 2)
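
The groupby option splits cells into groups and corrects each group independently. The sketch below shows only that split-apply-combine pattern and makes no claim about how SnapATAC2 implements it. correct_by_group is a hypothetical helper, and center_per_batch is a deliberately simple stand-in (per-batch mean centering), not the Harmony algorithm.

```python
import numpy as np


def center_per_batch(mat, batch):
    """Stand-in correction: subtract each batch's mean. This is NOT Harmony."""
    batch = np.asarray(batch)
    out = np.asarray(mat, dtype=float).copy()
    for b in np.unique(batch):
        rows = batch == b
        out[rows] -= out[rows].mean(axis=0)
    return out


def correct_by_group(mat, batch, groups, correct=center_per_batch):
    """Hypothetical helper: apply `correct` separately within each group."""
    mat = np.asarray(mat, dtype=float)
    batch = np.asarray(batch)
    groups = np.asarray(groups)
    out = np.empty(mat.shape, dtype=float)
    for g in np.unique(groups):
        rows = groups == g
        # Only cells from group g are visible to the correction step.
        out[rows] = correct(mat[rows], batch[rows])
    return out


# Eight cells: two groups, each containing two batches of two cells.
X = np.array([[0, 0], [2, 2], [10, 10], [12, 12],
              [100, 100], [102, 102], [110, 110], [112, 112]], dtype=float)
batch = ["a", "a", "b", "b", "a", "a", "b", "b"]
groups = ["g1"] * 4 + ["g2"] * 4

corrected = correct_by_group(X, batch, groups)
```

Row order is preserved, so the result lines up with the original observations even though each group was processed on its own.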