snapatac2.pp.scanorama_integrate

snapatac2.pp.scanorama_integrate(adata, *, batch, n_neighbors=20, use_rep='X_spectral', use_dims=None, groupby=None, key_added=None, sigma=15, approx=True, alpha=0.1, batch_size=5000, inplace=True, **kwargs)

Integrate batch-specific embeddings with Scanorama.

Use this function after snap.tl.spectral and before snap.pp.knn to align cells from multiple batches. For AnnData-like input, the function reads the input embedding from adata.obsm[use_rep]; when adata is a NumPy array, the array itself is used as the embedding. The correction is performed by the Scanorama implementation from brianhie/scanorama.

Anti-Patterns

  • Do NOT run Scanorama on raw count matrices; provide a reduced embedding such as X_spectral.

  • Do NOT pass batch as a column name when adata is a NumPy array; provide one label per observation instead.
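The second point can be checked before calling the function. The following is a minimal sketch in plain Python; check_batch_labels is a hypothetical helper written for illustration and is not part of SnapATAC2 or Scanorama. It assumes only what is stated above: array input needs one batch label per observation, not a column name.

```python
from collections import Counter


def check_batch_labels(n_obs, batch):
    """Hypothetical helper: validate batch labels for NumPy-array input."""
    # A bare string would be a column name, which an array cannot resolve.
    if isinstance(batch, str):
        raise TypeError("array input has no .obs; pass one batch label per observation")
    labels = list(batch)
    if len(labels) != n_obs:
        raise ValueError(f"expected {n_obs} batch labels, got {len(labels)}")
    # Report how many observations fall into each batch.
    return Counter(labels)


# Six observations (rows of the embedding), three per batch.
sizes = check_batch_labels(6, ["a", "a", "a", "b", "b", "b"])
```

A check like this fails early with a clear message instead of surfacing as a shape mismatch during integration.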

type adata:

param adata:

AnnData-like object with use_rep in .obsm, AnnDataSet-like object, or a NumPy array of shape n_obs x n_components.

type batch:

param batch:

Column name in .obs that identifies batches, or a list of labels with one entry per observation.

type n_neighbors:

param n_neighbors:

Number of mutual nearest neighbors used by Scanorama.

type use_rep:

param use_rep:

Key in .obsm containing the input embedding.

type use_dims:

param use_dims:

Dimensions of use_rep or the input array to use. If an integer, use the first use_dims columns. If a list, use those column indices.

type groupby:

param groupby:

Column name or labels used to split cells and run Scanorama independently within each group.

type key_added:

param key_added:

Key used to store the corrected embedding. If None, store it in .obsm[use_rep + "_scanorama"].

type sigma:

param sigma:

Width of the Gaussian kernel that Scanorama uses to smooth the correction.

type approx:

param approx:

Whether Scanorama uses approximate nearest-neighbor search.

type alpha:

param alpha:

Minimum alignment score cutoff passed to Scanorama.

type batch_size:

param batch_size:

Batch size Scanorama uses when computing alignment vectors. Mainly relevant for very large datasets, where it bounds memory use.

type inplace:

param inplace:

If True and adata is AnnData-like, store the corrected embedding in .obsm. Ignored for NumPy input.

type kwargs:

param kwargs:

Additional arguments passed to scanorama.assemble().

returns:

Corrected embedding of shape n_obs x n_selected_components when inplace=False or when adata is a NumPy array. When inplace=True and adata is AnnData-like, the result is stored in .obsm and the function returns None.

rtype:

np.ndarray | None
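The use_dims and key_added rules described above can be pictured with a small sketch. It is written in plain Python on a list of rows, not on AnnData or NumPy objects. select_dims and resolve_key are hypothetical helpers that only mirror the documented behaviour and are not the library's code. Treating use_dims=None as "keep all columns" is an assumption.

```python
def select_dims(embedding, use_dims=None):
    """Hypothetical helper mirroring the documented use_dims behaviour."""
    if use_dims is None:
        # Assumption: None keeps every column of the embedding.
        return [list(row) for row in embedding]
    if isinstance(use_dims, int):
        # An integer keeps the first use_dims columns.
        return [list(row[:use_dims]) for row in embedding]
    # A list keeps exactly those column indices, in the given order.
    return [[row[i] for i in use_dims] for row in embedding]


def resolve_key(use_rep="X_spectral", key_added=None):
    """Hypothetical helper: the .obsm key under which the result is stored."""
    return use_rep + "_scanorama" if key_added is None else key_added


embedding = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
first_two = select_dims(embedding, 2)       # first two columns
picked = select_dims(embedding, [0, 3])     # columns 0 and 3
default_key = resolve_key()                 # 'X_spectral_scanorama'
custom_key = resolve_key(key_added="X_integrated")
```

The number of columns kept by the selection is the n_selected_components dimension of the returned embedding.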

See also

spectral

Compute the spectral embedding of the data matrix.

Examples

>>> import snapatac2 as snap
>>> adata = snap.read(snap.datasets.pbmc5k(type='h5ad'), backed=None)
>>> snap.pp.select_features(adata)
>>> snap.tl.spectral(adata)
>>> midpoint = adata.n_obs // 2
>>> adata.obs['batch'] = ['a'] * midpoint + ['b'] * (adata.n_obs - midpoint)
>>> snap.pp.scanorama_integrate(adata, batch='batch')
>>> 'X_spectral_scanorama' in adata.obsm
True
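
As a conceptual aside on groupby, the sketch below shows how cells might be partitioned when integration is run independently within each group. partition_cells is a hypothetical helper in plain Python, shown only to illustrate the split described for the groupby parameter. The group and batch labels are invented, and the sketch does not call SnapATAC2 or Scanorama.

```python
from collections import defaultdict


def partition_cells(groups, batches):
    """Hypothetical helper: map each group to {batch: [cell indices]}."""
    if len(groups) != len(batches):
        raise ValueError("groups and batches need one label per observation")
    plan = defaultdict(lambda: defaultdict(list))
    for idx, (group, batch) in enumerate(zip(groups, batches)):
        plan[group][batch].append(idx)
    # Convert nested defaultdicts to plain dicts for readability.
    return {group: dict(by_batch) for group, by_batch in plan.items()}


# Invented labels: two groups ("T", "B"), each containing cells from batches "a" and "b".
groups = ["T", "T", "B", "T", "B", "B"]
batches = ["a", "b", "a", "a", "b", "b"]
plan = partition_cells(groups, batches)
# Within each group, only that group's batches would be aligned to one another.
```

Inspecting such a partition beforehand shows whether every group actually contains cells from more than one batch.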