snapatac2.pp.knn

snapatac2.pp.knn(adata, n_neighbors=50, use_dims=None, use_rep='X_spectral', method='kdtree', inplace=True, random_state=0)

Build a Euclidean k-nearest-neighbor graph for observations.

Use this function after dimensionality reduction to construct the graph used by downstream clustering, embedding, or other graph-based analysis. When adata is an AnnData-like object, the input matrix is read from adata.obsm[use_rep]. When adata is a NumPy array, the array itself is searched and the graph is always returned, regardless of inplace.
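To make the shape of the result concrete, the following is a minimal sketch of an exact Euclidean k-nearest-neighbor graph built with SciPy. It is not the snapatac2 implementation: knn_graph_sketch is a hypothetical helper, and the choice to exclude each point from its own neighbor list is an assumption of this sketch rather than documented behavior.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.spatial import cKDTree


def knn_graph_sketch(X, n_neighbors):
    """Hypothetical helper: exact Euclidean kNN distances as a CSR matrix.

    Assumes the rows of X are distinct, so each point is its own closest hit.
    """
    X = np.asarray(X, dtype=float)
    n_obs = X.shape[0]
    tree = cKDTree(X)
    # Ask for one extra neighbor because every point matches itself at distance 0.
    dist, idx = tree.query(X, k=n_neighbors + 1)
    # Drop the self match in the first column (an assumption of this sketch).
    dist, idx = dist[:, 1:], idx[:, 1:]
    rows = np.repeat(np.arange(n_obs), n_neighbors)
    return csr_matrix((dist.ravel(), (rows, idx.ravel())), shape=(n_obs, n_obs))


X = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]])
graph = knn_graph_sketch(X, n_neighbors=2)
```

Each row of graph stores the distances from one observation to its two nearest neighbors, so the matrix is sparse and, in general, not symmetric.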

Anti-Patterns

  • Do NOT pass a raw count matrix unless Euclidean distances on counts are the intended analysis; use a reduced representation such as X_spectral.

  • Do NOT expect random_state to make method="hora" deterministic; the HNSW backend currently ignores this value.
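The first point can be illustrated with a toy calculation. The counts below are made up, and unit-length scaling is used only as a simple stand-in for a proper reduced representation such as X_spectral. The sketch shows one way Euclidean distance on raw counts can mislead: sequencing depth can outweigh the profile itself.

```python
import numpy as np

# Made-up counts: cells a and b share one profile at different depths,
# while cell c has a different profile at the same depth as a.
a = np.array([10.0, 0.0, 10.0, 0.0])
b = np.array([100.0, 0.0, 100.0, 0.0])
c = np.array([0.0, 10.0, 0.0, 10.0])


def euclid(u, v):
    """Euclidean distance between two vectors."""
    return float(np.linalg.norm(u - v))


def unit(v):
    """Scale a vector to unit length (stand-in for a real embedding)."""
    return v / np.linalg.norm(v)


# On raw counts, a is closer to c (different profile) than to b (same profile).
raw_ab, raw_ac = euclid(a, b), euclid(a, c)

# After scaling, the shared profile of a and b makes them coincide.
norm_ab, norm_ac = euclid(unit(a), unit(b)), euclid(unit(a), unit(c))
```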

type adata:

AnnData | AnnDataSet | ndarray

param adata:

AnnData-like object with use_rep in .obsm, AnnDataSet-like object, or a NumPy array of shape n_obs x n_features.

type n_neighbors:

int

param n_neighbors:

Number of nearest neighbors to store for each observation.

type use_dims:

int | list[int] | None

param use_dims:

Dimensions of use_rep or the input array to use. If None, use all columns. If an integer, use the first use_dims columns. If a list, use those column indices.

type use_rep:

str

param use_rep:

Key in .obsm containing the representation to search.

type method:

Literal['kdtree', 'hora', 'pynndescent']

param method:

Neighbor-search backend. Use "kdtree" for exact search, "hora" for approximate HNSW search, or "pynndescent" for approximate NNDescent.

type inplace:

bool

param inplace:

If True and adata is AnnData-like, store the graph in .obsp["distances"]. Ignored for NumPy input.

type random_state:

int

param random_state:

Random seed used only by method="pynndescent".

returns:

Sparse distance matrix of shape n_obs x n_obs when inplace=False or when adata is a NumPy array. Returns None when inplace=True and stores the matrix in .obsp["distances"].

rtype:

csr_matrix | None
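A short sketch of how a sparse distance matrix of this kind can be read. The matrix here is hand-built with made-up values rather than produced by snap.pp.knn, neighbors_of is a hypothetical helper, and the layout (row i holds the stored neighbors of observation i) is an assumption of this sketch.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hand-built stand-in for a returned graph; zeros are not stored.
dist = csr_matrix(
    np.array(
        [
            [0.0, 0.1, 0.0],
            [0.1, 0.0, 2.0],
            [0.0, 2.0, 0.0],
        ]
    )
)


def neighbors_of(graph, i):
    """Hypothetical helper: neighbor indices and distances stored in row i."""
    start, end = graph.indptr[i], graph.indptr[i + 1]
    return graph.indices[start:end], graph.data[start:end]


idx, d = neighbors_of(dist, 1)
```

Reading rows through indptr avoids converting the graph to a dense n_obs x n_obs array, which matters once n_obs is large.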

Examples

>>> import numpy as np
>>> import snapatac2 as snap
>>> X = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]])
>>> graph = snap.pp.knn(X, n_neighbors=2, method="kdtree")
>>> graph.shape
(4, 4)
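The column-selection rules described for use_dims can be mirrored in plain NumPy. This is only an illustration of those rules: select_dims is a hypothetical helper, not part of snapatac2, and treating None as "keep every column" is an assumption.

```python
import numpy as np


def select_dims(X, use_dims=None):
    """Hypothetical helper mirroring the documented use_dims rules."""
    X = np.asarray(X)
    if use_dims is None:
        # Assumption: None keeps every column.
        return X
    if isinstance(use_dims, int):
        # Integer: keep the first use_dims columns.
        return X[:, :use_dims]
    # List: keep exactly the listed column indices.
    return X[:, list(use_dims)]


X = np.arange(12.0).reshape(3, 4)
first_two = select_dims(X, 2)
picked = select_dims(X, [0, 3])
```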