snapatac2.tl.kmeans

snapatac2.tl.kmeans(adata, n_clusters, n_iterations=-1, random_state=0, use_rep='X_spectral', key_added='kmeans', inplace=True)

Cluster cells with k-means.

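Run this function on a dense embedding. With an annotated data object, the embedding is read from adata.obsm[use_rep] (by default adata.obsm["X_spectral"]). A NumPy array can also be passed directly; in that case set inplace=False so that the labels are returned.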

Anti-Patterns

  • Do NOT pass a raw count matrix unless k-means on counts is intended; use a normalized embedding for typical single-cell workflows.

  • Do NOT rely on n_iterations or random_state to change the result; both are retained in the API but are not currently forwarded to the Rust backend.
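The first anti-pattern can be caught with a quick check before clustering. The sketch below is a heuristic only: `looks_like_raw_counts` is a hypothetical helper that is not part of SnapATAC2, it assumes a dense array, and it assumes that raw counts are non-negative whole numbers.

```python
import numpy as np


def looks_like_raw_counts(X, sample_size=10_000):
    """Heuristic: True if the sampled entries are all non-negative whole numbers.

    Hypothetical helper for illustration; not part of SnapATAC2.
    """
    values = np.asarray(X).ravel()
    if values.size == 0:
        return False
    # Inspect only a prefix so the check stays cheap on large matrices.
    sample = values[:sample_size]
    return bool(np.all(sample >= 0) and np.all(sample == np.floor(sample)))


counts = np.array([[0, 3, 1], [2, 0, 5]])
embedding = np.array([[0.12, -1.3], [0.55, 0.07]])
print(looks_like_raw_counts(counts))     # True
print(looks_like_raw_counts(embedding))  # False
```

A positive result is only a hint that the input may not be an embedding; it does not prove the matrix holds counts.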

param adata:

Annotated data object containing adata.obsm[use_rep], or a numeric matrix with cells as rows.

type adata:

AnnData | AnnDataSet | ndarray

param n_clusters:

Number of clusters to compute.

type n_clusters:

int

param n_iterations:

Reserved for k-means iteration control. It is not currently forwarded to the Rust backend, so changing it has no effect.

type n_iterations:

int

param random_state:

Reserved for initialization control. It is not currently forwarded to the Rust backend, so changing it has no effect.

type random_state:

int

param use_rep:

Key in adata.obsm containing the input embedding.

type use_rep:

str

param key_added:

Key in adata.obs used to store cluster labels.

type key_added:

str

param inplace:

If True, store labels in adata.obs[key_added]; if False, return them.

type inplace:

bool

returns:

If inplace=True, stores categorical labels in adata.obs[key_added] and returns None. If inplace=False, returns a string array of labels.

rtype:

ndarray | None
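When labels are returned with inplace=False, a common next step is to check the cluster sizes. The sketch below uses a made-up string array as a stand-in for the return value; the specific label values are an assumption for illustration, not the function's documented output format.

```python
import numpy as np

# Stand-in for the array returned with inplace=False; the label values here
# are invented for illustration.
labels = np.array(["0", "2", "1", "0", "2", "0"])

# Count how many cells fall in each cluster.
names, sizes = np.unique(labels, return_counts=True)
cluster_sizes = dict(zip(names.tolist(), sizes.tolist()))
print(cluster_sizes)  # {'0': 3, '1': 1, '2': 2}
```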

Examples

>>> import numpy as np
>>> import snapatac2 as snap
>>> X = np.random.default_rng(0).normal(size=(12, 3))
>>> labels = snap.tl.kmeans(X, n_clusters=3, inplace=False)
>>> labels.shape
(12,)
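For intuition about what iteration and seed controls conventionally mean in k-means, here is a minimal NumPy sketch of Lloyd's algorithm. It is illustrative only and is not SnapATAC2's Rust implementation: `lloyd_kmeans` is a hypothetical function, its default cap of 100 iterations is an arbitrary choice, and it returns integer labels rather than strings.

```python
import numpy as np


def lloyd_kmeans(X, n_clusters, n_iterations=100, random_state=0):
    """Plain Lloyd's algorithm; illustrative only, not SnapATAC2's backend."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(random_state)
    # The seed decides which rows become the initial centroids.
    centroids = X[rng.choice(X.shape[0], size=n_clusters, replace=False)]
    labels = np.zeros(X.shape[0], dtype=int)
    for _ in range(n_iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = centroids.copy()
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) > 0:
                updated[k] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break  # converged before reaching the iteration cap
        centroids = updated
    return labels


X = np.random.default_rng(0).normal(size=(12, 3))
labels_a = lloyd_kmeans(X, n_clusters=3, random_state=0)
labels_b = lloyd_kmeans(X, n_clusters=3, random_state=0)
print(np.array_equal(labels_a, labels_b))  # True: same seed, same labels
```

In this sketch the seed fixes the initial centroids and the iteration cap bounds the refinement loop, which is the usual role of such parameters.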