scirpy.tl.define_clonotypes

scirpy.tl.define_clonotypes(adata, *, key_added='clonotype', **kwargs)

Define clonotypes based on CDR3 nucleic acid sequence identity.

As opposed to define_clonotype_clusters() which employs a more flexible definition of clonotype clusters, this function stringently defines clonotypes based on nucleic acid sequence identity. Technically, this function is an alias to define_clonotype_clusters() with different default parameters.

Requires running scirpy.pp.tcr_neighbors() with sequence='nt' and metric='identity' first (these are the default parameters).
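The call order described above can be sketched as follows. This is a minimal illustration, not an official recipe: the wrapper name annotate_clonotypes is hypothetical, adata is assumed to be an AnnData object that already holds TCR data, and only the parameters named on this page are used.

```python
def annotate_clonotypes(adata, key_added="clonotype"):
    """Hypothetical wrapper: build the neighborhood graph, then define clonotypes."""
    # Imported lazily so the sketch can be loaded without scirpy installed.
    import scirpy as ir

    # Prerequisite named in the text: nucleotide sequences with the identity metric.
    ir.pp.tcr_neighbors(adata, sequence="nt", metric="identity")

    # With the default inplace behaviour, this adds the columns
    # `{key_added}` and `{key_added}_size` to adata.obs.
    ir.tl.define_clonotypes(adata, key_added=key_added)

    return adata.obs[[key_added, f"{key_added}_size"]]
```

Calling annotate_clonotypes(adata) would then return the two new columns for inspection.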

Parameters

adata

Annotated data matrix

key_added

Base name of the columns added to adata.obs if inplace is True. Creates the columns {key_added} and {key_added}_size.

same_v_gene

Requires all cells of a clonotype to share the same V-genes. This is useful because the CDR1 and CDR2 regions are fully encoded in this gene. See CDR for more details.

Possible values are

False - Ignore V-genes during clonotype definition.

"primary_only" - Only the V-genes of the primary pair of alpha and beta chains need to match.

"all" - All V-genes of all sequences need to match.

Chains with no detected V-gene will be treated like a separate “gene” with the name “None”.
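The effect of requiring matching V-genes can be illustrated with a simplified sketch. It considers a single chain per cell, so it does not distinguish "primary_only" from "all", and it is not scirpy's implementation. The helper grouping_key and the example sequences are made up for illustration.

```python
def grouping_key(cdr3_nt, v_gene, same_v_gene=False):
    """Hypothetical helper: the key under which a single chain is grouped."""
    if not same_v_gene:
        # V-gene ignored: only the CDR3 nucleotide sequence matters.
        return (cdr3_nt,)
    # A missing V-gene is treated as a separate "gene" named "None".
    return (cdr3_nt, "None" if v_gene is None else v_gene)


# Four cells with identical (made-up) CDR3 sequences but differing V-genes.
cells = [
    ("TGTGCC", "TRBV7-2"),
    ("TGTGCC", "TRBV7-2"),
    ("TGTGCC", "TRBV5-1"),
    ("TGTGCC", None),
]

ignore_v = {grouping_key(seq, v, same_v_gene=False) for seq, v in cells}
require_v = {grouping_key(seq, v, same_v_gene=True) for seq, v in cells}

print(len(ignore_v), len(require_v))  # 1 group without, 3 groups with V-gene matching
```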

partitions

How to find the graph partitions that define a clonotype. Possible values are "leiden", which uses the Leiden algorithm, and "connected", which finds fully connected sub-graphs.

The difference is that the Leiden algorithm further divides fully connected sub-graphs into highly connected modules.
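As a rough illustration of the "connected" option, the sketch below labels each cell by the connected sub-graph it belongs to, using a breadth-first search over a toy edge list. It assumes that the sub-graphs meant here are connected components, and it is not scirpy's implementation. The function name connected_partitions and the edge list are hypothetical. The Leiden option is not shown, since it depends on an external library.

```python
from collections import deque


def connected_partitions(n_nodes, edges):
    """Assign one label per node; nodes in the same connected sub-graph share a label."""
    adjacency = {node: set() for node in range(n_nodes)}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    labels = [None] * n_nodes
    next_label = 0
    for start in range(n_nodes):
        if labels[start] is not None:
            continue  # already reached from an earlier start node
        labels[start] = next_label
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if labels[neighbor] is None:
                    labels[neighbor] = next_label
                    queue.append(neighbor)
        next_label += 1
    return labels


# Cells 0-2 are linked, cells 3-4 are linked, cell 5 has no neighbors.
edges = [(0, 1), (1, 2), (3, 4)]
labels = connected_partitions(6, edges)
print(labels)  # [0, 0, 0, 1, 1, 2]
```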

resolution

The resolution parameter for the Leiden algorithm.

n_iterations

The n_iterations parameter for the Leiden algorithm.

neighbors_key

Key under which the neighborhood graph is stored in adata.uns. By default, tries to read from tcr_neighbors_{sequence}_{metric}, e.g. tcr_neighbors_nt_identity.
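The default key pattern can be written out as a short sketch. The helper default_neighbors_key is hypothetical and exists only to show how the pattern expands. It is not part of scirpy.

```python
def default_neighbors_key(sequence="nt", metric="identity"):
    """Hypothetical helper: expand the key pattern described above."""
    return f"tcr_neighbors_{sequence}_{metric}"


print(default_neighbors_key())  # tcr_neighbors_nt_identity
```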

inplace

If True, adds the results to adata.obs; otherwise returns them.

Return type

Optional[Tuple[ndarray, ndarray]]

Returns

clonotype (ndarray): an array containing the clonotype id for each cell.
clonotype_size (ndarray): an array containing the number of cells in the respective clonotype for each cell.
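The relationship between the two returned arrays can be sketched with plain Python lists standing in for the ndarrays. The clonotype ids below are made up, and this only illustrates how a per-cell size follows from per-cell ids. It is not how scirpy computes it.

```python
from collections import Counter

# One (made-up) clonotype id per cell.
clonotype = ["0", "0", "1", "2", "0", "1"]

# Count the cells per clonotype, then look the count up for every cell.
counts = Counter(clonotype)
clonotype_size = [counts[c] for c in clonotype]

print(clonotype_size)  # [3, 3, 2, 1, 3, 2]
```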