scirpy.pp.ir_neighbors

scirpy.pp.ir_neighbors(adata, *, metric='identity', cutoff=None, receptor_arms='all', dual_ir='primary_only', key_added=None, sequence='nt', inplace=True, n_jobs=None)

Construct a neighborhood graph based on CDR3 sequence similarity.

All pairs of cells with a CDR3 distance <= cutoff receive an edge in the graph. Edges are weighted by the distance.

Parameters

adata : AnnData

annotated data matrix

metric : Union[Literal['identity', 'alignment', 'levenshtein', 'hamming'], DistanceCalculator] (default: 'identity')

You can choose one of the following metrics:

identity – 1 for identical sequences, 0 otherwise. See IdentityDistanceCalculator. This metric implies a cutoff of 0.
levenshtein – Levenshtein edit distance. See LevenshteinDistanceCalculator.
hamming – Hamming distance for CDR3 sequences of equal length. See HammingDistanceCalculator.
alignment – Distance based on pairwise sequence alignments using the BLOSUM62 matrix. This option is incompatible with nucleotide sequences. See AlignmentDistanceCalculator.
any instance of DistanceCalculator.
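
To make the hamming and levenshtein options concrete, the following is a minimal pure-Python sketch of the two textbook distances. It is illustrative only and is not the code behind HammingDistanceCalculator or LevenshteinDistanceCalculator; the function names and the example CDR3 strings are made up for this illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions; defined only for equal lengths."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn `a` into `b` (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical CDR3 strings, used only for illustration.
print(hamming("CASSLGTDTQYF", "CASSLGADTQYF"))     # 1 (one substitution)
print(levenshtein("CASSLGTDTQYF", "CASSLGDTQYF"))  # 1 (one deletion)
```

Unlike Hamming distance, Levenshtein distance also handles sequences of different lengths, which is why only the hamming option is restricted to CDR3 sequences of equal length.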

cutoff : Optional[int] (default: None)

All distances > cutoff will be replaced by 0 and eliminated from the sparse matrix. A sensible cutoff depends on the distance metric; see the documentation of the corresponding distance calculator for guidance. If set to None, the cutoff will be 10 for the alignment metric, and 2 for levenshtein and hamming. For the identity metric, the cutoff is ignored and always set to 0.

Two cells with a distance <= the cutoff will be connected. A cutoff of 0 implies the use of the identity metric.
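
The effect of the cutoff can be sketched in a few lines of plain Python. This is only an illustration of the thresholding rule described above (keep a pair if its distance is <= cutoff, and use the distance as the edge weight); it does not reproduce how the function stores its result. The helper `edges_within_cutoff` and the sequences are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions; defined only for equal lengths."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def edges_within_cutoff(seqs, cutoff):
    """Return {(i, j): distance} for all pairs with distance <= cutoff."""
    edges = {}
    for (i, a), (j, b) in combinations(enumerate(seqs), 2):
        d = hamming(a, b)
        if d <= cutoff:        # pairs above the cutoff get no edge
            edges[(i, j)] = d  # the edge weight is the distance itself
    return edges


# Made-up CDR3 strings, all of length 12, one per cell.
cdr3 = ["CASSLGTDTQYF", "CASSLGADTQYF", "CASSPGADTQYF", "CAWSVQGNTEAF"]

print(edges_within_cutoff(cdr3, cutoff=1))  # {(0, 1): 1, (1, 2): 1}
print(edges_within_cutoff(cdr3, cutoff=2))  # {(0, 1): 1, (0, 2): 2, (1, 2): 1}
```

Raising the cutoff from 1 to 2 adds the edge between cells 0 and 2, while cell 3 stays unconnected because it differs from every other sequence in seven positions.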

receptor_arms:

"TRA" - only consider TRA sequences
"TRB" - only consider TRB sequences
"all" - both TRA and TRB need to match
"any" - either TRA or TRB needs to match

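The four receptor_arms options amount to a simple decision rule on the per-arm distances. The sketch below spells that rule out under the assumption that an arm "matches" when its distance is <= cutoff; `arms_connected` is a hypothetical helper, not part of scirpy.

```python
def arms_connected(tra_dist, trb_dist, cutoff, receptor_arms="all"):
    """Decide whether two cells count as connected, given the distance
    between their TRA sequences and between their TRB sequences."""
    tra_ok = tra_dist <= cutoff  # assumption: "match" means distance <= cutoff
    trb_ok = trb_dist <= cutoff
    if receptor_arms == "TRA":
        return tra_ok
    if receptor_arms == "TRB":
        return trb_ok
    if receptor_arms == "all":
        return tra_ok and trb_ok
    if receptor_arms == "any":
        return tra_ok or trb_ok
    raise ValueError(f"unknown receptor_arms value: {receptor_arms!r}")


# TRA sequences are close (distance 1), TRB sequences are far apart (distance 5).
for mode in ("TRA", "TRB", "all", "any"):
    print(mode, arms_connected(tra_dist=1, trb_dist=5, cutoff=2, receptor_arms=mode))
```

With these example distances, "TRA" and "any" connect the two cells, while "TRB" and "all" do not.
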
dual_ir:

"primary_only" - only consider the most abundant pair of TRA/TRB chains
"any" - consider both pairs of TRA/TRB sequences. Distance must be <= cutoff for any of the chains.
"all" - consider both pairs of TRA/TRB sequences. Distance must be <= cutoff for all of the chains.