scirpy.pp.ir_neighbors

scirpy.pp.ir_neighbors(adata, *, metric='identity', cutoff=None, receptor_arms='all', dual_ir='primary_only', key_added=None, sequence='nt', inplace=True, n_jobs=None)

Construct a neighborhood graph based on CDR3 sequence similarity.

All cells with a CDR3 distance <= cutoff receive an edge in the graph. Edges are weighted by the distance.

Parameters
adata : AnnData

annotated data matrix

metric : Union[Literal['identity', 'alignment', 'levenshtein', 'hamming'], DistanceCalculator] (default: 'identity')

The distance metric to use: 'identity', 'alignment', 'levenshtein' or 'hamming', or an instance of scirpy.ir_dist.DistanceCalculator.

cutoff : Optional[int] (default: None)

All distances > cutoff are replaced by 0 and eliminated from the sparse matrix. A sensible cutoff depends on the distance metric; see the documentation of the corresponding metric. If set to None, the cutoff is 10 for the alignment metric and 2 for levenshtein and hamming. For the identity metric, the cutoff is ignored and always set to 0.

Two cells with a distance <= the cutoff will be connected. A cutoff of 0 implies the use of the identity metric.
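The default rules described above can be summarised in a small sketch. The helper name resolve_cutoff and the lookup table are hypothetical and only restate the documented defaults; they are not part of scirpy, and the behaviour for a custom DistanceCalculator is not covered here.

```python
from typing import Optional

# Defaults as stated in the parameter description above.
_DEFAULT_CUTOFFS = {"alignment": 10, "levenshtein": 2, "hamming": 2}


def resolve_cutoff(metric: str, cutoff: Optional[int] = None) -> int:
    """Hypothetical helper: return the cutoff that is in effect for a metric."""
    if metric == "identity":
        # For the identity metric the cutoff is ignored and always 0.
        return 0
    if cutoff is not None:
        # An explicit cutoff takes precedence over the default.
        return cutoff
    if metric not in _DEFAULT_CUTOFFS:
        # Assumption of this sketch: unknown metric names are rejected.
        raise ValueError(f"no default cutoff known for metric {metric!r}")
    return _DEFAULT_CUTOFFS[metric]


print(resolve_cutoff("alignment"))    # default for alignment
print(resolve_cutoff("identity", 5))  # explicit value is ignored
```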

receptor_arms:
  • "TRA" - only consider TRA sequences

  • "TRB" - only consider TRB sequences

  • "all" - both TRA and TRB need to match

  • "any" - either TRA or TRB needs to match
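As an illustration of the four options, the following sketch decides whether two cells are connected, given one distance per receptor arm. The function arms_connected is a hypothetical name introduced here, and the sketch assumes both cells have a TRA and a TRB sequence; how scirpy treats missing chains is not described in this section.

```python
def arms_connected(d_tra: int, d_trb: int, cutoff: int, receptor_arms: str = "all") -> bool:
    """Hypothetical helper: combine per-arm distances into one decision."""
    tra_ok = d_tra <= cutoff  # TRA sequences are within the cutoff
    trb_ok = d_trb <= cutoff  # TRB sequences are within the cutoff
    if receptor_arms == "TRA":
        return tra_ok
    if receptor_arms == "TRB":
        return trb_ok
    if receptor_arms == "all":
        return tra_ok and trb_ok
    if receptor_arms == "any":
        return tra_ok or trb_ok
    raise ValueError(f"unknown receptor_arms value: {receptor_arms!r}")


# TRA sequences are identical, TRB sequences differ by 3 edits, cutoff is 2.
for mode in ("TRA", "TRB", "all", "any"):
    print(mode, arms_connected(0, 3, cutoff=2, receptor_arms=mode))
```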

dual_ir:
  • "primary_only" - only consider the most abundant pair of TRA/TRB chains

  • "any" - consider both pairs of TRA/TRB sequences. The distance must not exceed the cutoff for any of the chains.

  • "all" - consider both pairs of TRA/TRB sequences. The distance must not exceed the cutoff for all of the chains.

See also Dual IR
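A simplified sketch of the three dual_ir modes for a single receptor arm follows. It assumes the distances between the primary chains and between the secondary chains of two cells are already known. The function chains_connected is hypothetical, and the way scirpy actually pairs primary and secondary chains across cells may differ from this reduction.

```python
def chains_connected(primary_dist: int, secondary_dist: int, cutoff: int,
                     dual_ir: str = "primary_only") -> bool:
    """Hypothetical helper: reduce two chain distances according to dual_ir."""
    if dual_ir == "primary_only":
        # Ignore the secondary chain entirely.
        return primary_dist <= cutoff
    if dual_ir == "any":
        # One chain within the cutoff is enough.
        return min(primary_dist, secondary_dist) <= cutoff
    if dual_ir == "all":
        # Every chain has to be within the cutoff.
        return max(primary_dist, secondary_dist) <= cutoff
    raise ValueError(f"unknown dual_ir value: {dual_ir!r}")


# Primary chains differ by 4 edits, secondary chains by 1, cutoff is 2.
for mode in ("primary_only", "any", "all"):
    print(mode, chains_connected(4, 1, cutoff=2, dual_ir=mode))
```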

key_added:

dict key under which the result will be stored in adata.uns when inplace is True. Defaults to ir_neighbors_{sequence}_{metric}.

If metric is an instance of scirpy.ir_dist.DistanceCalculator, {metric} defaults to "custom".
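The default key can be reproduced with a short sketch. The helper default_key is hypothetical; it treats any non-string metric as a custom DistanceCalculator instance, which is an assumption made only to keep the example free of a scirpy import.

```python
def default_key(sequence: str, metric) -> str:
    """Hypothetical helper: build the default adata.uns key."""
    # Assumption: anything that is not a metric name counts as a custom calculator.
    metric_name = metric if isinstance(metric, str) else "custom"
    return f"ir_neighbors_{sequence}_{metric_name}"


print(default_key("nt", "identity"))
print(default_key("aa", object()))  # stand-in for a DistanceCalculator instance
```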

sequence:

Whether to use amino acid (aa) or nucleotide (nt) sequences.

inplace:

If True, store the results in adata.uns. Otherwise return the results.

n_jobs:

Number of cores to use for alignment, levenshtein or hamming distance.

Return type

Optional[Tuple[csr_matrix, csr_matrix]]

Returns

connectivities : csr_matrix

weighted adjacency matrix

dist : csr_matrix

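cell x cell distance matrix with the distances as computed according to metric, offset by 1 to make use of sparse matrices. With this offset, a distance of 0 is stored as 1 and can be told apart from the zeros that mark pairs beyond the cutoff.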
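The offset can be illustrated with a pure-Python sketch that uses a dictionary in place of a csr_matrix. The helpers hamming and sparse_offset_distances are hypothetical, the sequences are made up, and treating sequences of unequal length as unconnected is an assumption of this sketch, not a statement about scirpy. The diagonal is left out for brevity.

```python
from itertools import combinations


def hamming(a: str, b: str):
    """Number of mismatched positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def sparse_offset_distances(seqs, cutoff):
    """Return {(i, j): distance + 1} for all pairs with distance <= cutoff.

    Pairs beyond the cutoff are simply absent, like implicit zeros in a
    sparse matrix.
    """
    stored = {}
    for i, j in combinations(range(len(seqs)), 2):
        d = hamming(seqs[i], seqs[j])
        if d is not None and d <= cutoff:
            stored[(i, j)] = d + 1  # offset so that distance 0 is kept
            stored[(j, i)] = d + 1  # keep the structure symmetric
    return stored


seqs = ["CASSL", "CASSL", "CASSF", "CTTTT"]  # made-up example sequences
stored = sparse_offset_distances(seqs, cutoff=1)

# Subtract 1 again to recover the actual distance of a stored pair.
print(stored[(0, 1)] - 1)  # identical sequences
print(stored[(0, 2)] - 1)  # one mismatch
print((0, 3) in stored)    # beyond the cutoff, so not stored
```

Subtracting 1 from a stored value recovers the distance; an absent entry means the pair lies beyond the cutoff.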