scirpy.pp.ir_neighbors
scirpy.pp.ir_neighbors(adata, *, metric='identity', cutoff=None, receptor_arms='all', dual_ir='primary_only', key_added=None, sequence='nt', inplace=True, n_jobs=None)
Construct a neighborhood graph based on CDR3 sequence similarity.
All cells with a CDR3 distance <= cutoff receive an edge in the graph. Edges are weighted by the distance.
- Parameters
- adata : AnnData
  Annotated data matrix.
- metric : Union[Literal['identity', 'alignment', 'levenshtein', 'hamming'], DistanceCalculator] (default: 'identity')
  You can choose one of the following metrics:
  - identity – 1 for identical sequences, 0 otherwise. See IdentityDistanceCalculator. This metric implies a cutoff of 0.
  - levenshtein – Levenshtein edit distance. See LevenshteinDistanceCalculator.
  - hamming – Hamming distance for CDR3 sequences of equal length. See HammingDistanceCalculator.
  - alignment – Distance based on pairwise sequence alignments using the BLOSUM62 matrix. This option is incompatible with nucleotide sequences. See AlignmentDistanceCalculator.
  - any instance of DistanceCalculator.
- cutoff : Optional[int] (default: None)
  All distances > cutoff will be replaced by 0 and eliminated from the sparse matrix. A sensible cutoff depends on the distance metric; you can find information in the corresponding docs. If set to None, the cutoff will be 10 for the alignment metric, and 2 for levenshtein and hamming. For the identity metric, the cutoff is ignored and always set to 0.
  Two cells with a distance <= the cutoff will be connected. A cutoff of 0 implies the use of the identity metric.
- receptor_arms:
  - "TRA" - only consider TRA sequences
  - "TRB" - only consider TRB sequences
  - "all" - both TRA and TRB need to match
  - "any" - either TRA or TRB needs to match
- dual_ir:
  - "primary_only" - only consider the most abundant pair of TRA/TRB chains
  - "any" - consider both pairs of TRA/TRB sequences. Distance must be below cutoff for any of the chains.
  - "all" - consider both pairs of TRA/TRB sequences. Distance must be below cutoff for all of the chains.
  See also Dual IR
- key_added:
  dict key under which the result will be stored in adata.uns when inplace is True. Defaults to ir_neighbors_{sequence}_{metric}. If metric is an instance of scirpy.ir_dist.DistanceCalculator, {metric} defaults to "custom".
- sequence:
  Use amino acid (aa) or nucleotide (nt) sequences?
- inplace:
  If True, store the results in adata.uns. Otherwise return the results.
- n_jobs:
  Number of cores to use for alignment, levenshtein or hamming distance.
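The interplay of metric, cutoff and receptor_arms described above can be sketched in plain Python. This is an illustration only, not scirpy's implementation: the helper names (resolve_cutoff, hamming, levenshtein, arms_connected) and the example CDR3 sequences are hypothetical, and the sketch assumes that an edge requires a distance <= cutoff, as stated in the cutoff description.

```python
from typing import Optional

# Default cutoffs as quoted in the description of the cutoff parameter.
DEFAULT_CUTOFFS = {"identity": 0, "levenshtein": 2, "hamming": 2, "alignment": 10}


def resolve_cutoff(metric: str, cutoff: Optional[int]) -> int:
    """Hypothetical helper that mirrors the documented default-cutoff rules."""
    if metric == "identity":
        return 0  # the cutoff is ignored for the identity metric
    return DEFAULT_CUTOFFS[metric] if cutoff is None else cutoff


def hamming(a: str, b: str) -> Optional[int]:
    """Number of mismatching positions; None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def arms_connected(tra_dist, trb_dist, cutoff, receptor_arms="all"):
    """Combine per-arm distances following the receptor_arms options."""
    tra_ok = tra_dist is not None and tra_dist <= cutoff
    trb_ok = trb_dist is not None and trb_dist <= cutoff
    if receptor_arms == "TRA":
        return tra_ok
    if receptor_arms == "TRB":
        return trb_ok
    if receptor_arms == "all":
        return tra_ok and trb_ok
    if receptor_arms == "any":
        return tra_ok or trb_ok
    raise ValueError(f"unknown receptor_arms: {receptor_arms!r}")


# Two made-up cells: identical TRA chains, TRB chains that differ in 3 positions.
cutoff = resolve_cutoff("levenshtein", None)
tra = levenshtein("CAVRDSNYQLIW", "CAVRDSNYQLIW")
trb = levenshtein("CASSLGTEAFF", "CASRPGQEAFF")
connected_all = arms_connected(tra, trb, cutoff, "all")
connected_any = arms_connected(tra, trb, cutoff, "any")
print(cutoff, tra, trb, connected_all, connected_any)
```

With the default levenshtein cutoff, the two example cells are connected under receptor_arms="any" (the TRA chains match) but not under "all" (the TRB distance exceeds the cutoff).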
- Return type
  Optional[Tuple[csr_matrix, csr_matrix]]
- Returns
- connectivities : csr_matrix
  weighted adjacency matrix
- dist : csr_matrix
  cell x cell distance matrix with the distances as computed according to metric, offset by 1 to make use of sparse matrices.
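One way to read the offset by 1 is that a true distance of 0 (identical sequences) is stored as 1, so that an unset entry (0) in the sparse matrix can mean "above the cutoff, no edge". The sketch below illustrates that reading with a plain dict standing in for a sparse matrix. The encoding stored = distance + 1 is an assumption made for illustration, and the helper names to_sparse_offset and lookup are hypothetical, not part of scirpy.

```python
def to_sparse_offset(dense, cutoff):
    """Keep entries with distance <= cutoff, stored as distance + 1.

    The result is a dict keyed by (row, col); missing keys play the role
    of the implicit zeros of a sparse matrix.
    """
    sparse = {}
    for i, row in enumerate(dense):
        for j, d in enumerate(row):
            if d <= cutoff:
                sparse[(i, j)] = d + 1
    return sparse


def lookup(sparse, i, j):
    """Return the true distance, or None if the pair was dropped."""
    stored = sparse.get((i, j), 0)
    return None if stored == 0 else stored - 1


# Made-up distances between three cells.
dense = [
    [0, 1, 5],
    [1, 0, 3],
    [5, 3, 0],
]
sparse = to_sparse_offset(dense, cutoff=2)
print(lookup(sparse, 0, 0), lookup(sparse, 0, 1), lookup(sparse, 0, 2))
```

Without the offset, a distance of 0 between identical sequences would be indistinguishable from a pair that was eliminated by the cutoff.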