scirpy.pp.ir_dist
scirpy.pp.ir_dist(adata, *, metric='identity', cutoff=None, sequence='nt', key_added=None, inplace=True, n_jobs=None)
Computes a sequence-distance metric between all unique VJ CDR3 sequences and between all unique VDJ CDR3 sequences.
This is a required preprocessing step for clonotype definition and clonotype networks.
Calculates the full pairwise distance matrix.
Important
Distances are offset by 1 to allow efficient use of sparse matrices (\(d' = d+1\)).
That means a distance > cutoff is represented as 0, a distance == 0 is represented as 1, a distance == 1 is represented as 2, and so on.
Only distances <= cutoff are returned. Larger distances are eliminated from the sparse matrix.
Distances are non-negative.
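The offset convention described above can be sketched in a few lines of plain Python. This is only an illustration of the encoding, not scirpy's implementation; the helper names encode_distance and decode_distance are hypothetical.

```python
from typing import Optional


def encode_distance(d: int, cutoff: int) -> int:
    """Return the stored value d' = d + 1, or 0 if d exceeds the cutoff."""
    if d < 0:
        raise ValueError("distances must be non-negative")
    return d + 1 if d <= cutoff else 0


def decode_distance(stored: int) -> Optional[int]:
    """Recover the raw distance; None means the pair was eliminated."""
    return stored - 1 if stored > 0 else None


# Raw distances 0..4 with cutoff=2: 0, 1, 2 are kept (offset by 1), 3 and 4 become 0.
encoded = [encode_distance(d, cutoff=2) for d in range(5)]
print(encoded)  # [1, 2, 3, 0, 0]
```

Because eliminated pairs are stored as 0, a sparse matrix can leave them implicit, while identical sequences (raw distance 0) remain explicitly stored as 1.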
- Parameters
- adata : AnnData
  AnnData annotated data matrix
- metric : {'alignment', 'identity', 'levenshtein', 'hamming'} | DistanceCalculator (default: 'identity')
  You can choose one of the following metrics:
  - identity – 1 for identical sequences, 0 otherwise. See IdentityDistanceCalculator. This metric implies a cutoff of 0.
  - levenshtein – Levenshtein edit distance. See LevenshteinDistanceCalculator.
  - hamming – Hamming distance for CDR3 sequences of equal length. See HammingDistanceCalculator.
  - alignment – Distance based on pairwise sequence alignments using the BLOSUM62 matrix. This option is incompatible with nucleotide sequences. See AlignmentDistanceCalculator.
  - any instance of DistanceCalculator.
- cutoff : int | None (default: None)
  All distances > cutoff will be replaced by 0 and eliminated from the sparse matrix. A sensible cutoff depends on the distance metric; you can find information in the corresponding docs. If set to None, the cutoff will be 10 for the alignment metric, and 2 for levenshtein and hamming. For the identity metric, the cutoff is ignored and always set to 0.
- sequence : {'aa', 'nt'} (default: 'nt')
  Compute distances based on amino acid (aa) or nucleotide (nt) sequences.
- key_added : str | None (default: None)
  Dictionary key under which the results will be stored in adata.uns if inplace=True. Defaults to ir_dist_{sequence}_{metric}. If metric is an instance of scirpy.ir_dist.metrics.DistanceCalculator, {metric} defaults to custom.
- inplace : bool (default: True)
  If true, store the result in adata.uns. Otherwise return a dictionary with the results.
- n_jobs : int | None (default: None)
  Number of cores to use for distance calculation. Passed on to scirpy.ir_dist.metrics.DistanceCalculator.
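The default rules for cutoff and key_added described above can be summarised as a small sketch. The helpers resolve_cutoff and resolve_key are hypothetical names used only for illustration; they mirror the documented behaviour and are not taken from scirpy's source.

```python
from typing import Optional

# Default cutoffs per metric, as stated in the cutoff parameter description.
DEFAULT_CUTOFFS = {"alignment": 10, "levenshtein": 2, "hamming": 2}


def resolve_cutoff(metric: str, cutoff: Optional[int] = None) -> int:
    """Return the effective cutoff for a named metric."""
    if metric == "identity":
        return 0  # the identity metric ignores any user-supplied cutoff
    if cutoff is None:
        return DEFAULT_CUTOFFS[metric]
    return cutoff


def resolve_key(sequence: str, metric, key_added: Optional[str] = None) -> str:
    """Return the adata.uns key under which results would be stored."""
    if key_added is not None:
        return key_added
    # Anything that is not a metric name (e.g. a calculator instance) maps to "custom".
    metric_name = metric if isinstance(metric, str) else "custom"
    return f"ir_dist_{sequence}_{metric_name}"


print(resolve_cutoff("alignment"))           # 10
print(resolve_cutoff("identity", cutoff=5))  # 0
print(resolve_key("aa", "levenshtein"))      # ir_dist_aa_levenshtein
```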
- Return type
- Returns
Depending on the value of inplace, either returns nothing or a dictionary with symmetrical, sparse, pairwise distance matrices for all VJ and VDJ sequences.
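To make the shape of such a result concrete, the following sketch builds a symmetric, offset-encoded pairwise Levenshtein matrix over unique sequences, using a plain dictionary of (row, column) entries in place of a real sparse matrix. It is an assumption-laden illustration, not scirpy's implementation: the function names and the example sequences are made up.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (row-by-row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def pairwise_offset_distances(seqs, cutoff):
    """Return (unique sequences, {(i, j): d + 1}) keeping only d <= cutoff."""
    unique = sorted(set(seqs))
    entries = {(i, i): 1 for i in range(len(unique))}  # distance 0 is stored as 1
    for (i, s), (j, t) in combinations(enumerate(unique), 2):
        d = levenshtein(s, t)
        if d <= cutoff:
            entries[(i, j)] = entries[(j, i)] = d + 1  # symmetric entries
    return unique, entries


# Made-up example sequences; duplicates collapse to a single row/column.
seqs = ["CASSL", "CASSF", "CASSLG", "CAWWWWW", "CASSL"]
unique, entries = pairwise_offset_distances(seqs, cutoff=1)
print(unique)
print(sorted(entries.items()))
```

Pairs that are absent from entries correspond to the implicit zeros of a sparse matrix, that is, distances above the cutoff.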