scirpy.ir_dist.metrics.LevenshteinDistanceCalculator
- class scirpy.ir_dist.metrics.LevenshteinDistanceCalculator(cutoff=None, **kwargs)
Calculates the Levenshtein edit-distance between sequences.
The edit distance is the minimum number of single-character deletions, insertions and substitutions needed to turn one sequence into the other.
This class relies on Python-levenshtein to calculate the distances.
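As a rough illustration of the quantity being computed, the following is a minimal pure-Python sketch of the edit distance using the standard two-row dynamic programme. It is not the implementation used by this class (which delegates to Python-levenshtein); the function name `levenshtein` and the example sequences are chosen here purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (illustrative sketch only)."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    # previous[j] holds the distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute, free on a match
                )
            )
        previous = current
    return previous[-1]


# Two made-up CDR3-like sequences that differ at a single position.
d = levenshtein("CASSLGTDTQYF", "CASSLGQDTQYF")
```

Here `d` is 1, so this pair would be retained under a cutoff of 2.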
- Choosing a cutoff:
Each unit of distance corresponds to one deletion, insertion or substitution. While empirical data is lacking, it seems unlikely that CDR3 sequences differing by more than two such modifications still recognize the same antigen.
- Parameters
- cutoff : Optional[int] (default: None)
Will eliminate distances > cutoff to make efficient use of sparse matrices. The default cutoff is 2.
- n_jobs
Number of jobs to use for the pairwise distance calculation. If None, use all jobs (only for ParallelDistanceCalculators).
- block_size
The width of a block of the matrix that will be delegated to a worker process. The block contains block_size ** 2 elements.
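To make the interplay of cutoff and block_size more concrete, here is a hedged sketch of how a pairwise distance matrix might be split into square blocks, with each block keeping only distances within the cutoff. The helpers `iter_blocks` and `block_distances`, the dictionary standing in for a sparse matrix, and the assumption that only the upper triangle is computed are all illustrative; the library's actual scheduling and storage may differ. A simple Hamming distance is used as a stand-in metric so the example has no dependencies.

```python
from itertools import product


def iter_blocks(n, block_size):
    """Yield (rows, cols) index ranges covering the upper triangle of an n x n matrix."""
    starts = range(0, n, block_size)
    for row_start, col_start in product(starts, repeat=2):
        if col_start < row_start:
            continue  # lower-triangle blocks are redundant for a symmetric distance
        yield (
            range(row_start, min(row_start + block_size, n)),
            range(col_start, min(col_start + block_size, n)),
        )


def block_distances(seqs, rows, cols, dist, cutoff):
    """Compute one block and keep only the entries with distance <= cutoff."""
    kept = {}
    for i in rows:
        for j in cols:
            if i > j:
                continue  # skip the lower half of diagonal blocks
            d = dist(seqs[i], seqs[j])
            if d <= cutoff:
                kept[(i, j)] = d
    return kept


def hamming(a, b):
    """Stand-in metric for equal-length strings (not the Levenshtein distance)."""
    return sum(x != y for x, y in zip(a, b))


seqs = ["AAA", "AAB", "ABB", "BBB", "CCC"]
result = {}
# In a parallel setting, each block could be handed to a separate worker.
for rows, cols in iter_blocks(len(seqs), block_size=2):
    result.update(block_distances(seqs, rows, cols, hamming, cutoff=1))
```

With `block_size=2` each full block holds at most 2 ** 2 = 4 entries, and pairs beyond the cutoff, such as `(0, 2)`, are never stored.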
Attributes
The sparse matrix dtype.
Methods
calc_dist_mat(seqs[, seqs2]): Calculate the distance matrix.
squarify(triangular_matrix): Mirror a triangular matrix at the diagonal to make it a square matrix.
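The mirroring step described for squarify can be sketched on a plain list of lists. This is only an illustration of the idea, assuming an upper triangular input; the function name `mirror_upper_triangle` is hypothetical, and the actual method presumably operates on sparse matrices rather than nested lists.

```python
def mirror_upper_triangle(triangular):
    """Return a symmetric copy of an upper triangular matrix (list of lists)."""
    n = len(triangular)
    square = [row[:] for row in triangular]  # copy so the input is left untouched
    for i in range(n):
        for j in range(i + 1, n):
            square[j][i] = triangular[i][j]  # reflect entry (i, j) to (j, i)
    return square


upper = [
    [0, 1, 3],
    [0, 0, 2],
    [0, 0, 0],
]
full = mirror_upper_triangle(upper)
```

After the call, `full` is symmetric while `upper` is unchanged.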