Glossary

TCR

T-cell receptor. A TCR consists of one α and one β chain (or, alternatively, one γ and one δ chain). Each chain consists of a constant and a variable region. The variable region is responsible for antigen recognition, mediated by CDR regions.

Scirpy currently only supports α/β-TCRs. For more information, see the page about our TCR model.

_images/tcr.jpg

Image from Wikimedia commons under the CC BY-3.0 license.

Clonotype

A clonotype designates a collection of T or B cells that descend from a common, antecedent cell, and therefore, bear the same adaptive immune receptors and recognize the same epitopes.

In single-cell RNA-sequencing (scRNA-seq) data, T cells sharing identical complementarity-determining regions 3 (CDR3) nucleotide sequences of both α and β TCR chains make up a clonotype.

Scirpy provides a flexible approach to clonotype definition based on CDR3 sequence identity or similarity. Additionally, it is possible to require clonotypes to have the same V-gene, enforcing the CDR 1 and 2 regions to be the same.

For more details, see the page about our TCR model and the API documentation of scirpy.tl.define_clonotypes().

Clonotype cluster

A higher-order aggregation of clonotypes that have different CDR3 nucleotide sequences, but might recognize the same antigen because they have the same or similar CDR3 amino acid sequence.

See also: scirpy.tl.define_clonotype_clusters().

Private clonotype

A clonotype that is specific for a certain patient.

Public clonotype

A clonotype that is shared across multiple patients, e.g. a clonotype recognizing common viral epitope.

_images/public-private.jpg

Image from [SMR+18] under the CC BY-4.0 license.

Tissue-specific clonotype

A clonotype that only occurs in a certain tissue of a certain patient.

Multi-tissue clonotype

A clonotype that occurs in multiple tissues of the same patient.

Epitope

The part of an antigen that is recognized by the TCR (or B-cell receptor, or antibody).

CDR3

Complementary-determining region 3. See CDR.

CDR

Complementary-determining region. The diversity and, therefore, antigen-specificity of TCRs is predominanly determined by three hypervariable loops (CDR1, CDR2, and CDR3) on each of the α- and β receptor arms.

CDR1 and CDR2 are fully encoded in germline V genes. In contrast, the CDR3 loops are assembled from V and J segments (TCR-α) and V, D and J segments (TCR-β) and comprise random additions and deletions at the junction sites (see also V(D)J). Thus, CDR3 regions make up a large part of the TCR variability and are therefore thought to be particularly important for antigen specificity (reviewed in [AHS15]).

_images/tcr_cdr3.png

Image from [AHS15] under the CC BY-NC-SA-3.0 license.

V(D)J

The variability of TCR chain sequences originates from the genetic recombination of Variable, Diversity and Joining gene segments. The TCR-α chain gets assembled from V and J loci only, the TCR-β chain from all three V, D and J loci.

As an example, the figure below shows how a TCR-α chain is assembed from the tra locus. V to J recombination joins one of many TRAV segments to one of many TRAJ segments. Next, introns are spliced out, resulting in a TCR-α chain transcript with V, J and C segments directly next to each other (reviewed in [AHS15]).

_images/vdj.png

Image from [AHS15] under the CC BY-NC-SA-3.0 license.

Dual TCR

TCRs with more than one pair of α- and β chains. While this was previously thought to be impossible due to the mechanism of allelic exclusion ([BSB10]), there is an increasing amound of evidence for a bona fide dual-TCR population ([SB19], [JPG10]).

For more information on how Scirpy handles dual TCRs, see the page about our TCR model.

Multichain-cell

Cells with more than two α- and β chains that do not fit into the Dual TCR model. These are usually rare and could be explained by doublets/multiplets, i.e. two ore more cells that were captured in the same droplet.

_images/multichain.png

(a) UMAP plot of 96,000 cells from [WMdA+20] with at least one detected CDR3 sequence with multichain-cells (n=474) highlighted in green. (b) Comparison of detected reads per cell in multichain-cells and other cells. Multichain cells comprised significantly more reads per cell (p = 9.45 × 10−251, Wilcoxon-Mann-Whitney-test), supporting the hypothesis that (most of) multichain cells are technical artifacts arising from cell-multiplets ([IKK+16]).

Orphan chain

A TCR chain is called orphan, if its corresponding counterpart has not been detected. For instance, if a cell has only a TCR-α chain, but no TCR-β chain, the cell will be flagged as “Orphan alpha”.

Orphan chains are most likley the effect of stochastic dropouts due to sequencing inefficiencies.

See also scirpy.tl.chain_pairing().

UMI

Unique molecular identifier. Some single-cell RNA-seq protocols label each RNA with a unique barcode prior to PCR-amplification to mitigate PCR bias. With these protocols, UMI-counts replace the read-counts generally used with RNA-seq.

productive chain

Productive chains are TCR chains with a CDR3 sequence that produces a functional peptide. Scirpy relies on the preprocessing tools (e.g. CellRanger or TraCeR) for flagging non-productive chains. Typically chains are flagged as non-productive if they contain a stop codon or are not within the reading frame.