API

Import scirpy together with scanpy as follows:

import scanpy as sc
import scirpy as ir

For consistency, the scirpy API tries to follow the scanpy API as closely as possible.

Input/Output: io

The following functions import V(D)J information from various formats.

read_h5ad(filename[, backed, as_sparse, …])

Read .h5ad-formatted hdf5 file.

read_10x_vdj(path[, filtered])

Read IR data from 10x Genomics cell-ranger output.

read_tracer(path)

Read data from TraCeR ([SLonnbergP+16]).

read_bracer(path)

Read data from BraCeR ([LEM+18]).

read_airr(path)

Read AIRR-compliant data.

To convert your own formats into the scirpy data structure, we recommend first building a list of IrCell objects and then converting them into an AnnData object using from_ir_objs(). For more details, see the Data loading tutorial.

IrCell(cell_id, *[, multi_chain])

Data structure for a Cell with immune receptors.

IrChain(locus, *[, cdr3, cdr3_nt, expr, …])

Data structure for an immune cell receptor chain.

from_ir_objs(ir_objs)

Convert a collection of IrCell objects to an AnnData.

to_ir_objs(adata)

Convert an adata object with IR information back to a list of IrCells.

Preprocessing: pp

merge_with_ir(adata, adata_ir[, on])

Merge adaptive immune receptor (IR) data with transcriptomics data into a single AnnData object.

ir_neighbors(adata, *[, metric, cutoff, …])

Construct a neighborhood graph based on CDR3 sequence similarity.

Tools: tl

Tools add an interpretable annotation to the AnnData object, which can usually be visualized with a corresponding plotting function.

Generic

group_abundance(adata, groupby[, …])

Summarizes the number/fraction of cells of a certain category by a certain group.
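
As an illustration of what counting cells "of a certain category by a certain group" means, here is a minimal pure-Python sketch. It does not use scirpy; the helper `count_by_group` and the example annotations are hypothetical stand-ins for per-cell columns that would normally live in `adata.obs`.

```python
from collections import Counter, defaultdict

# Hypothetical per-cell annotations (stand-ins for columns of adata.obs).
cells = [
    {"sample": "A", "chain_pairing": "single pair"},
    {"sample": "A", "chain_pairing": "single pair"},
    {"sample": "A", "chain_pairing": "orphan VJ"},
    {"sample": "B", "chain_pairing": "single pair"},
]


def count_by_group(cells, groupby, target, fraction=False):
    """Count cells per `target` category within each `groupby` group.

    With fraction=True, counts are divided by the size of the group.
    """
    counts = defaultdict(Counter)
    for cell in cells:
        counts[cell[groupby]][cell[target]] += 1
    if not fraction:
        return {g: dict(c) for g, c in counts.items()}
    return {
        g: {k: v / sum(c.values()) for k, v in c.items()}
        for g, c in counts.items()
    }


abundance = count_by_group(cells, "sample", "chain_pairing")
fractions = count_by_group(cells, "sample", "chain_pairing", fraction=True)
```

The nested dictionary maps each group to its per-category counts, which is the kind of table a stacked bar plot would display.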

Quality control

chain_qc(adata, *[, inplace, key_added])

Perform quality control based on the receptor-chain pairing configuration.

Define and visualize clonotypes

define_clonotypes(adata, *[, key_added])

Define clonotypes based on CDR3 nucleic acid sequence identity.
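
The idea of grouping cells by sequence identity can be sketched in a few lines of plain Python. This is a simplified illustration under the assumption of one CDR3 nucleotide sequence per cell; the helper `assign_clonotypes` and the cell ids are hypothetical, and the library function may consider more than one chain per cell.

```python
def assign_clonotypes(cdr3_nt_by_cell):
    """Give cells with identical CDR3 nucleotide sequences the same clonotype id.

    Ids are assigned in order of first appearance of each sequence.
    """
    ids = {}
    result = {}
    for cell_id, seq in cdr3_nt_by_cell.items():
        if seq not in ids:
            ids[seq] = f"ct_{len(ids)}"
        result[cell_id] = ids[seq]
    return result


# Hypothetical input: cell id -> CDR3 nucleotide sequence.
clonotypes = assign_clonotypes(
    {
        "cell1": "TGTGCCAGC",
        "cell2": "TGTGCCAGC",
        "cell3": "TGTGCTAGT",
    }
)
```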

define_clonotype_clusters(adata, *[, …])

Define clonotype clusters based on CDR3 distance.
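
One way to think about distance-based clusters is as connected components of a graph in which two sequences are linked when their distance is at most a cutoff. The sketch below illustrates that idea with a small union-find; it is an assumption about the general technique, not a description of the library's implementation. The helper names `cluster_sequences` and `mismatches` are hypothetical.

```python
def mismatches(a, b):
    """Number of differing positions; infinite for sequences of unequal length."""
    if len(a) != len(b):
        return float("inf")
    return sum(x != y for x, y in zip(a, b))


def cluster_sequences(seqs, dist, cutoff):
    """Label sequences so that any two within `cutoff` end up in the same cluster.

    Clusters are the connected components of the "distance <= cutoff" graph,
    so membership is transitive.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if dist(seqs[i], seqs[j]) <= cutoff:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(seqs))]


labels = cluster_sequences(["CASSL", "CASSI", "CATSI", "CQQQQ"], mismatches, cutoff=1)
```

Note that "CASSL" and "CATSI" differ at two positions but still share a cluster, because both are within the cutoff of "CASSI".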

clonotype_convergence(adata, *, key_coarse, …)

Finds evidence for convergent evolution of clonotypes.

clonotype_network(adata, *[, sequence, …])

Computes the layout of the clonotype network for plotting.

clonotype_network_igraph(adata[, basis])

Get an igraph object representing the clonotype network.

Analyse clonal diversity

clonal_expansion(adata, *[, target_col, …])

Adds a column to obs recording which clonotypes are expanded.
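
A per-cell expansion annotation can be derived from clonotype sizes. The following sketch labels each cell with the size of its clonotype and lumps all sizes at or above a threshold into one category. The helper `expansion_labels`, the `clip_at` parameter, and the label format are assumptions made for illustration only.

```python
from collections import Counter


def expansion_labels(clonotypes, clip_at=3):
    """Label each cell by its clonotype size, lumping sizes >= clip_at together."""
    sizes = Counter(clonotypes)
    return [
        f">= {clip_at}" if sizes[ct] >= clip_at else str(sizes[ct])
        for ct in clonotypes
    ]


# Hypothetical clonotype assignment, one entry per cell.
labels = expansion_labels(["ct_0", "ct_0", "ct_0", "ct_1", "ct_2", "ct_2"])
```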

summarize_clonal_expansion(adata, groupby, *)

Summarizes clonal expansion by a grouping variable.

alpha_diversity(adata, groupby, *[, …])

Computes the alpha diversity of clonotypes within a group.
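
Alpha diversity can be quantified in several ways. A common choice is the normalized Shannon entropy of clonotype frequencies, sketched below in plain Python. Which measure the function uses is not stated in this listing, so treat this as one possible example; the helper `normalized_shannon_entropy` is hypothetical.

```python
import math
from collections import Counter


def normalized_shannon_entropy(clonotypes):
    """Shannon entropy of clonotype frequencies, scaled to the range [0, 1].

    Returns 0.0 when there is at most one distinct clonotype.
    """
    counts = Counter(clonotypes)
    if len(counts) <= 1:
        return 0.0
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(len(counts))


even = normalized_shannon_entropy(["ct_0", "ct_1", "ct_2", "ct_3"])
skewed = normalized_shannon_entropy(["ct_0"] * 9 + ["ct_1"])
single = normalized_shannon_entropy(["ct_0"] * 5)
```

A perfectly even repertoire scores close to 1, a repertoire dominated by one clonotype scores lower, and a single clonotype scores 0.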

repertoire_overlap(adata, groupby, *[, …])

Compute distance between cell groups based on clonotype overlap.
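
To make "distance based on clonotype overlap" concrete, the sketch below computes the Jaccard distance between the sets of clonotypes found in two groups. The Jaccard distance is used here as an assumed example metric; the helper `jaccard_distance` and the sample names are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of clonotype ids."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


# Hypothetical clonotypes observed in two samples.
groups = {
    "sample1": ["ct_0", "ct_1", "ct_2"],
    "sample2": ["ct_1", "ct_2", "ct_3"],
}
distance = jaccard_distance(groups["sample1"], groups["sample2"])
```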

clonotype_imbalance(adata, replicate_col, …)

Aims to find clonotypes that are the most enriched or depleted in a category.

V(D)J gene usage

spectratype(adata[, groupby, combine_fun, …])

Summarizes the distribution of CDR3 region lengths.
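
A spectratype is essentially a histogram of CDR3 lengths. The sketch below tallies sequence lengths with the standard library; the helper `cdr3_length_distribution` is hypothetical, and grouping or combining of chains is left out for brevity.

```python
from collections import Counter


def cdr3_length_distribution(cdr3_seqs):
    """Map each CDR3 length to the number of sequences of that length."""
    return dict(sorted(Counter(len(s) for s in cdr3_seqs).items()))


lengths = cdr3_length_distribution(["CASSL", "CASSLG", "CATSI", "CAS"])
```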

Plotting: pl

Generic

embedding(adata, basis, *[, color, …])

A customized wrapper to the scanpy.pl.embedding() function.

Tools

Each of these plotting functions has a corresponding tool in the scirpy.tl section. Depending on the computational load, tools are either invoked on the fly when the plotting function is called or must be precomputed and stored in AnnData beforehand.

alpha_diversity(adata, groupby, *[, …])

Plot the alpha diversity per group.

clonal_expansion(adata, groupby, *[, …])

Visualize clonal expansion.

group_abundance(adata, groupby[, …])

Plots the number of cells per group, split up by a categorical variable.

spectratype(adata[, cdr3_col, combine_fun, …])

Show the distribution of CDR3 region lengths.

vdj_usage(adata, *[, vdj_cols, …])

Creates a ribbon plot of the most abundant VDJ combinations.

repertoire_overlap(adata, groupby, *[, …])

Visualizes overlap between a pair of samples on a scatter plot or …

clonotype_imbalance(adata, replicate_col, …)

Aims to find clonotypes that are the most enriched or depleted in a category.

clonotype_network(adata, *[, color, …])

Plot the Clonotype network.

Base plotting functions: pl.base

bar(data, *[, ax, stacked, style, …])

Basic plotting function built on top of bar plot in Pandas.

line(data, *[, ax, style, style_kws, fig_kws])

Basic plotting function built on top of line plot in Pandas.

barh(data, *[, ax, style, style_kws, fig_kws])

Basic plotting function built on top of horizontal bar plot in Pandas.

curve(data, *[, ax, curve_layout, shade, …])

Basic plotting function for drawing KDE-smoothed curves.

Plot styling: pl.styling

apply_style_to_axes(ax, style, style_kws)

Apply a predefined style to an axis object.

style_axes(ax[, title, legend_title, xlab, …])

Style an axes object.

Datasets: datasets

wu2020()

Return the dataset from [WMdA+20] as AnnData object.

wu2020_3k()

Return the dataset from [WMdA+20] as AnnData object, downsampled to 3000 TCR-containing cells.

maynard2020()

Return the dataset from [MMR+20] as AnnData object.

Utility functions: util

graph.layout_components(graph[, …])

Compute a graph layout by laying out all connected components individually.

IR distance metrics: ir_dist

sequence_dist(unique_seqs[, unique_seqs2, …])

Calculate a sequence x sequence distance matrix.

DistanceCalculator(cutoff)

Abstract base class for a CDR3-sequence distance calculator.

ParallelDistanceCalculator(cutoff, *[, …])

Abstract base class for a DistanceCalculator that computes distances in parallel.

IdentityDistanceCalculator([cutoff])

Calculates the Identity-distance between CDR3 sequences.

LevenshteinDistanceCalculator([cutoff])

Calculates the Levenshtein edit-distance between sequences.
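
For reference, the Levenshtein edit distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one sequence into another. The sketch below is a textbook dynamic-programming version in plain Python. It is not the calculator class itself and ignores any cutoff; the function name `levenshtein` is chosen for this example.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # delete ca
                    curr[j - 1] + 1,           # insert cb
                    prev[j - 1] + (ca != cb),  # substitute, free if equal
                )
            )
        prev = curr
    return prev[-1]


d = levenshtein("CASSLG", "CASSIG")
```

Only two rows of the table are kept at a time, so memory use grows with the length of the second sequence rather than with the product of both lengths.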

HammingDistanceCalculator([cutoff])

Calculates the Hamming distance between sequences of identical length.

AlignmentDistanceCalculator([cutoff, …])

Calculates distance between sequences based on pairwise sequence alignment.