Usage principles¶

Import Scirpy as follows:

import scanpy as sc
import scirpy as ir

Workflow¶

Scirpy is an extension to Scanpy and adheres to its workflow principles:

The API is divided into preprocessing (pp), tools (tl), and plotting (pl).

All functions work on AnnData objects.

The AnnData instance is modified in place, unless the function is called with the keyword argument inplace=False.
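The in-place convention can be sketched without Scanpy or Scirpy installed. In the minimal example below, a plain dictionary stands in for AnnData.obs, and flag_high_counts, its threshold parameter, and the n_counts and high_counts keys are hypothetical names that belong to neither library. The sketch also assumes that inplace=False makes a function return its result instead of storing it.

```python
def flag_high_counts(obs, *, threshold=100, inplace=True, key_added="high_counts"):
    """Hypothetical tool: flag cells whose count exceeds `threshold`."""
    result = [count > threshold for count in obs["n_counts"]]
    if inplace:
        # Store the result in the container and return nothing.
        obs[key_added] = result
        return None
    # Leave the container untouched and hand the result back.
    return result


# Toy stand-in for AnnData.obs: one column with three cells.
obs = {"n_counts": [50, 150, 300]}

# inplace=False: the result is returned and `obs` is not modified.
returned = flag_high_counts(obs, inplace=False)
keys_after_returning_call = sorted(obs)

# Default (inplace=True): the result is written into `obs`.
flag_high_counts(obs)
```

After the first call, obs still contains only n_counts; after the second, it also contains high_counts.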

We decided to handle a few minor points differently from Scanpy:

Plotting functions with inexpensive computations (e.g. scirpy.pl.clonal_expansion()) call the corresponding tool (scirpy.tl.clonal_expansion()) on-the-fly and don’t store the results in the AnnData object.

By default, all plotting functions return an Axes object or a list of Axes objects.

Data structure¶

For instructions on how to load data into Scirpy, see Loading adaptive Immune Receptor (IR)-sequencing data with Scirpy.

Scirpy leverages the AnnData data structure, which combines a gene expression matrix (.X), gene-level annotations (.var), and cell-level annotations (.obs) into a single object. AnnData forms the basis for the Scanpy analysis workflow for single-cell transcriptomics data.

Image by F. Alex Wolf.

Scirpy adds the following IR-related columns to AnnData.obs:

has_ir: True for all cells with an adaptive immune receptor

IR_VJ_1_<attr>/IR_VJ_2_<attr>: columns related to the primary and secondary VJ-chain of a receptor (TRA, TRG, IGK, or IGL)

IR_VDJ_1_<attr>/IR_VDJ_2_<attr>: columns related to the primary and secondary VDJ-chain of a receptor (TRB, TRD, or IGH)

Where <attr> is any of:

locus: The IMGT locus name of the chain (TRA, IGH, etc.)

c_gene, v_gene, d_gene, j_gene: The gene symbols of the respective genes

cdr3 and cdr3_nt: The amino acid and nucleotide sequences of the CDR3 regions

junction_ins: The number of nucleotides inserted in the VD + DJ junctions or the VJ junction, respectively.
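The naming scheme above can be spelled out programmatically. The following sketch uses only the Python standard library to enumerate the column names implied by the pattern IR_<chain type>_<rank>_<attr>. It covers only the attributes listed in this section, so the actual set of columns may be larger, and the helper names ir_column_names and LOCUS_TO_CHAIN_TYPE are illustrative and not part of Scirpy.

```python
from itertools import product

# Chain slots described above: VJ and VDJ chains, each with a
# primary (1) and a secondary (2) entry.
CHAIN_TYPES = ("VJ", "VDJ")
CHAIN_RANKS = (1, 2)

# Only the attributes named in this section; the real list may be longer.
ATTRIBUTES = (
    "locus",
    "c_gene",
    "v_gene",
    "d_gene",
    "j_gene",
    "cdr3",
    "cdr3_nt",
    "junction_ins",
)

# Loci grouped by chain type, as stated in the column descriptions.
LOCUS_TO_CHAIN_TYPE = {
    "TRA": "VJ", "TRG": "VJ", "IGK": "VJ", "IGL": "VJ",
    "TRB": "VDJ", "TRD": "VDJ", "IGH": "VDJ",
}


def ir_column_names():
    """Return every IR_<chain type>_<rank>_<attr> name for the values above."""
    return [
        f"IR_{chain}_{rank}_{attr}"
        for chain, rank, attr in product(CHAIN_TYPES, CHAIN_RANKS, ATTRIBUTES)
    ]


columns = ir_column_names()
print(len(columns), "columns, for example:", columns[0])
```

With a real AnnData object, names produced this way could be used to select the corresponding columns of AnnData.obs, provided they exist in the installed Scirpy version.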