Usage principles

Import scirpy as follows:

import scanpy as sc
import scirpy as ir

Workflow

Scirpy is an extension to Scanpy and adheres to its workflow principles:

The API is divided into preprocessing (pp), tools (tl), and plotting (pl).

All functions work on AnnData objects.

The AnnData instance is modified in place, unless the function is called with the keyword argument inplace=False.

We decided to handle a few minor points differently to Scanpy:

Plotting functions with inexpensive computations (e.g. scirpy.pl.clonal_expansion()) call the corresponding tool (scirpy.tl.clonal_expansion()) on the fly and don’t store the results in the AnnData object.

By default, all plotting functions return an Axes object or a list of Axes objects.
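The two conventions above (tools write to the object unless inplace=False is passed, and cheap plots compute on the fly without storing anything) can be illustrated with a library-free toy. This is only a sketch: the functions tl_clone_size and pl_clone_size and the dict standing in for AnnData.obs are hypothetical and are not part of scirpy or Scanpy.

```python
from collections import Counter


def tl_clone_size(obs, inplace=True, key_added="clone_size"):
    """Toy 'tool' (hypothetical): number of cells sharing each cell's clone_id."""
    counts = Counter(obs["clone_id"])
    result = [counts[c] for c in obs["clone_id"]]
    if inplace:
        obs[key_added] = result  # store the result on the object, return nothing
        return None
    return result  # leave the object untouched and hand back the result


def pl_clone_size(obs):
    """Toy 'plot' (hypothetical): computes on the fly and stores nothing on obs."""
    sizes = tl_clone_size(obs, inplace=False)
    # Stand-in for drawing on an Axes: (clone size, number of cells) pairs.
    return sorted(Counter(sizes).items())


# A plain dict used as a stand-in for AnnData.obs (assumption for illustration).
obs = {"clone_id": ["c1", "c1", "c2", "c3", "c1"]}

summary = pl_clone_size(obs)      # nothing is written to obs
assert "clone_size" not in obs

tl_clone_size(obs)                # default inplace=True adds a new entry
print(obs["clone_size"])          # [3, 3, 1, 1, 3]
```

With real data, the same pattern would apply to the pp, tl and pl functions operating on an AnnData object.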

Data structure

For instructions on how to load data into scirpy, see Loading adaptive Immune Receptor (IR)-sequencing data with Scirpy.

Scirpy leverages the AnnData data structure which combines a gene expression matrix (.X), gene-level annotations (.var) and cell-level annotations (.obs) into a single object. AnnData forms the basis for the Scanpy analysis workflow for single-cell transcriptomics data.

_images/anndata.svg (image by F. Alex Wolf)

Scirpy adds the following IR-related columns to AnnData.obs:

IR_VJ_1_<attr>/IR_VJ_2_<attr>: columns related to the primary and secondary VJ-chain of a receptor (TRA, TRG, IGK, or IGL)

IR_VDJ_1_<attr>/IR_VDJ_2_<attr>: columns related to the primary and secondary VDJ-chain of a receptor (TRB, TRD, or IGH)

has_ir: True for all cells with an adaptive immune receptor

extra_chains: Contains, encoded as JSON, non-productive chains (if not filtered out) and extra chains that do not fit into the 2 VJ + 2 VDJ chain model. Scirpy does not use this information except for writing it back to AIRR format using scirpy.io.write_airr().

multi_chain: True for all cells with more than two productive VJ chains or more than two productive VDJ chains.

In these column names, <attr> can be any field of the AIRR Rearrangement Schema. The following fields are relevant for Scirpy:

locus: The IMGT locus name of the chain (TRA, IGH, etc.)

c_call, v_call, d_call, j_call: The gene symbols of the respective genes

junction_aa and junction: The amino acid and nucleotide sequences of the CDR3 regions
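The naming scheme described above can be spelled out in a few lines of plain Python. The helper ir_obs_columns is hypothetical (it is not a scirpy function), the dict stands in for one row of AnnData.obs, the sequences are made up for illustration, and the shape of the extra_chains JSON (a list of chain records) is an assumption.

```python
import json

CHAIN_TYPES = ("VJ", "VDJ")  # VJ: TRA, TRG, IGK, IGL; VDJ: TRB, TRD, IGH
CHAIN_SLOTS = (1, 2)         # primary and secondary chain


def ir_obs_columns(attrs):
    """Hypothetical helper: .obs column names for the given AIRR fields."""
    return [
        f"IR_{chain_type}_{slot}_{attr}"
        for chain_type in CHAIN_TYPES
        for slot in CHAIN_SLOTS
        for attr in attrs
    ]


# One made-up row of .obs for a single T cell, written as a plain dict.
cell = {
    "has_ir": True,
    "multi_chain": False,
    "IR_VJ_1_locus": "TRA",
    "IR_VJ_1_junction_aa": "CAVRDSNYQLIW",      # illustrative sequence only
    "IR_VDJ_1_locus": "TRB",
    "IR_VDJ_1_junction_aa": "CASSLGQAYEQYF",    # illustrative sequence only
    # Assumed shape: chains outside the 2 VJ + 2 VDJ model as a JSON list.
    "extra_chains": json.dumps([{"locus": "TRB", "productive": False}]),
}

columns = ir_obs_columns(["locus", "junction_aa"])
print(columns[:2])  # ['IR_VJ_1_locus', 'IR_VJ_1_junction_aa']

# Look up the locus of every chain slot; empty slots come back as None.
loci = {name: cell.get(name) for name in columns if name.endswith("_locus")}

# extra_chains is stored as a string and has to be decoded before use.
extra = json.loads(cell["extra_chains"])
```

Here loci maps the four locus columns to "TRA", None, "TRB" and None, which reflects a cell with one VJ and one VDJ chain.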