Usage principles

Import Scirpy as follows:

import scanpy as sc
import scirpy as ir

Workflow

Scirpy is an extension to Scanpy and adheres to its workflow principles:

  • The API is divided into preprocessing (pp), tools (tl), and plotting (pl).

  • All functions work on AnnData objects.

  • The AnnData instance is modified in place, unless the function is called with the keyword argument inplace=False.
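The inplace convention from the list above can be sketched with a toy function. Note that `count_chains` and the plain-dict `obs` are hypothetical stand-ins chosen for illustration; they are not part of Scirpy or Scanpy, whose functions operate on AnnData objects.

```python
from typing import Dict, List, Optional


def count_chains(
    obs: Dict[str, list], *, inplace: bool = True, key_added: str = "n_chains"
) -> Optional[List[int]]:
    """Toy stand-in for a `tl`-style function (hypothetical, not a Scirpy API).

    Counts the receptor chains listed for each cell in obs["chains"].
    """
    result = [len(chains) for chains in obs["chains"]]
    if inplace:
        # Default behaviour: store the result on the object and return nothing.
        obs[key_added] = result
        return None
    # With inplace=False the object is left untouched and the result is returned.
    return result


# Made-up annotations for three cells.
obs = {"chains": [["TRA", "TRB"], [], ["IGH"]]}

returned = count_chains(obs, inplace=False)  # obs is not modified here
count_chains(obs)  # obs gains the "n_chains" entry
```

The design choice mirrors the convention described above: the default call annotates the object that was passed in, while inplace=False suits cases where the result should be inspected without changing the object.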

We decided to handle a few minor points differently from Scanpy:

Data structure

For instructions on how to load data into Scirpy, see Loading adaptive Immune Receptor (IR)-sequencing data with Scirpy.

Scirpy leverages the AnnData data structure, which combines a gene expression matrix (.X), gene-level annotations (.var), and cell-level annotations (.obs) into a single object. AnnData forms the basis of the Scanpy analysis workflow for single-cell transcriptomics data.

[Figure: schematic of the AnnData data structure (anndata.svg)]

Image by F. Alex Wolf.

Scirpy adds the following IR-related columns to AnnData.obs:

  • has_ir: True for all cells with an adaptive immune receptor

  • IR_VJ_1_<attr>/IR_VJ_2_<attr>: columns related to the primary and secondary VJ-chain of a receptor (TRA, TRG, IGK, or IGL)

  • IR_VDJ_1_<attr>/IR_VDJ_2_<attr>: columns related to the primary and secondary VDJ-chain of a receptor (TRB, TRD, or IGH)

Where <attr> is any of:

  • locus: The IMGT locus name of the chain (TRA, IGH, etc.)

  • c_gene, v_gene, d_gene, j_gene: The gene symbols of the respective genes

  • cdr3 and cdr3_nt: The amino acid and nucleotide sequences of the CDR3 regions

  • junction_ins: The number of nucleotides inserted in the VD + DJ junctions or the VJ junction, respectively.
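The naming pattern described above can be spelled out by enumerating the column names it implies. This is only a sketch of the pattern as listed here; the helper `ir_column_names` is hypothetical, and the exact set of columns present in a given AnnData.obs may differ between Scirpy versions.

```python
from itertools import product
from typing import List

# Chain types, chain ranks, and attributes as listed in the text above.
CHAIN_TYPES = ("VJ", "VDJ")
CHAIN_INDICES = (1, 2)  # 1 = primary chain, 2 = secondary chain
ATTRIBUTES = (
    "locus",
    "c_gene",
    "v_gene",
    "d_gene",
    "j_gene",
    "cdr3",
    "cdr3_nt",
    "junction_ins",
)


def ir_column_names() -> List[str]:
    """Return the column names implied by the IR_<type>_<rank>_<attr> pattern.

    Hypothetical helper for illustration; not part of the Scirpy API.
    """
    columns = ["has_ir"]
    for chain_type, idx, attr in product(CHAIN_TYPES, CHAIN_INDICES, ATTRIBUTES):
        columns.append(f"IR_{chain_type}_{idx}_{attr}")
    return columns


columns = ir_column_names()
print(len(columns))
print(columns[:3])
```

For example, the CDR3 amino acid sequence of the primary VDJ chain is found under IR_VDJ_1_cdr3, and the locus of the secondary VJ chain under IR_VJ_2_locus.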