Usage principles
Import scirpy as:
import scanpy as sc
import scirpy as ir
Workflow
Scirpy is an extension to Scanpy and adheres to its workflow principles:
We decided to handle a few minor points differently to Scanpy:
Plotting functions with inexpensive computations (e.g. scirpy.pl.clonal_expansion()) call the corresponding tool (scirpy.tl.clonal_expansion()) on-the-fly and don’t store the results in the AnnData object.
All plotting functions, by default, return an Axes object, or a list of such.
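The difference between a tool that stores its result and a plotting function that recomputes it on-the-fly can be sketched in plain Python. This is not Scirpy's implementation: the ToyData class and both function names are hypothetical stand-ins for an AnnData object, a tool, and a plotting function, and the plotting step is reduced to returning the values that would be drawn instead of a matplotlib Axes.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToyData:
    """Hypothetical stand-in for an AnnData object (not the real class)."""

    obs: dict = field(default_factory=dict)  # cell-level annotations


def tl_clone_sizes(data, inplace=True):
    """Toy 'tool': for every cell, count the cells sharing its clone id."""
    counts = Counter(data.obs["clone_id"])
    sizes = [counts[clone] for clone in data.obs["clone_id"]]
    if inplace:
        data.obs["clone_size"] = sizes  # store the result on the object
        return None
    return sizes


def pl_clone_sizes(data):
    """Toy 'plot': recompute the cheap result instead of reading a stored one."""
    sizes = tl_clone_sizes(data, inplace=False)  # nothing is written to data.obs
    # A real plotting function would draw here and return a matplotlib Axes;
    # this sketch returns (clone size, number of cells) pairs instead.
    return sorted(Counter(sizes).items())


data = ToyData(obs={"clone_id": ["a", "a", "b", "c", "c", "c"]})
bars = pl_clone_sizes(data)
print(bars)
print("clone_size" in data.obs)
```

After the plotting call the toy object is unchanged. Calling the tool directly with its default arguments is what adds the clone_size entry.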
Data structure
For instructions on how to load data into Scirpy, see Loading adaptive Immune Receptor (IR)-sequencing data with Scirpy.
Scirpy leverages the AnnData data structure
which combines a gene expression matrix (.X), gene-level annotations (.var) and
cell-level annotations (.obs) into a single object. AnnData forms the basis for the
Scanpy analysis workflow
for single-cell transcriptomics data.
Image by F. Alex Wolf.
Scirpy adds the following IR-related columns to AnnData.obs:
IR_VJ_1_<attr>/IR_VJ_2_<attr>: columns related to the primary and secondary VJ-chain of a receptor (TRA, TRG, IGK, or IGL)
IR_VDJ_1_<attr>/IR_VDJ_2_<attr>: columns related to the primary and secondary VDJ-chain of a receptor (TRB, TRD, or IGH)
has_ir: True for all cells with an adaptive immune receptor
extra_chains: Contains non-productive chains (if not filtered out) and extra chains that do not fit into the 2 VJ + 2 VDJ chain model, encoded as JSON. Scirpy does not use this information except for writing it back to AIRR format using scirpy.io.write_airr().
multi_chain: True for all cells with more than two productive VJ chains or more than two productive VDJ chains.
Where <attr> can be any field of the AIRR Rearrangement Schema.
For Scirpy the following fields are relevant:
locus: The IMGT locus name of the chain (TRA, IGH, etc.)
c_call, v_call, d_call, j_call: The gene symbols of the respective genes
junction_aa and junction: The amino acid and nucleotide sequences of the CDR3 regions
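The column naming scheme described above can be spelled out in a few lines of Python. The sketch below is an illustration only: the helper ir_obs_columns and the CHAIN_TYPE dictionary are assumptions made for this example and are not part of the Scirpy API. They restate the locus lists and field names given in the text.

```python
# Chain type for each IMGT locus, as listed in the column descriptions above.
CHAIN_TYPE = {
    "TRA": "VJ", "TRG": "VJ", "IGK": "VJ", "IGL": "VJ",
    "TRB": "VDJ", "TRD": "VDJ", "IGH": "VDJ",
}

# AIRR Rearrangement fields named as relevant in the text.
RELEVANT_ATTRS = (
    "locus", "c_call", "v_call", "d_call", "j_call", "junction_aa", "junction",
)


def ir_obs_columns(attrs=RELEVANT_ATTRS):
    """Hypothetical helper: list the IR_<type>_<n>_<attr> column names."""
    return [
        f"IR_{chain_type}_{n}_{attr}"
        for chain_type in ("VJ", "VDJ")  # the two chain types
        for n in (1, 2)                  # primary and secondary chain
        for attr in attrs
    ]


columns = ir_obs_columns()
print(columns[:3])        # first three column names for the primary VJ chain
print(len(columns))       # 2 chain types x 2 chains x 7 attributes
print(CHAIN_TYPE["IGH"])  # chain type that an IGH chain is stored under
```

Under this scheme, the CDR3 amino acid sequence of the primary VDJ chain of a cell would be found in the column IR_VDJ_1_junction_aa.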