Usage principles
Import scirpy as:
import scanpy as sc
import scirpy as ir
Workflow
Scirpy is an extension to Scanpy and adheres to its workflow principles.
We decided to handle a few minor points differently from Scanpy:
Plotting functions with inexpensive computations (e.g. scirpy.pl.clonal_expansion()) call the corresponding tool (scirpy.tl.clonal_expansion()) on-the-fly and don't store the results in the AnnData object.
All plotting functions, by default, return an Axes object, or a list of such.
Data structure
For instructions on how to load data into Scirpy, see Loading adaptive Immune Receptor (IR)-sequencing data with Scirpy.
Scirpy leverages the AnnData data structure, which combines a gene expression matrix (.X), gene-level annotations (.var) and cell-level annotations (.obs) into a single object. AnnData forms the basis for the Scanpy analysis workflow for single-cell transcriptomics data.
Image by F. Alex Wolf.
Scirpy adds the following IR-related columns to AnnData.obs:
IR_VJ_1_<attr> / IR_VJ_2_<attr>: columns related to the primary and secondary VJ-chain of a receptor (TRA, TRG, IGK, or IGL)
IR_VDJ_1_<attr> / IR_VDJ_2_<attr>: columns related to the primary and secondary VDJ-chain of a receptor (TRB, TRD, or IGH)
has_ir: True for all cells with an adaptive immune receptor
extra_chains: Contains non-productive chains (if not filtered out) and extra chains that do not fit into the 2 VJ + 2 VDJ chain model, encoded as JSON. Scirpy does not use this information except for writing it back to AIRR format using scirpy.io.write_airr().
multi_chain: True for all cells with more than two productive VJ chains or more than two productive VDJ chains.
Where <attr> can be any field of the AIRR Rearrangement Schema.
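The column layout above can be illustrated with a small, library-free sketch. This is not Scirpy's implementation: the function `summarize_cell`, the input dictionaries and the example sequences are hypothetical, chains are assigned to slots in the order given, and a cell is assumed to be flagged as multi_chain when either chain type has more productive chains than the two slots the model provides.

```python
import json

# Loci per chain type, as listed in the column descriptions above.
VJ_LOCI = {"TRA", "TRG", "IGK", "IGL"}
VDJ_LOCI = {"TRB", "TRD", "IGH"}


def summarize_cell(chains):
    """Sort one cell's chains into a 2 VJ + 2 VDJ layout (illustrative only)."""
    slots = {"VJ": [], "VDJ": []}
    n_productive = {"VJ": 0, "VDJ": 0}
    extra = []
    for chain in chains:
        if chain["locus"] in VJ_LOCI:
            kind = "VJ"
        elif chain["locus"] in VDJ_LOCI:
            kind = "VDJ"
        else:
            raise ValueError(f"unknown locus: {chain['locus']}")
        if not chain["productive"]:
            extra.append(chain)  # non-productive chains are kept aside
            continue
        n_productive[kind] += 1
        if len(slots[kind]) < 2:
            slots[kind].append(chain)
        else:
            extra.append(chain)  # no free slot left in the model
    row = {
        "has_ir": bool(chains),
        # Assumption: flagged when a chain type exceeds its two slots.
        "multi_chain": any(n > 2 for n in n_productive.values()),
        "extra_chains": json.dumps(extra),
    }
    for kind, kept in slots.items():
        for rank, chain in enumerate(kept, start=1):
            row[f"IR_{kind}_{rank}_locus"] = chain["locus"]
            row[f"IR_{kind}_{rank}_junction_aa"] = chain["junction_aa"]
    return row


# Made-up example cell: one TRA chain, one productive and one
# non-productive TRB chain. The sequences are invented for illustration.
cell = [
    {"locus": "TRA", "junction_aa": "CAVSDYKLSF", "productive": True},
    {"locus": "TRB", "junction_aa": "CASSLGQAYEQYF", "productive": True},
    {"locus": "TRB", "junction_aa": "CASRPGTGELFF", "productive": False},
]
row = summarize_cell(cell)
```

Here `row` holds the primary VJ and VDJ chains in their `IR_<type>_1_<attr>` columns, while the non-productive TRB chain ends up only in the JSON string under `extra_chains`.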
For Scirpy the following fields are relevant:
locus: The IMGT locus name of the chain (TRA, IGH, etc.)
c_call, v_call, d_call, j_call: The gene symbols of the respective genes
junction_aa and junction: The amino acid and nucleotide sequences of the CDR3 regions
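To make the naming scheme concrete, the following sketch enumerates the column names that result from combining the two chain types, the primary and secondary slots, and the fields listed above. The helpers `ir_columns` and `parse_ir_column` are hypothetical and not part of Scirpy. They only rely on the `IR_<chain type>_<rank>_<attr>` pattern described in this section.

```python
from itertools import product

CHAIN_TYPES = ("VJ", "VDJ")
CHAIN_RANKS = (1, 2)  # primary and secondary chain
RELEVANT_FIELDS = (
    "locus", "c_call", "v_call", "d_call", "j_call", "junction_aa", "junction",
)


def ir_columns(fields=RELEVANT_FIELDS):
    """List all IR_<type>_<rank>_<attr> column names for the given fields."""
    return [
        f"IR_{chain_type}_{rank}_{field}"
        for chain_type, rank, field in product(CHAIN_TYPES, CHAIN_RANKS, fields)
    ]


def parse_ir_column(name):
    """Split a column name back into (chain type, rank, attribute)."""
    prefix, chain_type, rank, field = name.split("_", 3)
    if prefix != "IR" or chain_type not in CHAIN_TYPES:
        raise ValueError(f"not an IR column: {name}")
    return chain_type, int(rank), field


columns = ir_columns()
parsed = parse_ir_column("IR_VDJ_2_junction_aa")
```

Limiting the split to three underscores keeps attribute names that contain underscores themselves, such as `junction_aa` or `v_call`, intact.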