scirpy.tl.chain_qc

scirpy.tl.chain_qc(adata, *, airr_mod='airr', airr_key='airr', chain_idx_key='chain_indices', inplace=True, key_added=('receptor_type', 'receptor_subtype', 'chain_pairing'))

Perform quality control based on the receptor-chain pairing configuration.

Categorizes cells by receptor type and by chain pairing status. The function adds three columns to adata.obs: two contain a coarse and a fine annotation of receptor types, and a third classifies cells by their chain pairing status.

receptor_type can be one of the following:
  • TCR (all cells that contain any combination of TRA/TRB/TRG/TRD chains, but no IGH/IGK/IGL chains)

  • BCR (all cells that contain any combination of IGH/IGK/IGL chains, but no TCR chains)

  • ambiguous (all cells that contain both BCR and TCR chains)

  • multichain (all cells with more than two VJ or more than two VDJ chains)

  • no IR (all cells without any detected immune receptor)
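
The receptor_type rules above can be sketched in plain Python. This is an illustration only, not scirpy's implementation: the function name and the list-of-loci input are hypothetical, the split of loci into VJ (TRA, TRG, IGK, IGL) and VDJ (TRB, TRD, IGH) arms is stated here as background, and checking multichain before ambiguous is an assumption.

```python
# Illustrative sketch only; not scirpy's implementation.
TCR_LOCI = {"TRA", "TRB", "TRG", "TRD"}
BCR_LOCI = {"IGH", "IGK", "IGL"}
VJ_LOCI = {"TRA", "TRG", "IGK", "IGL"}
VDJ_LOCI = {"TRB", "TRD", "IGH"}


def receptor_type(loci):
    """Classify one cell from the loci of its detected chains (hypothetical helper)."""
    unknown = set(loci) - TCR_LOCI - BCR_LOCI
    if unknown:
        raise ValueError(f"unknown loci: {sorted(unknown)}")
    if not loci:
        return "no IR"
    n_vj = sum(locus in VJ_LOCI for locus in loci)
    n_vdj = sum(locus in VDJ_LOCI for locus in loci)
    # Assumption: multichain takes precedence over every other category.
    if n_vj > 2 or n_vdj > 2:
        return "multichain"
    has_tcr = any(locus in TCR_LOCI for locus in loci)
    has_bcr = any(locus in BCR_LOCI for locus in loci)
    if has_tcr and has_bcr:
        return "ambiguous"
    return "TCR" if has_tcr else "BCR"


for cell in (["TRA", "TRB"], ["IGH", "IGK"], ["TRA", "IGH"], []):
    print(cell, "->", receptor_type(cell))
```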

receptor_subtype can be one of the following:
  • TRA+TRB (all cells that have only TRA and/or TRB chains)

  • TRG+TRD (all cells that have only TRG and/or TRD chains)

  • IGH (all cells that have only IGH chains, but no IGL or IGK)

  • IGH+IGL (all cells that have only IGH and IGL chains)

  • IGH+IGK (all cells that have only IGH and IGK chains)

  • multichain (all cells with more than two VJ or more than two VDJ chains)

  • ambiguous (all cells that are none of the above, e.g. TRA+TRD, TRA+IGH, or IGH+IGK as the primary and IGH+IGL as the secondary receptor)

  • no IR (all cells without any detected immune receptor)
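
As a rough illustration of the receptor_subtype categories above, the following sketch maps the set of loci found in a cell to a subtype. It is not scirpy's code: the function name and the multichain flag argument are hypothetical, and assigning a cell with only a light chain (for example IGK alone) to the matching IGH+IGK or IGH+IGL subtype is an assumption.

```python
# Illustrative sketch only; not scirpy's implementation.
def receptor_subtype(loci, multichain=False):
    """Map the loci detected in one cell to a subtype label (hypothetical helper)."""
    present = set(loci)
    if not present:
        return "no IR"
    if multichain:
        return "multichain"
    if present <= {"TRA", "TRB"}:
        return "TRA+TRB"
    if present <= {"TRG", "TRD"}:
        return "TRG+TRD"
    if present == {"IGH"}:
        return "IGH"
    # Assumption: a lone light chain is assigned to its IGH+light-chain subtype.
    if present <= {"IGH", "IGL"}:
        return "IGH+IGL"
    if present <= {"IGH", "IGK"}:
        return "IGH+IGK"
    # Everything else mixes receptor families or light-chain types.
    return "ambiguous"


for cell in (["TRA", "TRB"], ["IGH"], ["TRA", "TRD"], ["IGH", "IGK", "IGL"]):
    print(cell, "->", receptor_subtype(cell))
```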

chain_pairing can be one of the following:
  • single pair (all cells that have exactly one matched VJ and VDJ chain)

  • orphan VJ (all cells that have only one VJ chain)

  • orphan VDJ (all cells that have only one VDJ chain)

  • extra VJ (all cells that have a matched pair of VJ and VDJ chains plus an additional VJ-chain)

  • extra VDJ (all cells that have a matched pair of VJ and VDJ chains plus an additional VDJ-chain)

  • two full chains (all cells that have two matched pairs of VJ and VDJ chains)

  • ambiguous (all cells that have unmatched chains, i.e. that have been classified as an ambiguous receptor_subtype)

  • multichain (all cells with more than two VJ or more than two VDJ chains)

  • no IR (all cells without any detected immune receptor)
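
The chain_pairing categories can be read as a lookup on the number of VJ and VDJ chains per cell. The sketch below is a simplified illustration under stated assumptions, not scirpy's implementation: the function name and arguments are hypothetical, the order of the multichain and ambiguous checks is assumed, and count combinations the list above does not cover (such as two VJ chains without a VDJ chain) fall back to "ambiguous" here purely as a placeholder.

```python
# Illustrative sketch only; not scirpy's implementation.
PAIRING_BY_COUNTS = {
    (1, 1): "single pair",
    (1, 0): "orphan VJ",
    (0, 1): "orphan VDJ",
    (2, 1): "extra VJ",
    (1, 2): "extra VDJ",
    (2, 2): "two full chains",
}


def chain_pairing(n_vj, n_vdj, ambiguous_subtype=False):
    """Classify a cell from its VJ/VDJ chain counts (hypothetical helper)."""
    if n_vj == 0 and n_vdj == 0:
        return "no IR"
    if n_vj > 2 or n_vdj > 2:
        return "multichain"
    if ambiguous_subtype:
        return "ambiguous"
    # Assumption: combinations not listed in the documentation are "ambiguous".
    return PAIRING_BY_COUNTS.get((n_vj, n_vdj), "ambiguous")


for counts in ((1, 1), (2, 1), (0, 1), (3, 1)):
    print(counts, "->", chain_pairing(*counts))
```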

Parameters
adata : AnnData | MuData | DataHandler

AnnData or MuData object that contains AIRR information.

inplace : bool (default: True)

If True, the three result columns will be stored in obs. Otherwise, the result will be returned.

key_added : Sequence[str] (default: ('receptor_type', 'receptor_subtype', 'chain_pairing'))

Keys under which the results will be stored in obs, if inplace is True. When the function runs on MuData, each result is written to both mdata.obs["{airr_mod}:{key_added}"] and mdata.mod[airr_mod].obs[key_added].
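
To make the MuData naming pattern concrete, this small sketch expands the "{airr_mod}:{key_added}" template for each entry of key_added. The helper name is hypothetical and only formats strings; it does not touch any MuData object.

```python
def mudata_obs_keys(
    airr_mod="airr",
    key_added=("receptor_type", "receptor_subtype", "chain_pairing"),
):
    """Return the top-level obs column names implied by the naming pattern."""
    return [f"{airr_mod}:{key}" for key in key_added]


print(mudata_obs_keys())
# ['airr:receptor_type', 'airr:receptor_subtype', 'airr:chain_pairing']
```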

airr_mod : str (default: 'airr')

Name of the modality in the MuData object in which the AIRR information is stored. If an AnnData object is passed to the function, this parameter is ignored.

airr_key : str (default: 'airr')

Key under which the AIRR information is stored in adata.obsm as an awkward array.

chain_idx_key : str (default: 'chain_indices')

Key under which the chain indices are stored in adata.obsm. If chain indices are not present, index_chains() is run with default parameters.

Return type

None | Tuple[ndarray, ndarray, ndarray]

Returns

Depending on the value of inplace, either adds three columns to adata.obs or returns a tuple with three numpy arrays containing the annotations.
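
A minimal usage sketch for the inplace=False case follows. It assumes scirpy is installed and that adata already holds AIRR data; the wrapper name summarize_chain_qc and its optional chain_qc argument are hypothetical and exist only so the example can be tried with a stand-in function when scirpy is not available.

```python
from collections import Counter


def summarize_chain_qc(adata, chain_qc=None):
    """Tally the three annotations returned with inplace=False (hypothetical helper)."""
    if chain_qc is None:
        # Imported lazily so the sketch can be read and tested without scirpy.
        import scirpy as ir

        chain_qc = ir.tl.chain_qc
    receptor_type, receptor_subtype, chain_pairing = chain_qc(adata, inplace=False)
    return {
        "receptor_type": Counter(map(str, receptor_type)),
        "receptor_subtype": Counter(map(str, receptor_subtype)),
        "chain_pairing": Counter(map(str, chain_pairing)),
    }
```

With the default inplace=True, the same information is instead available as columns of adata.obs under the names given in key_added.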