scirpy.io.read_airr
- scirpy.io.read_airr(path, use_umi_count_col='auto', infer_locus=True, cell_attributes=('is_cell', 'high_confidence'), include_fields=None, **kwargs)
Read data from AIRR rearrangement format.
Even though data without these fields can be imported, the following columns are required by scirpy for a meaningful analysis:
- cell_id
- productive
- locus, containing a valid IMGT locus name
- at least one of consensus_count, duplicate_count, or umi_count
- at least one of junction_aa or junction
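As a rough illustration of the column requirements above, the following sketch checks the header of an AIRR rearrangement TSV before import. It uses only the Python standard library. The helper name `missing_airr_columns` and the constants are hypothetical and not part of scirpy. The sketch only tests whether columns are present; it does not verify that `locus` holds a valid IMGT locus name.

```python
import csv
import io

# Column requirements as listed above; names of these constants are hypothetical.
REQUIRED = ("cell_id", "productive", "locus")
ONE_OF = (
    ("consensus_count", "duplicate_count", "umi_count"),
    ("junction_aa", "junction"),
)


def missing_airr_columns(handle):
    """Return a list of problems found in the header row of an AIRR TSV."""
    reader = csv.reader(handle, delimiter="\t")
    header = set(next(reader, []))  # an empty file yields an empty header
    problems = [f"missing column: {col}" for col in REQUIRED if col not in header]
    for group in ONE_OF:
        if not header.intersection(group):
            problems.append("need at least one of: " + ", ".join(group))
    return problems


# Example: an in-memory file that has no count column at all.
example = io.StringIO("cell_id\tproductive\tlocus\tjunction_aa\nc1\tT\tTRA\tCAVF\n")
print(missing_airr_columns(example))
```

An empty result list means the header satisfies the listed requirements; the data can still be imported when it does not, as stated above.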
Note
Since scirpy v0.13, there are no restrictions on the AIRR data that can be stored in the scirpy data structure, except that each receptor chain needs to be associated with a cell.
The scirpy immune receptor (IR) model is now applied in a later step using the index_chains() function. For more information, see Storing AIRR rearrangement data in AnnData.
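The two-step workflow described in the note might look like the sketch below. The wrapper name `read_and_index` is hypothetical. It assumes that scirpy v0.13 or later is installed and that `index_chains()` is called as `scirpy.pp.index_chains` with its default arguments.

```python
def read_and_index(path):
    """Read AIRR rearrangement data, then apply the scirpy receptor model."""
    # Imported lazily so that defining this helper does not require scirpy.
    import scirpy as ir

    # Step 1: import all chains without applying the receptor model.
    adata = ir.io.read_airr(path)
    # Step 2: apply the receptor model in a separate step (modifies adata in place).
    ir.pp.index_chains(adata)
    return adata
```

Here `path` can be anything accepted by the `path` parameter, for example a single tsv file or a list of files.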
- Parameters
- path : str | Sequence[str] | Path | Sequence[Path] | DataFrame | Sequence[DataFrame]
  Path to the AIRR rearrangement tsv file. If different chains are split up into multiple files, these can be specified as a list, e.g. ["path/to/tcr_alpha.tsv", "path/to/tcr_beta.tsv"]. Alternatively, this can be a pandas data frame.
- use_umi_count_col : bool | {'auto'} (default: 'auto')
  Whether to add UMI counts from the non-standard (but common) umi_count column. When this column is used, the UMI counts are moved over to the standard duplicate_count column. Default: use umi_count if there is no duplicate_count column present.
- infer_locus : bool (default: True)
  Try to infer the locus column from gene names, in case it is not specified.
- cell_attributes : Collection[str] (default: ('is_cell', 'high_confidence'))
  Fields in the rearrangement schema that are specific for a cell rather than a chain. The values must be identical over all records belonging to a cell. This defaults to ("is_cell", "high_confidence").
- include_fields : Any | None (default: None)
  Deprecated. Does not have any effect as of v0.13.
- **kwargs
  Passed to from_airr_cells().
- Return type
- Returns
AnnData object with AIRR data in obsm["airr"] for each cell. For more details, see Storing AIRR rearrangement data in AnnData.
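The documented behaviour of use_umi_count_col can be sketched in plain Python as follows. This is a simplified illustration and not scirpy's actual implementation. The function name `resolve_umi_column` is hypothetical, and the sketch works on a single record (a dict) rather than on the columns of a whole file.

```python
def resolve_umi_column(record, use_umi_count_col="auto"):
    """Return a copy of `record` with umi_count moved to duplicate_count if requested."""
    if use_umi_count_col == "auto":
        # Documented default: use umi_count only when duplicate_count is absent.
        use_umi = "umi_count" in record and "duplicate_count" not in record
    else:
        # Explicit True/False; nothing to move if umi_count is missing.
        use_umi = bool(use_umi_count_col) and "umi_count" in record
    out = dict(record)  # leave the caller's record untouched
    if use_umi:
        # Assumption: "moved over" means the value replaces duplicate_count.
        out["duplicate_count"] = out.pop("umi_count")
    return out


print(resolve_umi_column({"cell_id": "c1", "umi_count": 5}))
```

With 'auto', a record that already has duplicate_count is returned unchanged; passing True forces the move even when duplicate_count is present.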