scirpy.tl.ir_query_annotate_df
- scirpy.tl.ir_query_annotate_df(adata, reference, *, sequence='aa', metric='identity', include_ref_cols=None, include_query_cols=(), query_key=None, suffix='', airr_mod='airr', airr_mod_ref='airr')
Returns the inner join of adata.obs with matching entries from reference.obs based on the result of ir_query().
Warning
This is an experimental function that may change in the future.
The function first creates a two-column dataframe mapping cell indices of adata to cell indices of reference. It then performs an inner join with reference.obs, and finally performs another join with query.obs (the obs table of the query dataset, adata).
This function requires that scirpy.tl.ir_query() has been executed on adata with the same reference and the same parameters for sequence and metric.
This function returns all matching entries in the reference database, which can be none for some cells, but many for others. If you want to add a single column to adata.obs for plotting, please refer to scirpy.tl.ir_query_annotate().
- Parameters
- adata : AnnData | MuData | DataHandler
Query dataset.
- reference : AnnData | MuData | DataHandler
Reference dataset.
- sequence : {'aa', 'nt'} (default: 'aa')
The sequence parameter used when running scirpy.pp.ir_dist().
- metric : {'alignment', 'identity', 'levenshtein', 'hamming'} | DistanceCalculator (default: 'identity')
The metric parameter used when running scirpy.pp.ir_dist().
- include_ref_cols : Sequence[str] | None (default: None)
Subset the reference database to these columns. Default: include all.
- include_query_cols : Sequence[str] (default: ())
Subset adata.obs to these columns. Default: include all.
- query_key : str | None (default: None)
Use the distance matrix stored under this key in adata.uns. If set to None, the key is automatically inferred based on reference, sequence, and metric. Additional arguments are passed to the last join.
- suffix : str (default: '')
Suffix appended to columns from reference.obs in case their names conflict with those in adata.obs.
- airr_mod : str (default: 'airr')
Name of the modality in which AIRR information is stored in the MuData object. If an AnnData object is passed to the function, this parameter is ignored.
- airr_mod_ref : str (default: 'airr')
Like airr_mod, but for reference.
- Return type
DataFrame
- Returns
DataFrame with matching entries from reference.obs.
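The two-step join described above can be illustrated with plain pandas. This is a minimal sketch under stated assumptions, not scirpy's actual implementation: the tables `query_obs` and `reference_obs`, the `mapping` dataframe, and the column names `query_idx` and `reference_idx` are all hypothetical stand-ins for adata.obs, reference.obs, and the cell-index mapping derived from the result of ir_query().

```python
import pandas as pd

# Hypothetical stand-ins for adata.obs and reference.obs, indexed by cell ID.
query_obs = pd.DataFrame(
    {"sample": ["s1", "s1", "s2"]},
    index=["q1", "q2", "q3"],
)
reference_obs = pd.DataFrame(
    {"antigen": ["CMV", "EBV", "Flu"], "sample": ["db", "db", "db"]},
    index=["r1", "r2", "r3"],
)

# Hypothetical two-column mapping of query cells to matching reference cells.
# q1 matches two reference cells, q2 matches one, q3 matches none.
mapping = pd.DataFrame(
    {"query_idx": ["q1", "q1", "q2"], "reference_idx": ["r1", "r2", "r3"]}
)

suffix = "_ref"  # plays the role of the `suffix` parameter (assumption)

# Step 1: inner join of the mapping with the reference annotations.
with_ref = mapping.join(reference_obs, on="reference_idx", how="inner")

# Step 2: join with the query annotations. Reference columns whose names
# clash with query columns (here: "sample") receive the suffix.
result = with_ref.join(
    query_obs, on="query_idx", how="inner", lsuffix=suffix
).set_index("query_idx")

print(result)
```

In this sketch, the query cell `q1` appears twice in the result (one row per matching reference entry) and `q3` does not appear at all, which mirrors the "none for some cells, but many for others" behavior described above.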