snapatac2.pp.scrublet
- snapatac2.pp.scrublet(adata, features='selected', n_comps=15, sim_doublet_ratio=2.0, expected_doublet_rate=0.1, n_neighbors=None, use_approx_neighbors=True, random_state=0, inplace=True, n_jobs=8, verbose=True)
Compute the probability of each cell being a doublet using the Scrublet algorithm.
- Parameters:
adata (AnnData | list[AnnData]) – The (annotated) data matrix of shape n_obs x n_vars. Rows correspond to cells and columns to regions. adata can also be a list of AnnData objects. In this case, the function will be applied to each AnnData object in parallel.
features (UnionType[str, ndarray, None]) – Boolean index mask, where True means that the feature is kept, and False means the feature is removed.
n_comps (int) – Number of PCs.
sim_doublet_ratio (float) – Number of doublets to simulate relative to the number of observed cells.
expected_doublet_rate (float) – Expected doublet rate.
n_neighbors (Optional[int]) – Number of neighbors used to construct the KNN graph of observed cells and simulated doublets. If None, this is set to round(0.5 * sqrt(n_cells)).
use_approx_neighbors – Whether to use approximate search.
random_state (int) – Random state.
inplace (bool) – Whether to update the AnnData object in place.
n_jobs (int) – Number of jobs to run in parallel.
verbose (bool) – Whether to print progress messages.
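The two size-dependent parameters above can be illustrated with a small standalone sketch. It does not call snapatac2. The helper names are hypothetical, `n_cells` is assumed to mean the number of observed cells, and truncating the simulated-doublet count to an integer is an assumption rather than documented behavior.

```python
import math


def default_n_neighbors(n_cells: int) -> int:
    """Documented default when n_neighbors is None: round(0.5 * sqrt(n_cells))."""
    return round(0.5 * math.sqrt(n_cells))


def n_simulated_doublets(n_cells: int, sim_doublet_ratio: float = 2.0) -> int:
    """Assumed reading of sim_doublet_ratio: ratio times observed cells, truncated."""
    return int(sim_doublet_ratio * n_cells)


# Print both derived quantities for a few dataset sizes.
for n_cells in (400, 2_500, 10_000):
    print(n_cells, default_n_neighbors(n_cells), n_simulated_doublets(n_cells))
```

Under these assumptions, a dataset of 10,000 cells with the default settings would use 50 neighbors and 20,000 simulated doublets.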
- Returns:
- If inplace = True, it updates adata with the following fields:
  adata.obs["doublet_probability"]: probability of being a doublet
  adata.obs["doublet_score"]: doublet score
- if
- Return type:
tuple[np.ndarray, np.ndarray] | None
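As a rough illustration of how the per-cell doublet probabilities might be used downstream, the sketch below flags cells whose probability exceeds a cutoff. The helper `flag_doublets`, the 0.5 threshold, and the example values are all assumptions made for illustration. They are not part of this function's API or output.

```python
def flag_doublets(probabilities, threshold=0.5):
    """Return a list of booleans: True where the probability exceeds the threshold."""
    return [p > threshold for p in probabilities]


# Made-up values standing in for adata.obs["doublet_probability"].
example_probabilities = [0.02, 0.91, 0.40, 0.65]

flags = flag_doublets(example_probabilities)
# Indices of cells that would be kept (not flagged as doublets).
kept = [i for i, is_doublet in enumerate(flags) if not is_doublet]

print(flags)
print(kept)
```

A suitable threshold depends on the dataset and the expected doublet rate, so the value used here should be treated as a placeholder.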