Python arguments are equivalent to long-option arguments (--arg), unless otherwise specified. Flags are True/False arguments in Python. The manual for any gget tool can be called from the command-line using the -h --help flag.

gget opentargets 🎯

Fetch associated diseases or drugs from OpenTargets using Ensembl IDs.
Return format: JSON/CSV (command-line) or data frame (Python).

This module was written by Sam Wagenaar.

⚠️ Change as of gget v0.30.8 — resource="expression" output is different. OpenTargets retired the old target.expressions field, so gget opentargets -r expression now returns OpenTargets' baselineExpression data instead. The output columns changed: results are now per-biosample (tissue and/or cell type) summary statistics (median, min, q1, q3, max, unit) with tissueBiosample.* / celltypeBiosample.* identifiers and datasourceId / datatypeId — replacing the old per-tissue tissue.* / rna.* (z-score) columns, which no longer exist upstream. A gene can have thousands of biosamples; use --filters (e.g. datasourceId, datatypeId) or --limit to narrow the result. See the baseline expression example below.

Positional argument
ens_id
Ensembl gene ID, e.g. ENSG00000169194.

Optional arguments
-r --resource
Defines the type of information to return in the output. Default: 'diseases'.
Possible resources are:

| Resource | Return Value | Valid Filters | Sources |
|---|---|---|---|
| diseases | Associated diseases, phenotypes & traits (EFO-mapped) with the overall association score (0–1) | None | Various |
| drugs | Associated drugs | drug.drugType, drug.maximumClinicalStage | ChEMBL |
| tractability | Tractability data | None | Open Targets |
| pharmacogenetics | Pharmacogenetic responses | datasourceId, pgxCategory, evidenceLevel | PharmGKB |
| expression | Baseline expression per biosample (tissue/cell type) with summary statistics | tissueBiosample.biosampleId, datasourceId, datatypeId | |
| depmap | DepMap gene→disease-effect data | tissueId, diseaseFromSource | DepMap Portal |
| interactions | Protein⇄protein interactions | sourceDatabase, targetB.id, targetB.approvedSymbol | |

-l --limit
Limit the number of results, e.g. 10. Default: No limit.
Note: Not compatible with the tractability and depmap resources.

-o --out
Path to the JSON file the results will be saved in, e.g. path/to/directory/results.json. Default: Standard out.
Python: save=True will save the output in the current working directory.

--filters
Filter results by exact equality using returned OpenTargets column names. Pass multiple filters by repeating the flag, e.g. '--filter disease.id=EFO_0000274 --filter drug.id=CHEMBL1743081'. Nested fields use dot notation, matching the column names returned by the API.
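
As a rough illustration of what dot notation and exact-equality filtering mean here, the sketch below flattens nested records into dot-separated column names and keeps only the records that satisfy every filter (AND logic). `flatten` and `matches_all` are hypothetical helpers written for this example and are not part of gget; the nested records are stand-ins shaped like the interaction rows shown later in this manual.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one dict with dot-separated keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def matches_all(flat_record, filters):
    """AND logic: every filter column must exist and equal the value exactly."""
    return all(
        column in flat_record and str(flat_record[column]) == str(value)
        for column, value in filters.items()
    )


# Stand-in nested records (assumption: shaped like the interactions output).
records = [
    {"sourceDatabase": "string",
     "targetB": {"id": "ENSG00000131724", "approvedSymbol": "IL13RA1"}},
    {"sourceDatabase": "string",
     "targetB": {"id": "ENSG00000123496", "approvedSymbol": "IL13RA2"}},
]
filters = {"sourceDatabase": "string", "targetB.approvedSymbol": "IL13RA1"}

flat_records = [flatten(r) for r in records]
kept = [r for r in flat_records if matches_all(r, filters)]
print(kept)
```

Values are compared as strings because command-line filter values always arrive as text.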

Flags
-csv --csv
Command-line only. Returns the output in CSV format, instead of JSON format. Python: Use json=True to return output in JSON format.

-q --quiet
Command-line only. Prevents progress information from being displayed.
Python: Use verbose=False to prevent progress information from being displayed.

-or --or
Command-line only. Filters are combined with OR logic. Default: AND logic.

wrap_text
Python only. wrap_text=True displays data frame with wrapped text for easy reading (default: False).

Examples

Get associated diseases for a specific gene:

gget opentargets ENSG00000169194 -r diseases -l 1
# Python
import gget
gget.opentargets('ENSG00000169194', resource='diseases', limit=1)

→ Returns the top disease associated with the gene ENSG00000169194.

| score | disease.id | disease.name | disease.description |
|---|---|---|---|
| 0.7279798021712002 | MONDO_0004980 | atopic eczema | A common chronic pruritic inflammatory skin disease with a strong ... |

Understanding the score column and disease IDs (issue #168)

  • score is OpenTargets' overall target–disease association score — a single value between 0 and 1 that aggregates the evidence across all data types and data sources. It is not a per-data-source score. The OpenTargets website shows, in addition, a per-data-type breakdown (genetic associations, somatic mutations, known drugs, animal models, etc.) in its "Associations" view; gget opentargets currently returns only the aggregated score, so a single gget row corresponds to one whole row of that web table.
  • disease.id is an EFO ontology ID. OpenTargets maps every associated trait to EFO, which imports terms from several ontologies. As a result the returned IDs are not exclusively diseases: they can be MONDO disease terms (MONDO_*), Human Phenotype Ontology phenotypes (HP_*), Orphanet rare diseases (Orphanet_*), or EFO measurements/traits (EFO_*, e.g. biomarker or blood-measurement traits). gget opentargets returns the associations exactly as OpenTargets reports them, without dropping non-disease terms. To keep only MONDO disease terms, filter the returned data frame, e.g. df[df["disease.id"].str.startswith("MONDO")].
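
To see which ontologies a result set draws on before filtering, the IDs can be grouped by prefix. The following is a minimal sketch in plain Python so that it runs without gget or pandas. `ontology_of` is a hypothetical helper, and the `HP_` and `Orphanet_` entries are placeholders rather than real term IDs; the other IDs are taken from examples in this manual.

```python
from collections import Counter

# Prefixes named in the text above; anything else is reported as "other".
KNOWN_PREFIXES = ("MONDO", "HP", "Orphanet", "EFO")


def ontology_of(term_id):
    """Return the ontology prefix of an ID such as 'MONDO_0004980'."""
    prefix = term_id.split("_", 1)[0]
    return prefix if prefix in KNOWN_PREFIXES else "other"


# In practice this list would come from df["disease.id"].tolist().
ids = [
    "MONDO_0004980",     # atopic eczema (from the example output)
    "MONDO_0004979",     # asthma
    "EFO_0000274",
    "HP_XXXXXXX",        # placeholder, not a real term
    "Orphanet_XXXX",     # placeholder, not a real term
]

counts = Counter(ontology_of(term_id) for term_id in ids)
mondo_only = [term_id for term_id in ids if ontology_of(term_id) == "MONDO"]

print(dict(counts))
print(mondo_only)
```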



Get associated drugs for a specific gene:

gget opentargets ENSG00000169194 -r drugs -l 2
# Python
import gget
gget.opentargets('ENSG00000169194', resource='drugs', limit=2)

→ Returns the top 2 drugs associated with the gene ENSG00000169194.

| drug.id | drug.name | drug.drugType | drug.mechanismsOfAction.rows | drug.description | drug.synonyms | drug.tradeNames | drug.maximumClinicalStage | drug.indications.rows |
|---|---|---|---|---|---|---|---|---|
| CHEMBL1743035 | LEBRIKIZUMAB | Antibody | Interleukin-13 inhibitor | Antibody drug with a maximum clinical stage of Approval ... | ['Lebrikizumab', 'MILR-1444A', ...] | Ebglyss | APPROVAL | [{'id': 'MONDO_0004980', 'name': 'atopic eczema'}, {'id': 'MONDO_0004979', 'name': 'asthma'}, ...] |
| CHEMBL1742985 | ANRUKINZUMAB | Antibody | Interleukin-13 inhibitor | Antibody drug with a maximum clinical stage of Phase 2 ... | ['Anrukinzumab', 'IMA-638'] | [] | PHASE2 | [{'id': 'MONDO_0005101', 'name': 'ulcerative colitis'}, ...] |

Note: associated diseases/indications are nested under drug.indications.rows (each a {'id', 'name'} dict); the drug.id values are ChEMBL identifiers.
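
Because the indications are nested, a common follow-up step is to expand them into one row per drug and indication. The sketch below assumes the result has been converted to a list of row dicts (for a pandas data frame, `df.to_dict("records")` produces this shape). `explode_indications` is a hypothetical helper, and the rows are abbreviated from the example output above, with the elided entries left out.

```python
# Rows abbreviated from the example output above (elided entries omitted).
rows = [
    {
        "drug.id": "CHEMBL1743035",
        "drug.name": "LEBRIKIZUMAB",
        "drug.indications.rows": [
            {"id": "MONDO_0004980", "name": "atopic eczema"},
            {"id": "MONDO_0004979", "name": "asthma"},
        ],
    },
    {
        "drug.id": "CHEMBL1742985",
        "drug.name": "ANRUKINZUMAB",
        "drug.indications.rows": [
            {"id": "MONDO_0005101", "name": "ulcerative colitis"},
        ],
    },
]


def explode_indications(rows):
    """Yield one (drug name, indication id, indication name) tuple per indication."""
    for row in rows:
        # Treat a missing or empty indications field as "no indications".
        for indication in row.get("drug.indications.rows") or []:
            yield row["drug.name"], indication["id"], indication["name"]


pairs = list(explode_indications(rows))
for drug_name, disease_id, disease_name in pairs:
    print(f"{drug_name}\t{disease_id}\t{disease_name}")
```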



Get tractability data for a specific gene:

gget opentargets ENSG00000169194 -r tractability
# Python
import gget
gget.opentargets('ENSG00000169194', resource='tractability')

→ Returns tractability data for the gene ENSG00000169194.

| modality | label | value |
|---|---|---|
| SM | Approved Drug | False |
| SM | High-Quality Pocket | True |
| AB | Approved Drug | True |
| AB | Advanced Clinical | False |

Note: modality is SM (small molecule), AB (antibody), PR (PROTAC), OC (other clinical), etc.; value is a boolean indicating whether the target meets that tractability bucket.
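
One way to read this long-format output is to collect, per modality, the buckets whose value is True. The following is a minimal sketch using the four rows shown above as plain dicts; the `MODALITY_NAMES` mapping simply restates the abbreviations from the note and is an assumption of this example, not a gget object.

```python
from collections import defaultdict

# Abbreviations as described in the note above.
MODALITY_NAMES = {
    "SM": "small molecule",
    "AB": "antibody",
    "PR": "PROTAC",
    "OC": "other clinical",
}

# The four rows from the example output above.
rows = [
    {"modality": "SM", "label": "Approved Drug", "value": False},
    {"modality": "SM", "label": "High-Quality Pocket", "value": True},
    {"modality": "AB", "label": "Approved Drug", "value": True},
    {"modality": "AB", "label": "Advanced Clinical", "value": False},
]

met_buckets = defaultdict(list)
for row in rows:
    if row["value"]:
        # Fall back to the raw code for modalities not listed in the mapping.
        name = MODALITY_NAMES.get(row["modality"], row["modality"])
        met_buckets[name].append(row["label"])

met_buckets = dict(met_buckets)
print(met_buckets)
```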



Get pharmacogenetic responses for a specific gene:

gget opentargets ENSG00000169194 -r pharmacogenetics -l 1
# Python
import gget
gget.opentargets('ENSG00000169194', resource='pharmacogenetics', limit=1)

→ Returns pharmacogenetic responses for the gene ENSG00000169194.

| variantId | genotypeId | genotype | drugs | phenotypeText | genotypeAnnotationText | pgxCategory | isDirectTarget | evidenceLevel | datasourceId | literature | variantFunctionalConsequence.id | variantFunctionalConsequence.label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5_132657117_C_T | 5_132657117_C_C,T | CT | {'id': 'CHEMBL535', 'name': 'SUNITINIB'} | decreased severity of drug-induced toxicity | Patients with renal cell carcinoma and ... | toxicity | False | 3 | clinpgx | 26387812 | SO:0001631 | upstream_gene_variant |

Note: drugs is a {'id', 'name'} dict (or list of them); literature ids are Europe PMC identifiers.
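
Since the drugs field can hold either a single dict or a list of dicts, downstream code may want to normalize it first. A small sketch of one way to do that is shown below; `drug_names` is a hypothetical helper, not a gget function.

```python
def drug_names(drugs):
    """Return drug names whether `drugs` is one dict, a list of dicts, or empty."""
    if not drugs:
        return []
    if isinstance(drugs, dict):
        drugs = [drugs]
    return [drug["name"] for drug in drugs]


# Value taken from the example output above.
sunitinib = {"id": "CHEMBL535", "name": "SUNITINIB"}

print(drug_names(sunitinib))    # single dict
print(drug_names([sunitinib]))  # list of dicts
print(drug_names(None))         # missing value
```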



Get baseline expression of a gene across biosamples (tissues / cell types):

gget opentargets ENSG00000169194 -r expression -l 2
# Python
import gget
gget.opentargets('ENSG00000169194', resource='expression', limit=2)

→ Returns baseline expression summary statistics for the gene ENSG00000169194 per biosample.

| median | min | q1 | q3 | max | unit | datasourceId | datatypeId | tissueBiosample.biosampleId | tissueBiosample.biosampleName | celltypeBiosample.biosampleId | celltypeBiosample.biosampleName |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.066891 | 0 | 0.028268 | 0.142208 | 1.69407 | TPM | gtex | bulk rna-seq | UBERON_0000007 | pituitary gland | None | None |
| 0.0 | 0 | 0.0 | 0.0 | 0.0 | CPM(pseudobulk sum) | tabula_sapiens | scrna-seq | UBERON_0000016 | endocrine pancreas | CL_0000115 | endothelial cell |

Note: bulk sources (e.g. GTEx) populate only tissueBiosample.* (celltypeBiosample.* is None); single-cell sources (e.g. Tabula Sapiens) populate both.
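
The distinction above can be used to separate bulk from single-cell rows after the fact. The sketch below works on plain row dicts abbreviated from the example output; `is_single_cell` is a hypothetical helper. It assumes missing values appear as None, whereas in a pandas data frame they may instead appear as NaN, in which case a check such as `pandas.isna` would be needed.

```python
# Rows abbreviated from the example output above.
rows = [
    {
        "median": 0.066891,
        "unit": "TPM",
        "datasourceId": "gtex",
        "tissueBiosample.biosampleName": "pituitary gland",
        "celltypeBiosample.biosampleId": None,
    },
    {
        "median": 0.0,
        "unit": "CPM(pseudobulk sum)",
        "datasourceId": "tabula_sapiens",
        "tissueBiosample.biosampleName": "endocrine pancreas",
        "celltypeBiosample.biosampleId": "CL_0000115",
    },
]


def is_single_cell(row):
    """A row is treated as single-cell when a cell-type biosample is present."""
    return row.get("celltypeBiosample.biosampleId") is not None


bulk_rows = [row for row in rows if not is_single_cell(row)]
single_cell_rows = [row for row in rows if is_single_cell(row)]

print([row["datasourceId"] for row in bulk_rows])
print([row["datasourceId"] for row in single_cell_rows])
```

Keeping the two groups apart also avoids comparing medians reported in different units (TPM versus CPM in the rows above).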

Note (OpenTargets API change): OpenTargets retired the per-tissue target.expressions field (it now returns nothing) and moved baseline expression to the paginated target.baselineExpression field. gget opentargets -r expression now returns per-biosample expression summary statistics (median, min, q1, q3, max) from the current sources (e.g. GTEx bulk RNA-seq and single-cell datasets), with tissueBiosample/celltypeBiosample identifiers and datasourceId/datatypeId so results can be filtered. The returned columns therefore differ from earlier gget versions. A gene can have thousands of biosamples; OpenTargets returns at most 3000 per request, so for genes that exceed this a warning is logged and you should narrow the query with --filters (e.g. datasourceId or datatypeId). Use --limit to fetch fewer rows.



Get DepMap gene-disease effect data for a specific gene:

gget opentargets ENSG00000169194 -r depmap
# Python
import gget
gget.opentargets('ENSG00000169194', resource='depmap')

→ Returns DepMap gene-disease effect data for the gene ENSG00000169194.

| tissueId | tissueName | cellLineName | expression | diseaseFromSource | depmapId | geneEffect |
|---|---|---|---|---|---|---|
| UBERON_0000977 | pleura | NCI-H2452 | 0.0223550275 | Pleural Mesothelioma | ACH-000092 | 0.0368397422 |



Get protein-protein interactions for a specific gene:

gget opentargets ENSG00000169194 -r interactions -l 2
# Python
import gget
gget.opentargets('ENSG00000169194', resource='interactions', limit=2)

→ Returns the top 2 protein-protein interactions for the gene ENSG00000169194.

| score | count | sourceDatabase | intA | intABiologicalRole | intB | intBBiologicalRole | targetA.id | targetA.approvedSymbol | speciesA.taxonId | targetB.id | targetB.approvedSymbol | speciesB.taxonId |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.999 | 3 | string | ENSP00000304915 | unspecified role | ENSP00000360730 | unspecified role | ENSG00000169194 | IL13 | 134 | ENSG00000131724 | IL13RA1 | 134 |
| 0.999 | 3 | string | ENSP00000304915 | unspecified role | ENSP00000361004 | unspecified role | ENSG00000169194 | IL13 | 134 | ENSG00000123496 | IL13RA2 | 134 |



Get protein-protein interactions for a specific gene, filtered by column values:

Filters use the generic --filter key=value flag (CLI) / filters={...} argument (Python), where the keys are the exact returned column names. Multiple filters are combined with AND (exact equality). (The previous filter_mode/OR option and the -fpa/--filter_gene_b shortcuts were removed in v0.30.5.)

gget opentargets ENSG00000169194 -r interactions --filter sourceDatabase=string --filter targetB.approvedSymbol=IL13RA1
# Python
import gget
gget.opentargets('ENSG00000169194', resource='interactions', filters={'sourceDatabase': 'string', 'targetB.approvedSymbol': 'IL13RA1'})

→ Returns the string-sourced interactions of IL13 where the partner is IL13RA1.

| score | count | sourceDatabase | intA | intABiologicalRole | intB | intBBiologicalRole | targetA.id | targetA.approvedSymbol | speciesA.taxonId | targetB.id | targetB.approvedSymbol | speciesB.taxonId |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.999 | 3 | string | ENSP00000304915 | unspecified role | ENSP00000360730 | unspecified role | ENSG00000169194 | IL13 | 134 | ENSG00000131724 | IL13RA1 | 134 |
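
To make the correspondence between the command-line `key=value` pairs and the Python `filters` dictionary concrete, here is a small sketch that converts one form into the other. `parse_filters` is a hypothetical helper written for illustration; it is not gget's own argument parser and may differ from it in how malformed input is handled.

```python
def parse_filters(pairs):
    """Turn ['key=value', ...] into {'key': 'value', ...}."""
    filters = {}
    for pair in pairs:
        # Split on the first '=' only, so values may themselves contain '='.
        key, separator, value = pair.partition("=")
        if not separator or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        filters[key] = value
    return filters


# The two filters from the command-line example above.
cli_filters = ["sourceDatabase=string", "targetB.approvedSymbol=IL13RA1"]
filters = parse_filters(cli_filters)
print(filters)
```

The resulting dictionary has the same shape as the `filters={...}` argument in the Python call above.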

References

If you use gget opentargets in a publication, please cite the following articles:

  • Luebbert, L., & Pachter, L. (2023). Efficient querying of genomic reference databases with gget. Bioinformatics. https://doi.org/10.1093/bioinformatics/btac836

  • Ochoa D, Hercules A, Carmona M, Suveges D, Baker J, Malangone C, Lopez I, Miranda A, Cruz-Castillo C, Fumis L, Bernal-Llinares M, Tsukanov K, Cornu H, Tsirigos K, Razuvayevskaya O, Buniello A, Schwartzentruber J, Karim M, Ariano B, Martinez Osorio RE, Ferrer J, Ge X, Machlitt-Northen S, Gonzalez-Uriarte A, Saha S, Tirunagari S, Mehta C, Roldán-Romero JM, Horswell S, Young S, Ghoussaini M, Hulcoop DG, Dunham I, McDonagh EM. The next-generation Open Targets Platform: reimagined, redesigned, rebuilt. Nucleic Acids Res. 2023 Jan 6;51(D1):D1353-D1359. doi: 10.1093/nar/gkac1046. PMID: 36399499; PMCID: PMC9825572.