Python arguments are equivalent to long-option arguments (--arg), unless otherwise specified. Flags are True/False arguments in Python. The manual for any gget tool can be called from the command-line using the -h --help flag.
gget opentargets 🎯
Fetch associated diseases or drugs from OpenTargets using Ensembl IDs.
Return format: JSON/CSV (command-line) or data frame (Python).
This module was written by Sam Wagenaar.
⚠️ Change as of gget v0.30.8: resource="expression" output is different. OpenTargets retired the old target.expressions field, so gget opentargets -r expression now returns OpenTargets' baselineExpression data instead. The output columns changed: results are now per-biosample (tissue and/or cell type) summary statistics (median, min, q1, q3, max, unit) with tissueBiosample.* / celltypeBiosample.* identifiers and datasourceId / datatypeId, replacing the old per-tissue tissue.* / rna.* (z-score) columns, which no longer exist upstream. A gene can have thousands of biosamples; use --filters (e.g. datasourceId, datatypeId) or --limit to narrow the result. See the baseline expression example below.
Positional argument
ens_id
Ensembl gene ID, e.g. ENSG00000169194.
Optional arguments
-r --resource
Defines the type of information to return in the output. Default: 'diseases'.
Possible resources are:
| Resource | Return Value | Valid Filters | Sources |
|---|---|---|---|
| diseases | Associated diseases, phenotypes & traits (EFO-mapped) with the overall association score (0–1) | None | Various |
| drugs | Associated drugs | drug.drugType, drug.maximumClinicalStage | ChEMBL |
| tractability | Tractability data | None | Open Targets |
| pharmacogenetics | Pharmacogenetic responses | datasourceId, pgxCategory, evidenceLevel | PharmGKB |
| expression | Baseline expression per biosample (tissue/cell type) with summary statistics | tissueBiosample.biosampleId, datasourceId, datatypeId | |
| depmap | DepMap gene→disease-effect data | tissueId, diseaseFromSource | DepMap Portal |
| interactions | Protein⇄protein interactions | sourceDatabase, targetB.id, targetB.approvedSymbol | |
-l --limit
Limit the number of results, e.g. 10. Default: No limit.
Note: Not compatible with the tractability and depmap resources.
-o --out
Path to the JSON file the results will be saved in, e.g. path/to/directory/results.json. Default: Standard out.
Python: save=True will save the output in the current working directory.
--filters
Filter results by exact equality using returned OpenTargets column names. Pass multiple filters by repeating the flag, e.g. '--filter disease.id=EFO_0000274 --filter drug.id=CHEMBL1743081'. Nested fields use dot notation, matching the column names returned by the API.
Flags
-csv --csv
Command-line only. Returns the output in CSV format, instead of JSON format.
Python: Use json=True to return output in JSON format.
-q --quiet
Command-line only. Prevents progress information from being displayed.
Python: Use verbose=False to prevent progress information from being displayed.
-or --or
Command-line only. Filters are combined with OR logic. Default: AND logic.
wrap_text
Python only. wrap_text=True displays data frame with wrapped text for easy reading (default: False).
Examples
Get associated diseases for a specific gene:
gget opentargets ENSG00000169194 -r diseases -l 1
# Python
import gget
gget.opentargets('ENSG00000169194', resource='diseases', limit=1)
→ Returns the top disease associated with the gene ENSG00000169194.
| score | disease.id | disease.name | disease.description |
|---|---|---|---|
| 0.7279798021712002 | MONDO_0004980 | atopic eczema | A common chronic pruritic inflammatory skin disease with a strong ... |
Understanding the score column and disease IDs (issue #168)
score is OpenTargets' overall target–disease association score: a single value between 0 and 1 that aggregates the evidence across all data types and data sources. It is not a per-data-source score. The OpenTargets website additionally shows a per-data-type breakdown (genetic associations, somatic mutations, known drugs, animal models, etc.) in its "Associations" view; gget opentargets currently returns only the aggregated score, so a single gget row corresponds to one whole row of that web table.
disease.id is an EFO ontology ID. OpenTargets maps every associated trait to EFO, which imports terms from several ontologies. As a result, the returned IDs are not exclusively diseases: they can be MONDO disease terms (MONDO_*), Human Phenotype Ontology phenotypes (HP_*), Orphanet rare diseases (Orphanet_*), or EFO measurements/traits (EFO_*, e.g. biomarker or blood-measurement traits). gget opentargets returns the associations exactly as OpenTargets reports them, without dropping non-disease terms. To keep only MONDO disease terms, filter the returned data frame, e.g. df[df["disease.id"].str.startswith("MONDO")].
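To see which ontologies are represented before filtering, the ID prefixes can be tallied. The following is a minimal sketch that assumes the results have been converted to a list of row dictionaries (for example with pandas' df.to_dict("records")). The rows, the scores, and the HP_ identifier are placeholders for illustration, not real gget output.

```python
from collections import Counter

# Illustrative rows only: scores are rounded or made up, and the HP_ ID is a
# placeholder. Real rows come from gget.opentargets(..., resource='diseases').
rows = [
    {"disease.id": "MONDO_0004980", "score": 0.73},
    {"disease.id": "HP_0000000", "score": 0.41},
    {"disease.id": "EFO_0000274", "score": 0.22},
]


def ontology_prefix(term_id):
    """Return the part of an ID before the first underscore, e.g. 'MONDO'."""
    return term_id.split("_", 1)[0]


# Count how many associations come from each ontology.
prefix_counts = Counter(ontology_prefix(row["disease.id"]) for row in rows)

# Keep only MONDO disease terms (same intent as the data frame filter above).
mondo_only = [row for row in rows if ontology_prefix(row["disease.id"]) == "MONDO"]

print(dict(prefix_counts))
print([row["disease.id"] for row in mondo_only])
```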
Get associated drugs for a specific gene:
gget opentargets ENSG00000169194 -r drugs -l 2
# Python
import gget
gget.opentargets('ENSG00000169194', resource='drugs', limit=2)
→ Returns the top 2 drugs associated with the gene ENSG00000169194.
| drug.id | drug.name | drug.drugType | drug.mechanismsOfAction.rows | drug.description | drug.synonyms | drug.tradeNames | drug.maximumClinicalStage | drug.indications.rows |
|---|---|---|---|---|---|---|---|---|
| CHEMBL1743035 | LEBRIKIZUMAB | Antibody | Interleukin‑13 inhibitor | Antibody drug with a maximum clinical stage of Approval ... | ['Lebrikizumab', 'MILR-1444A', ...] | Ebglyss | APPROVAL | [{'id': 'MONDO_0004980', 'name': 'atopic eczema'}, {'id': 'MONDO_0004979', 'name': 'asthma'}, ...] |
| CHEMBL1742985 | ANRUKINZUMAB | Antibody | Interleukin‑13 inhibitor | Antibody drug with a maximum clinical stage of Phase 2 ... | ['Anrukinzumab', 'IMA-638'] | [] | PHASE2 | [{'id': 'MONDO_0005101', 'name': 'ulcerative colitis'}, ...] |
Note: associated diseases/indications are nested under drug.indications.rows (each a {'id', 'name'} dict); the drug.id values are ChEMBL identifiers.
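Because the indications are nested, one row per drug–indication pair is often easier to work with. The sketch below flattens the nested list, assuming rows are available as dictionaries keyed by the column names above. The helper name flatten_indications is hypothetical, and the input is abbreviated from the example output (the lists there are truncated with "...").

```python
# Rows abbreviated from the example output above; only the listed columns are kept.
drug_rows = [
    {
        "drug.id": "CHEMBL1743035",
        "drug.name": "LEBRIKIZUMAB",
        "drug.indications.rows": [
            {"id": "MONDO_0004980", "name": "atopic eczema"},
            {"id": "MONDO_0004979", "name": "asthma"},
        ],
    },
    {
        "drug.id": "CHEMBL1742985",
        "drug.name": "ANRUKINZUMAB",
        "drug.indications.rows": [
            {"id": "MONDO_0005101", "name": "ulcerative colitis"},
        ],
    },
]


def flatten_indications(rows):
    """Return one dict per (drug, indication) pair (hypothetical helper)."""
    pairs = []
    for row in rows:
        # Treat a missing or empty indications list as "no indications".
        for indication in row.get("drug.indications.rows") or []:
            pairs.append(
                {
                    "drug.id": row["drug.id"],
                    "drug.name": row["drug.name"],
                    "indication.id": indication["id"],
                    "indication.name": indication["name"],
                }
            )
    return pairs


flat = flatten_indications(drug_rows)
for pair in flat:
    print(pair["drug.name"], pair["indication.id"], pair["indication.name"])
```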
Get tractability data for a specific gene:
gget opentargets ENSG00000169194 -r tractability
# Python
import gget
gget.opentargets('ENSG00000169194', resource='tractability')
→ Returns tractability data for the gene ENSG00000169194.
| modality | label | value |
|---|---|---|
| SM | Approved Drug | False |
| SM | High-Quality Pocket | True |
| AB | Approved Drug | True |
| AB | Advanced Clinical | False |
Note: modality is SM (small molecule), AB (antibody), PR (PROTAC), OC (other clinical), etc.; value is a boolean indicating whether the target meets that tractability bucket.
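To summarize which tractability buckets a target meets, the rows can be grouped by modality. This is a minimal sketch using the four rows shown above as plain dictionaries; the variable names are illustrative and not part of gget.

```python
from collections import defaultdict

# The four example rows from the table above.
tractability_rows = [
    {"modality": "SM", "label": "Approved Drug", "value": False},
    {"modality": "SM", "label": "High-Quality Pocket", "value": True},
    {"modality": "AB", "label": "Approved Drug", "value": True},
    {"modality": "AB", "label": "Advanced Clinical", "value": False},
]

# Collect, per modality, the labels of the buckets the target meets.
met_buckets = defaultdict(list)
for row in tractability_rows:
    if row["value"]:
        met_buckets[row["modality"]].append(row["label"])

for modality, labels in met_buckets.items():
    print(modality, labels)
```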
Get pharmacogenetic responses for a specific gene:
gget opentargets ENSG00000169194 -r pharmacogenetics -l 1
# Python
import gget
gget.opentargets('ENSG00000169194', resource='pharmacogenetics', limit=1)
→ Returns pharmacogenetic responses for the gene ENSG00000169194.
| variantId | genotypeId | genotype | drugs | phenotypeText | genotypeAnnotationText | pgxCategory | isDirectTarget | evidenceLevel | datasourceId | literature | variantFunctionalConsequence.id | variantFunctionalConsequence.label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5_132657117_C_T | 5_132657117_C_C,T | CT | {'id': 'CHEMBL535', 'name': 'SUNITINIB'} | decreased severity of drug-induced toxicity | Patients with renal cell carcinoma and ... | toxicity | False | 3 | clinpgx | 26387812 | SO:0001631 | upstream_gene_variant |
Note: drugs is a {'id', 'name'} dict (or list of them); literature ids are Europe PMC identifiers.
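Since drugs can be either a single dict or a list of dicts, downstream code may want to normalize it first. The helper below is a hypothetical sketch (drug_names is not a gget function). The second entry in the list example is a placeholder used only to show the list case.

```python
def drug_names(drugs):
    """Return drug names from a {'id', 'name'} dict, a list of them, or None."""
    if drugs is None:
        return []
    if isinstance(drugs, dict):
        # Wrap a single dict so both shapes are handled the same way.
        drugs = [drugs]
    return [drug["name"] for drug in drugs]


# Single-dict case, taken from the example row above.
print(drug_names({"id": "CHEMBL535", "name": "SUNITINIB"}))

# List case; the second entry is a made-up placeholder.
print(drug_names([
    {"id": "CHEMBL535", "name": "SUNITINIB"},
    {"id": "CHEMBL_PLACEHOLDER", "name": "EXAMPLE_DRUG"},
]))
```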
Get baseline expression of a gene across biosamples (tissues / cell types):
gget opentargets ENSG00000169194 -r expression -l 2
# Python
import gget
gget.opentargets('ENSG00000169194', resource='expression', limit=2)
→ Returns baseline expression summary statistics for the gene ENSG00000169194 per biosample.
| median | min | q1 | q3 | max | unit | datasourceId | datatypeId | tissueBiosample.biosampleId | tissueBiosample.biosampleName | celltypeBiosample.biosampleId | celltypeBiosample.biosampleName |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.066891 | 0 | 0.028268 | 0.142208 | 1.69407 | TPM | gtex | bulk rna-seq | UBERON_0000007 | pituitary gland | None | None |
| 0.0 | 0 | 0.0 | 0.0 | 0.0 | CPM(pseudobulk sum) | tabula_sapiens | scrna-seq | UBERON_0000016 | endocrine pancreas | CL_0000115 | endothelial cell |
Note: bulk sources (e.g. GTEx) populate only tissueBiosample.* (celltypeBiosample.* is None); single-cell sources (e.g. Tabula Sapiens) populate both.
Note (OpenTargets API change): OpenTargets retired the per-tissue target.expressions field (it now returns nothing) and moved baseline expression to the paginated target.baselineExpression field. gget opentargets -r expression now returns per-biosample expression summary statistics (median, min, q1, q3, max) from the current sources (e.g. GTEx bulk RNA-seq and single-cell datasets), with tissueBiosample / celltypeBiosample identifiers and datasourceId / datatypeId so results can be filtered. The returned columns therefore differ from earlier gget versions. A gene can have thousands of biosamples; OpenTargets returns at most 3000 per request, so for genes that exceed this a warning is logged and you should narrow the query with --filters (e.g. datasourceId or datatypeId). Use --limit to fetch fewer rows.
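Once results are returned, bulk and single-cell rows can be separated by checking whether the cell-type identifier is populated. The sketch below uses the two example rows above as plain dictionaries (a subset of columns) and assumes missing values are None, as displayed in the table. In a pandas data frame they may instead appear as NaN, in which case the check would need adjusting.

```python
# The two example rows from the table above, reduced to a few columns.
expression_rows = [
    {
        "median": 0.066891,
        "unit": "TPM",
        "datasourceId": "gtex",
        "datatypeId": "bulk rna-seq",
        "tissueBiosample.biosampleName": "pituitary gland",
        "celltypeBiosample.biosampleId": None,
    },
    {
        "median": 0.0,
        "unit": "CPM(pseudobulk sum)",
        "datasourceId": "tabula_sapiens",
        "datatypeId": "scrna-seq",
        "tissueBiosample.biosampleName": "endocrine pancreas",
        "celltypeBiosample.biosampleId": "CL_0000115",
    },
]


def has_cell_type(row):
    """True when the row carries a cell-type biosample (assumes None = missing)."""
    return row.get("celltypeBiosample.biosampleId") is not None


bulk_rows = [row for row in expression_rows if not has_cell_type(row)]
single_cell_rows = [row for row in expression_rows if has_cell_type(row)]

print([row["datasourceId"] for row in bulk_rows])
print([row["datasourceId"] for row in single_cell_rows])
```

Note that units differ between sources (TPM versus CPM in the rows above), so medians from different data sources are not directly comparable.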
Get DepMap gene-disease effect data for a specific gene:
gget opentargets ENSG00000169194 -r depmap
# Python
import gget
gget.opentargets('ENSG00000169194', resource='depmap')
→ Returns DepMap gene-disease effect data for the gene ENSG00000169194.
| tissueId | tissueName | cellLineName | expression | diseaseFromSource | depmapId | geneEffect |
|---|---|---|---|---|---|---|
| UBERON_0000977 | pleura | NCI-H2452 | 0.0223550275 | Pleural Mesothelioma | ACH-000092 | 0.0368397422 |
Get protein-protein interactions for a specific gene:
gget opentargets ENSG00000169194 -r interactions -l 2
# Python
import gget
gget.opentargets('ENSG00000169194', resource='interactions', limit=2)
→ Returns the top 2 protein-protein interactions for the gene ENSG00000169194.
| score | count | sourceDatabase | intA | intABiologicalRole | intB | intBBiologicalRole | targetA.id | targetA.approvedSymbol | speciesA.taxonId | targetB.id | targetB.approvedSymbol | speciesB.taxonId |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.999 | 3 | string | ENSP00000304915 | unspecified role | ENSP00000360730 | unspecified role | ENSG00000169194 | IL13 | 134 | ENSG00000131724 | IL13RA1 | 134 |
| 0.999 | 3 | string | ENSP00000304915 | unspecified role | ENSP00000361004 | unspecified role | ENSG00000169194 | IL13 | 134 | ENSG00000123496 | IL13RA2 | 134 |
Get protein-protein interactions for a specific gene, filtered by column values:
Filters use the generic --filter key=value flag (CLI) / filters={...} argument (Python), where the keys are the exact returned column names. Multiple filters are combined with AND (exact equality). (The previous filter_mode/OR option and the -fpa/--filter_gene_b shortcuts were removed in v0.30.5.)
gget opentargets ENSG00000169194 -r interactions --filter sourceDatabase=string --filter targetB.approvedSymbol=IL13RA1
# Python
import gget
gget.opentargets('ENSG00000169194', resource='interactions', filters={'sourceDatabase': 'string', 'targetB.approvedSymbol': 'IL13RA1'})
→ Returns the string-sourced interactions of IL13 where the partner is IL13RA1.
| score | count | sourceDatabase | intA | intABiologicalRole | intB | intBBiologicalRole | targetA.id | targetA.approvedSymbol | speciesA.taxonId | targetB.id | targetB.approvedSymbol | speciesB.taxonId |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.999 | 3 | string | ENSP00000304915 | unspecified role | ENSP00000360730 | unspecified role | ENSG00000169194 | IL13 | 134 | ENSG00000131724 | IL13RA1 | 134 |
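The filter semantics described above (exact equality on column names, combined with AND) can be illustrated on plain dictionaries. This is only a sketch of the behavior as documented here, not gget's implementation; apply_filters is a hypothetical helper and the rows are reduced to the relevant columns.

```python
# The two interaction rows from the unfiltered example, reduced to a few columns.
interaction_rows = [
    {"sourceDatabase": "string", "targetB.id": "ENSG00000131724", "targetB.approvedSymbol": "IL13RA1"},
    {"sourceDatabase": "string", "targetB.id": "ENSG00000123496", "targetB.approvedSymbol": "IL13RA2"},
]


def apply_filters(rows, filters):
    """Keep rows where every filter key exactly equals its value (AND logic)."""
    return [
        row
        for row in rows
        if all(row.get(column) == wanted for column, wanted in filters.items())
    ]


filtered = apply_filters(
    interaction_rows,
    {"sourceDatabase": "string", "targetB.approvedSymbol": "IL13RA1"},
)
print([row["targetB.id"] for row in filtered])
```

Because matching is by exact equality, a value such as "il13ra1" or a partial string would not match in this sketch.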
References
If you use gget opentargets in a publication, please cite the following articles:
- Luebbert, L., & Pachter, L. (2023). Efficient querying of genomic reference databases with gget. Bioinformatics. https://doi.org/10.1093/bioinformatics/btac836
- Ochoa D, Hercules A, Carmona M, Suveges D, Baker J, Malangone C, Lopez I, Miranda A, Cruz-Castillo C, Fumis L, Bernal-Llinares M, Tsukanov K, Cornu H, Tsirigos K, Razuvayevskaya O, Buniello A, Schwartzentruber J, Karim M, Ariano B, Martinez Osorio RE, Ferrer J, Ge X, Machlitt-Northen S, Gonzalez-Uriarte A, Saha S, Tirunagari S, Mehta C, Roldán-Romero JM, Horswell S, Young S, Ghoussaini M, Hulcoop DG, Dunham I, McDonagh EM. The next-generation Open Targets Platform: reimagined, redesigned, rebuilt. Nucleic Acids Res. 2023 Jan 6;51(D1):D1353-D1359. doi: 10.1093/nar/gkac1046. PMID: 36399499; PMCID: PMC9825572.