CiliaHub Data API

All CiliaHub data is freely accessible as static JSON files served over HTTPS — no authentication, no rate limits, no API keys. Load an endpoint directly in any language.

No auth | No rate limits | Static JSON over HTTPS | CC-BY
Citation: If you use this data in published work, please cite the CiliaHub catalogue paper (manuscript in preparation; preprint pending).

Phenotype data endpoints

The files listed below power the phenotype matcher. They are designed to be loaded together for symptom-to-gene matching, but each one is also independently useful.

Endpoint                          What it contains
phenotype_meta.json               Summary counts (records, diseases, HPO terms, concepts, organs) and source-file references.
disease_phenotype_profiles.json   Per-disease: classification, HPO IDs, organ-system counts, concept counts, associated genes, search aliases.
phenotype_index.json              Inverted HPO-ID index: each HPO term → diseases that exhibit it, parent concept, organ, information content (IC).
phenotype_concepts.json           Canonical-concept → HPO-ID list. The 275 canonical leaf concepts in our hierarchy.
phenotype_organs.json             Per-organ-system: diseases involved, concept count.
concept_aliases.json              Concept → list of free-text phrase patterns that match it (used for autocomplete fuzzy matching).
gene_to_diseases.json             Inverted gene → diseases index for gene-name search.
class_index.json                  UI classification rollup: Primary / Motile / Secondary / Unclassified → disease lists.
disease_slugs.json                Disease name → URL slug for /disease/<slug> permalinks.
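
Example: symptom-to-disease scoring sketch (Python)

The sketch below illustrates one way the inverted index could be used for matching: sum the information content (IC) of every query HPO term that a disease exhibits, then rank. It is not the matcher's actual algorithm. The field names "diseases" and "ic" are assumptions about the shape of phenotype_index.json, and the toy index uses placeholder IDs and disease names, not real data. In practice the index would be loaded from https://ciliahub.org/data/phenotype/phenotype_index.json.

```python
from collections import defaultdict

def rank_diseases(query_hpo_ids, index):
    """Rank diseases by the summed IC of the query terms they exhibit.

    `index` is assumed to map HPO ID -> {"diseases": [...], "ic": float}.
    These key names are assumptions, not confirmed field names.
    """
    scores = defaultdict(float)
    for hpo_id in query_hpo_ids:
        entry = index.get(hpo_id)
        if entry is None:
            continue  # term not present in the index
        for disease in entry.get("diseases", []):
            scores[disease] += float(entry.get("ic", 0.0))
    # Highest score first; ties broken alphabetically for stable output
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy stand-in for phenotype_index.json (placeholder IDs and names)
toy_index = {
    "HP:TOY_1": {"diseases": ["Disease A", "Disease B"], "ic": 1.0},
    "HP:TOY_2": {"diseases": ["Disease A"], "ic": 2.5},
}

ranked = rank_diseases(["HP:TOY_1", "HP:TOY_2", "HP:TOY_MISSING"], toy_index)
print(ranked)
```

Weighting by IC means a rare, specific term contributes more to a disease's score than a common one. Inspect the real file's keys before relying on the assumed field names.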

Gene catalogue endpoint

ciliahub_master_merged.json       Full gene catalogue: symbol, Ensembl, OMIM, UniProt, ciliopathy associations, mouse phenotype, mechanism notes, evidence PMIDs, localisation refs, expression data.

🥇 Gold Standard Ciliary Genes

Every gene in CiliaHub is a curated ciliary gene. The Gold Standard Ciliary Genes are those with bona-fide ciliary localisation or function, marked in the gene catalogue by the field evidence_tier == "Gold Standard Ciliary Genes" (the remainder are tagged "Cilia-Associated Genes"). The list is open and free to all — no key required.

Browser download                  Use the Gold Standard Ciliary Genes CSV / JSON buttons on the home page. The export is a wide, per-gene table covering identity, ciliopathies, localisation, perturbation, orthologs, domains, and reference fields.
Programmatic                      Filter the gene catalogue ciliahub_master_merged.json on evidence_tier (see snippets below).

Example: extract the Gold Standard list (Python)

import urllib.request, json

with urllib.request.urlopen('https://ciliahub.org/data/genes/ciliahub_master_merged.json') as r:
    master = json.load(r)

genes = master['genes'] if isinstance(master, dict) and 'genes' in master else master
items  = genes.values() if isinstance(genes, dict) else genes
gold   = [g for g in items if g.get('evidence_tier') == 'Gold Standard Ciliary Genes']

print(len(gold), 'Gold Standard ciliary genes')
print([g.get('Gene') or g.get('gene') for g in gold[:10]])

Example: extract the Gold Standard list (curl + jq)

# Gold Standard gene symbols, one per line
curl -s https://ciliahub.org/data/genes/ciliahub_master_merged.json \
  | jq -r '(.genes // .) | .[]
           | select(.evidence_tier=="Gold Standard Ciliary Genes") | (.Gene // .gene)'

Permalinks
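
Each disease has a page at /disease/<slug>, with the slug taken from disease_slugs.json. A minimal sketch of building a permalink follows. It assumes the pages are served from the same host as the data files (https://ciliahub.org) and that disease_slugs.json lives alongside the other phenotype files. The helper name disease_permalink and the toy entry are hypothetical.

```python
from urllib.parse import quote

SITE = "https://ciliahub.org"  # assumed host for /disease/<slug> pages

def disease_permalink(name, slugs):
    """Return the permalink for a disease name, or None if it has no slug."""
    slug = slugs.get(name)
    if slug is None:
        return None
    # quote() guards against characters that are not URL-safe
    return f"{SITE}/disease/{quote(slug)}"

# Toy stand-in for disease_slugs.json (placeholder entry, not real data)
toy_slugs = {"Example Disease": "example-disease"}

link = disease_permalink("Example Disease", toy_slugs)
print(link)
```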

Example: Python

import urllib.request, json

# Load the disease profiles (keyed by disease name)
with urllib.request.urlopen('https://ciliahub.org/data/phenotype/disease_phenotype_profiles.json') as r:
    profiles = json.load(r)
print(len(profiles), 'disease profiles')

# Find diseases linked to HSD17B4
with urllib.request.urlopen('https://ciliahub.org/data/phenotype/gene_to_diseases.json') as r:
    g2d = json.load(r)
print(g2d['HSD17B4'])
# {'diseases': ['D-Bifunctional Protein Deficiency'], 'n_diseases': 1}

# Get all "Primary" class diseases
with urllib.request.urlopen('https://ciliahub.org/data/phenotype/class_index.json') as r:
    classes = json.load(r)
print(len(classes['Primary']), 'Primary diseases')
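
Example: joining two endpoints (Python)

The files are designed to be loaded together, so a natural next step is a join. The sketch below finds every other gene that shares a disease profile with a given gene. It relies on the shapes shown in the examples on this page: gene_to_diseases.json entries carry a "diseases" list, and each profile carries a "genes" list. The function name genes_sharing_a_disease is hypothetical, and the toy dictionaries use placeholder names in place of the downloaded files.

```python
def genes_sharing_a_disease(gene, g2d, profiles):
    """Return other genes listed in any disease profile linked to `gene`."""
    linked = g2d.get(gene, {}).get("diseases", [])
    partners = set()
    for disease in linked:
        partners.update(profiles.get(disease, {}).get("genes", []))
    partners.discard(gene)  # exclude the query gene itself
    return sorted(partners)

# Toy stand-ins for gene_to_diseases.json and disease_phenotype_profiles.json
toy_g2d = {"GENE1": {"diseases": ["Disease A"], "n_diseases": 1}}
toy_profiles = {"Disease A": {"genes": ["GENE1", "GENE2", "GENE3"]}}

partners = genes_sharing_a_disease("GENE1", toy_g2d, toy_profiles)
print(partners)
```

With the real data, pass the g2d and profiles objects loaded above in place of the toy dictionaries.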

Example: curl + jq

# List all genes in the Bardet-Biedl Syndrome profile
curl -s https://ciliahub.org/data/phenotype/disease_phenotype_profiles.json \
  | jq -r '."Bardet-Biedl Syndrome".genes[]'

# Count Primary ciliopathies
curl -s https://ciliahub.org/data/phenotype/class_index.json | jq '.Primary | length'

# All diseases linked to CEP290
curl -s https://ciliahub.org/data/phenotype/gene_to_diseases.json | jq '.CEP290'

Data freshness

The phenotype data is regenerated from Supplementary Tables S1 (gene catalogue), S2 (disease catalogue), and S5 (symptom classification). The phenotype_meta.json file records the source spreadsheet version.
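
Example: detecting a regenerated dataset (Python)

A client that caches the files locally can compare its stored copy of phenotype_meta.json with the live one to decide whether to re-download. The sketch below fingerprints the whole meta object rather than reading a specific version field, because the exact key names in phenotype_meta.json are not documented here. The key "source_version" in the toy data is a placeholder, and the helper names are hypothetical.

```python
import hashlib
import json

def meta_fingerprint(meta):
    """Hash a canonical JSON encoding so key order does not matter."""
    canonical = json.dumps(meta, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def needs_refresh(cached_meta, live_meta):
    """True when the live meta file differs from the cached copy."""
    return meta_fingerprint(cached_meta) != meta_fingerprint(live_meta)

# Toy stand-ins for two downloads of phenotype_meta.json (placeholder keys)
cached = {"source_version": "v1", "diseases": 10}
same_reordered = {"diseases": 10, "source_version": "v1"}
regenerated = {"source_version": "v2", "diseases": 12}

print(needs_refresh(cached, same_reordered))
print(needs_refresh(cached, regenerated))
```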