All CiliaHub data is freely accessible as static JSON files served over HTTPS — no authentication, no rate limits, no API keys. Load an endpoint directly in any language.
Eight files power the phenotype matcher. They are designed to be loaded together for symptom-to-gene matching, but each one is also independently useful.
| Endpoint | What it contains |
|---|---|
| phenotype_meta.json | Summary counts (records, diseases, HPO terms, concepts, organs) and source-file references. |
| disease_phenotype_profiles.json | Per-disease: classification, HPO IDs, organ-system counts, concept counts, associated genes, search aliases. |
| phenotype_index.json | Inverted HPO-ID index: each HPO term → diseases that exhibit it, parent concept, organ, information content (IC). |
| phenotype_concepts.json | Canonical-concept → HPO-ID list. The 275 canonical leaf concepts in our hierarchy. |
| phenotype_organs.json | Per-organ-system: diseases involved, concept count. |
| concept_aliases.json | Concept → list of free-text phrase patterns that match it (used for autocomplete fuzzy matching). |
| gene_to_diseases.json | Inverted gene → diseases index for gene-name search. |
| class_index.json | UI classification rollup: Primary / Motile / Secondary / Unclassified → disease lists. |
| disease_slugs.json | Disease name → URL slug for /disease/<slug> permalinks. |
| ciliahub_master_merged.json | Full gene catalogue: symbol, Ensembl, OMIM, UniProt, ciliopathy associations, mouse phenotype, mechanism notes, evidence PMIDs, localisation refs, expression data. |
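
The table describes phenotype_index.json as an inverted index carrying an information content (IC) value per HPO term. The following is a minimal sketch of how such an index could drive symptom-to-disease ranking; it is not CiliaHub's own matcher. It assumes each index entry is an object with a `diseases` list and a numeric `ic` field (both field names are hypothetical and should be checked against the real file), and that every phenotype file sits in the same `/data/phenotype/` directory as the files used in the examples on this page. The HPO IDs and IC values in the toy index are placeholders.

```python
import json
import urllib.request
from collections import defaultdict

# Assumption: all phenotype files share the directory used elsewhere on this
# page for class_index.json and gene_to_diseases.json.
PHENOTYPE_BASE = "https://ciliahub.org/data/phenotype/"


def load_phenotype_file(name):
    """Fetch one phenotype JSON file by file name and parse it."""
    with urllib.request.urlopen(PHENOTYPE_BASE + name) as response:
        return json.load(response)


def rank_diseases(index, hpo_ids):
    """Score diseases by the summed IC of the query terms they exhibit.

    Hypothetical schema: index[hpo_id] == {"diseases": [...], "ic": float}.
    HPO IDs missing from the index are skipped.
    """
    scores = defaultdict(float)
    for hpo_id in set(hpo_ids):
        entry = index.get(hpo_id)
        if entry is None:
            continue
        for disease in entry.get("diseases", []):
            scores[disease] += float(entry.get("ic", 0.0))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy index with made-up IDs and values so the sketch runs offline.
toy_index = {
    "HP:0000001": {"diseases": ["Disease A", "Disease B"], "ic": 1.5},
    "HP:0000002": {"diseases": ["Disease A"], "ic": 2.0},
}
ranking = rank_diseases(toy_index, ["HP:0000001", "HP:0000002", "HP:9999999"])
print(ranking)
```

With real data, `rank_diseases(load_phenotype_file("phenotype_index.json"), [...])` would be the equivalent call, provided the assumed field names match.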
Every gene in CiliaHub is a curated ciliary gene. The Gold Standard Ciliary Genes
are those with bona-fide ciliary localisation or function, marked in the gene catalogue by the
field evidence_tier == "Gold Standard Ciliary Genes" (the remainder are tagged
"Cilia-Associated Genes"). The list is open and free to all — no key required.
| Access method | How |
|---|---|
| Browser download | Use the Gold Standard Ciliary Genes CSV / JSON buttons on the home page. The export is a wide, per-gene table covering identity, ciliopathies, localisation, perturbation, orthologs, domains, and reference fields. |
| Programmatic | Filter the gene catalogue ciliahub_master_merged.json on evidence_tier (see snippets below). |
import urllib.request, json
with urllib.request.urlopen('https://ciliahub.org/data/genes/ciliahub_master_merged.json') as r:
    master = json.load(r)
genes = master['genes'] if isinstance(master, dict) and 'genes' in master else master
items = genes.values() if isinstance(genes, dict) else genes
gold = [g for g in items if g.get('evidence_tier') == 'Gold Standard Ciliary Genes']
print(len(gold), 'Gold Standard ciliary genes')
print([g.get('Gene') or g.get('gene') for g in gold[:10]])
# Gold Standard gene symbols, one per line
curl -s https://ciliahub.org/data/genes/ciliahub_master_merged.json \
| jq -r '(.genes // .) | .[]
| select(.evidence_tier=="Gold Standard Ciliary Genes") | (.Gene // .gene)'
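
Beyond extracting the Gold Standard subset, it can be useful to see how the whole catalogue splits across tiers. The sketch below tallies genes by `evidence_tier`, reusing the same defensive unwrapping as the Python snippet above (the catalogue may be a bare list, a dict of records, or wrapped under a `genes` key). The toy records and their gene symbols are placeholders, not real catalogue entries, and the `"(missing)"` label is an illustrative choice.

```python
from collections import Counter


def genes_as_list(master):
    """Unwrap the catalogue into a plain list of gene records."""
    genes = master["genes"] if isinstance(master, dict) and "genes" in master else master
    return list(genes.values()) if isinstance(genes, dict) else list(genes)


def tier_counts(master):
    """Count gene records per evidence_tier value."""
    return Counter(g.get("evidence_tier", "(missing)") for g in genes_as_list(master))


# Toy catalogue with placeholder symbols so the sketch runs offline.
toy_master = {
    "genes": [
        {"Gene": "GENE1", "evidence_tier": "Gold Standard Ciliary Genes"},
        {"Gene": "GENE2", "evidence_tier": "Gold Standard Ciliary Genes"},
        {"Gene": "GENE3", "evidence_tier": "Cilia-Associated Genes"},
    ]
}
counts = tier_counts(toy_master)
for tier, n in counts.most_common():
    print(n, tier)
```

Passing the `master` object loaded from ciliahub_master_merged.json instead of `toy_master` should give the real split.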
https://ciliahub.org/gene/<SYMBOL> (e.g. /gene/BBS1)
https://ciliahub.org/disease/<slug> (e.g. /disease/bardet-biedl-syndrome)

import urllib.request, json
# Load the disease profiles
with urllib.request.urlopen('https://ciliahub.org/data/phenotype/disease_phenotype_profiles.json') as r:
    profiles = json.load(r)
# Find diseases linked to HSD17B4
with urllib.request.urlopen('https://ciliahub.org/data/phenotype/gene_to_diseases.json') as r:
    g2d = json.load(r)
print(g2d['HSD17B4'])
# {'diseases': ['D-Bifunctional Protein Deficiency'], 'n_diseases': 1}
# Get all "Primary" class diseases
with urllib.request.urlopen('https://ciliahub.org/data/phenotype/class_index.json') as r:
    classes = json.load(r)
print(len(classes['Primary']), 'Primary diseases')
# List all genes in the Bardet-Biedl Syndrome profile
curl -s https://ciliahub.org/data/phenotype/disease_phenotype_profiles.json \
| jq -r '."Bardet-Biedl Syndrome".genes[]'
# Count Primary ciliopathies
curl -s https://ciliahub.org/data/phenotype/class_index.json | jq '.Primary | length'
# All diseases linked to CEP290
curl -s https://ciliahub.org/data/phenotype/gene_to_diseases.json | jq '.CEP290'
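
The gene index and the slug file can be chained to turn a gene symbol into disease permalinks. This is a sketch under two assumptions: gene_to_diseases.json entries have the `{'diseases': [...], 'n_diseases': N}` shape shown in the Python example, and disease_slugs.json is a flat name → slug mapping, as its table description suggests. The toy dictionaries stand in for the loaded files.

```python
from urllib.parse import quote

SITE = "https://ciliahub.org"


def gene_permalink(symbol):
    """Build the /gene/<SYMBOL> permalink for a gene symbol."""
    return f"{SITE}/gene/{quote(symbol)}"


def disease_permalinks_for_gene(symbol, g2d, slugs):
    """Map each disease linked to a gene onto its /disease/<slug> URL.

    Diseases without a slug entry are skipped rather than guessed.
    """
    links = {}
    for disease in g2d.get(symbol, {}).get("diseases", []):
        slug = slugs.get(disease)
        if slug is not None:
            links[disease] = f"{SITE}/disease/{quote(slug)}"
    return links


# Toy stand-ins for gene_to_diseases.json and disease_slugs.json.
toy_g2d = {"BBS1": {"diseases": ["Bardet-Biedl Syndrome"], "n_diseases": 1}}
toy_slugs = {"Bardet-Biedl Syndrome": "bardet-biedl-syndrome"}

links = disease_permalinks_for_gene("BBS1", toy_g2d, toy_slugs)
print(gene_permalink("BBS1"))
print(links)
```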
The phenotype data is regenerated from Supplementary Tables S1 (gene catalogue),
S2 (disease catalogue), and S5 (symptom classification). The phenotype_meta.json file
records the source spreadsheet version.
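
Because the files are regenerated, a client that caches them may want to notice when phenotype_meta.json changes. The sketch below compares two snapshots of that file key by key without assuming any particular field names; the keys in the toy snapshots (`records`, `source_version`, `organs`) are hypothetical and only illustrate the idea.

```python
_MISSING = object()  # sentinel so an absent key differs from a None value


def changed_keys(old_meta, new_meta):
    """Return the sorted top-level keys whose values differ between snapshots."""
    keys = set(old_meta) | set(new_meta)
    return sorted(
        key for key in keys
        if old_meta.get(key, _MISSING) != new_meta.get(key, _MISSING)
    )


# Toy snapshots with made-up keys and values.
old_snapshot = {"records": 100, "source_version": "v1"}
new_snapshot = {"records": 120, "source_version": "v2", "organs": 10}

diff = changed_keys(old_snapshot, new_snapshot)
print(diff)
```

An empty result would suggest the cached copies are still current; any non-empty result is a cue to reload the other files.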