uniprot-mcp
A Model Context Protocol server for the UniProt protein knowledgebase. 41 tools. Every successful response carries a verifiable `Provenance` record (release · timestamp · URL · canonical SHA-256) that the agent (or a third party, a year later) can re-check with `uniprot_provenance_verify`.
What it is
uniprot-mcp exposes UniProt's REST surface — and a curated set of
clinical and structural-biology compositions — as typed, agent-safe
MCP tools. Two design choices set it apart:
- Provenance on every response. Markdown footer, JSON envelope, PIR-style FASTA header: pick your format, the same record is embedded. The `uniprot_provenance_verify` tool re-fetches the URL and compares the recorded release tag and canonical response SHA-256 against today's UniProt; you get one of five distinct verdicts (`verified` / `release_drift` / `hash_drift` / `release_and_hash_drift` / `url_unreachable`) with advice strings.
- Pre-registered, third-party-reproducible benchmark. 30 prompts (Tier A/B/C × 10) sealed via SHA-256 commitments on `main`. A reviewer runs `python tests/benchmark/verify_answers.py` to re-derive every answer from the live UniProt REST API in two commands.
Together these mean a regulated user can take any prior uniprot-mcp
answer and prove — without contacting the author — that UniProt
still returns the same bytes.
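The verification step described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Provenance` field names, the `canonical_sha256` canonicalization (JSON with sorted keys and compact separators), and the `classify` function are all assumptions made for the example. Only the five verdict names come from the text above.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record shape; the real field names may differ."""
    release: str       # UniProt release tag, e.g. "2025_01"
    retrieved_at: str  # ISO-8601 timestamp of the original fetch
    url: str           # URL that was queried
    sha256: str        # hex digest of the canonicalized response


def canonical_sha256(payload: object) -> str:
    """Hash a JSON payload after an ASSUMED canonicalization step."""
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(
    recorded: Provenance,
    live_release: str | None,
    live_payload: object | None,
) -> str:
    """Map a re-fetch result onto one of the five verdicts."""
    if live_release is None or live_payload is None:
        return "url_unreachable"
    release_ok = live_release == recorded.release
    hash_ok = canonical_sha256(live_payload) == recorded.sha256
    if release_ok and hash_ok:
        return "verified"
    if not release_ok and not hash_ok:
        return "release_and_hash_drift"
    return "hash_drift" if release_ok else "release_drift"
```

Because the hash is taken over a canonical form rather than the raw bytes, key order in the re-fetched JSON does not affect the verdict in this sketch.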
Tool surface (41)
| Family | Tools | Question it answers |
|---|---|---|
| Core UniProtKB | 10 | "what does UniProt say about this protein?" |
| Controlled vocabularies | 4 | "what's the canonical KW / SL identifier for X?" |
| Sequence archives & clusters | 4 | "what's in this UniRef cluster / UniParc record?" |
| Proteomes & literature | 4 | "what is the human reference proteome / who cites this entry?" |
| Structured cross-DB resolvers | 4 | "what PDB / AlphaFold / InterPro / ChEMBL records exist?" |
| Biomedical features | 7 | "what's at residue 175 / is R175H known / what diseases / what's the chemistry / where are the active sites / how is it processed / what PTMs?" |
| Cross-origin enrichment | 3 | "AlphaFold pLDDT / ClinVar significance / publications" |
| Composition + provenance | 5 | "give me a full target dossier / verify a recorded provenance / replay from cache / orthology / evidence-quality summary" |
When to use it
- You're building an LLM agent that needs citable UniProt data — not just text the model paraphrases.
- You're writing a regulatory or clinical-research workflow that requires per-query auditability.
- You want reproducibility across releases: pin via `--pin-release=YYYY_MM` and the client refuses any drift.
- You want to stop writing UniProt URL strings by hand and let a typed surface handle accession validation, retry, error envelopes, and rate-limit politeness.
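To make the accession-validation and release-pinning points concrete, here is a small sketch of what such checks could look like. The regular expression is the accession pattern UniProt publishes in its help pages, and `X-UniProt-Release` is a header returned by the UniProt REST API. The function names are hypothetical, and whether uniprot-mcp implements its checks this way is an assumption.

```python
import re

# Accession pattern as documented by UniProt (6- and 10-character forms).
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# Release tags have the form YYYY_MM, e.g. "2025_01".
RELEASE_RE = re.compile(r"\d{4}_\d{2}")


def is_valid_accession(value: str) -> bool:
    """Return True if the whole string is a well-formed UniProtKB accession."""
    return ACCESSION_RE.fullmatch(value) is not None


def release_matches_pin(pinned: str, response_headers: dict[str, str]) -> bool:
    """Compare a pinned release against the release reported by the server.

    Hypothetical helper: a caller would refuse the response when this
    returns False.
    """
    if RELEASE_RE.fullmatch(pinned) is None:
        raise ValueError(f"release pin must look like YYYY_MM, got {pinned!r}")
    # HTTP header names are case-insensitive, so normalize before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    return headers.get("x-uniprot-release") == pinned
```

Rejecting a malformed accession before any request is sent keeps bad input from turning into an avoidable HTTP error.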
Quickstart
See the quickstart guide. One-liner:
```bash
pip install uniprot-mcp-server   # PyPI distribution; console script is `uniprot-mcp`
uniprot-mcp --self-test
```
What this project explicitly is not
- Not an orchestrator across multiple bio data sources. That's the BUSL-1.1 layer in `topologica-bio`. `uniprot-mcp` stays a permissive Apache-2.0 gateway forever.
- Not a structure-prediction service. AlphaFold confidence is surfaced; structure files (CIF/PDB) are URLs the agent can fetch separately.
- Not a variant-effect predictor. ClinVar significance is surfaced; functional impact prediction belongs in dedicated tools like Ensembl VEP.
- Not a free-tier / freemium ladder. Every tool is and remains Apache-2.0.
Status
| Layer | State |
|---|---|
| Tool surface | 41 tools across 8 families |
| Tests | 874 offline + 44 live integration |
| Static analysis | mypy strict + ruff + bandit + pip-audit clean |
| Provenance verification | Live round-trip-tested against real UniProt |
| Pre-registered benchmark | 30 SHA-256 commitments on main |
| Mutation testing | Workflow shipped (.github/workflows/mutation.yml); measurement-first per-module baseline in docs/MUTATION_SCORES.md. ≥ 95 % kill rate is the v1.2.0 target, not the current state. |
| 3 × 3 CI matrix on main | Green (Ubuntu / Windows / macOS × Python 3.11 / 3.12 / 3.13) |
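The "SHA-256 commitments" row refers to a commit-then-reveal pattern: a digest of each expected answer is published first, and the answer is checked against it later. The sketch below illustrates the general idea only. The salt, the `salt + answer` encoding, and the `commit` / `verify` function names are assumptions, and the project's actual commitment format may differ.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets


def commit(answer: str, salt: bytes | None = None) -> tuple[str, bytes]:
    """Return (commitment, salt) for an answer.

    ASSUMPTION: a random salt is prepended so that short answers cannot be
    guessed from the published digest.
    """
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.sha256(salt + answer.encode("utf-8")).hexdigest()
    return digest, salt


def verify(commitment: str, answer: str, salt: bytes) -> bool:
    """Check that a revealed answer and salt reproduce the commitment."""
    digest = hashlib.sha256(salt + answer.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, commitment)
```

Publishing only the digest ahead of time lets a reviewer confirm later that the answer key was not changed after the benchmark was run.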
Citation
```bibtex
@software{maniches2026uniprotmcp,
  author = {Maniches, Santiago},
  title  = {uniprot-mcp: A Model Context Protocol server for the UniProt protein knowledgebase},
  year   = {2026},
  url    = {https://github.com/smaniches/uniprot-mcp},
  orcid  = {0009-0005-6480-1987}
}
```
Always also cite the UniProt Consortium:
The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2025. Nucleic Acids Research (2025). doi:10.1093/nar/gkae1010