Project Status
Version: v1.1.7
Stage: Early open source. Engineering-grade infrastructure; scientifically unvalidated by independent domain experts.
This document exists so that any reader — reviewer, auditor, potential contributor, downstream user — can in 60 seconds form an accurate expectation of what this project is and is not.
What is solid
Code architecture
- Five subpackages with clear single responsibilities: `clients/` (10 upstream-API modules), `domain/` (1 module), `tools/` (4 MCP-tool modules), `storage/` (1 SQLite KG module), `server/` (2 transport modules). 18 substantive `.py` files on the shipped surface.
- The previous monolith (~6,000 lines) is archived under `_archive/legacy/` and excluded from coverage and lint.
- No circular imports; client retry/circuit-breaker logic is a single module reused by every upstream client.
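To illustrate what a shared circuit-breaker module of this kind typically does, here is a minimal sketch. It is not the project's actual implementation: the names `CircuitBreaker` and `CircuitOpenError`, the default threshold, and the cool-down period are all hypothetical, chosen only to show the pattern of failing fast after repeated upstream errors.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(
        self,
        failure_threshold: int = 3,      # assumed default, not from the project
        reset_after: float = 30.0,       # assumed cool-down in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Still cooling down: reject without touching the upstream.
                raise CircuitOpenError("upstream temporarily disabled")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single failure now re-opens the circuit immediately.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        return result
```

Keeping this logic in one place means every upstream client wraps its request in the same `call(...)` and inherits identical failure behaviour. The injectable `clock` is there only so the cool-down can be tested without sleeping.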
Test suite
- 677 tests across 20 modules.
- 100% line + branch coverage on the shipped surface (`src/alphafold_sovereign/`, excluding the archived monolith).
- Tests use `respx` to mock HTTP semantics (not just return values), `hypothesis` for property tests on parsers, and `pytest-asyncio` for async client tests.
- The ACMG mapping, druggability tier scoring, and KG queries all have parametrised tests covering known input/output pairs from the implementation.
Security & supply chain
- Bandit + Safety + pip-audit on every PR.
- CodeQL `security-extended` on every push (public repo).
- SBOM (CycloneDX + SPDX) generated on every release tag.
- SLSA L3 in-toto build provenance + Sigstore (`cosign`) keyless signing of every release artefact.
- PyPI publishing via OIDC Trusted Publishing (no API tokens stored in repo secrets).
- SQL parameterised everywhere; CWE-89 closed.
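As a reminder of what "parameterised" means in practice for a SQLite-backed store, the sketch below uses the standard-library `sqlite3` module with `?` placeholders. The `variant` table, its columns, and the `variants_for_gene` helper are hypothetical and do not reflect the project's actual KG schema.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variant (gene TEXT, hgvs TEXT)")
conn.executemany(
    "INSERT INTO variant (gene, hgvs) VALUES (?, ?)",
    [("BRCA1", "c.5266dupC"), ("TP53", "R175H"), ("EGFR", "L858R")],
)


def variants_for_gene(connection: sqlite3.Connection, gene: str) -> list:
    """Return variant labels for a gene using a bound parameter.

    The gene value is passed separately from the SQL text, so it is
    always treated as data and never parsed as SQL.
    """
    rows = connection.execute(
        "SELECT hgvs FROM variant WHERE gene = ? ORDER BY hgvs",
        (gene,),
    ).fetchall()
    return [row[0] for row in rows]
```

With this shape, a hostile input such as `x' OR '1'='1` is compared literally against the `gene` column and matches nothing, which is the property behind the CWE-89 claim.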
Distribution
- Published to PyPI at https://pypi.org/project/alphafold-sovereign-mcp/ (install with `pip install alphafold-sovereign-mcp`).
- Zenodo concept DOI: 10.5281/zenodo.20134773, a version-independent identifier that redirects to the latest archived version. Each tagged GitHub Release mints its own version-specific DOI under this concept via the GitHub-Zenodo integration (verified on v1.1.3 at 10.5281/zenodo.20262463).
- mkdocs documentation site auto-deploys to GitHub Pages on every push to main.
CI matrix
- Python 3.10 / 3.11 / 3.12 / 3.13 on Ubuntu and macOS.
- Lint (ruff), format (ruff format), type-check (mypy strict), build (sdist + wheel), MCP schema validation.
Legal kit
- Apache 2.0 (`LICENSE`). Pure Apache 2.0, with no commercial-edition carve-out.
- `NOTICE`, `PATENTS`, `TRADEMARKS`, `CONTRIBUTING`, `SECURITY`, `CODE_OF_CONDUCT`, `GOVERNANCE`, `INCIDENT_RESPONSE`, `AUDIT`, `PRIVACY`, `SUPPORT`.
What is NOT solid
Scientific / clinical validation
- No clinical geneticist has reviewed the ACMG mapping. The implementation follows the 2015 Richards et al. guidelines but no independent expert has signed off on the criterion-by-criterion mapping. Use `generate_variant_clinical_report` or `classify_variant_acmg` as a research aid, never as clinical decision support.
- The druggability tier scoring is a heuristic. Cut-offs for the HOT / WARM / COLD / NOT_DRUGGABLE buckets were chosen by the author based on rough literature priors. They have not been calibrated against a benchmark of known druggable / non-druggable targets and are not citation-backed.
- No end-to-end validation against real-world cases. The test suite mocks every upstream API. The pipeline has not been run against a held-out set of variants/targets/diseases with known expected outputs.
- No outcome data. Nobody has measured how often the report agrees with a human geneticist, or how often the druggability tier predicts actual drug-discovery success.
Operational
- No production deployment. This software has never been deployed as a long-running service for real users. Memory, latency, rate-limit, and failure-mode behaviour at scale is unknown.
- No usage telemetry. We do not collect any data on which tools are called, with what arguments, or whether results are useful.
- No SLA. Upstream APIs (Ensembl, Open Targets, ClinVar, gnomAD, AlphaFold DB, etc.) can change schema or go down; we do best-effort retries but make no availability guarantees.
Project maturity
- Single maintainer; the bus factor is 1.
- No external contributors yet. Review process is documented in `CONTRIBUTING.md` but has not been exercised.
- No formal release cadence. v1.1.7 is the current release; later versions will be tagged as the validation milestones below are met.
What this means for users
| If you are … | You can use this project for … | You should NOT use this project for … |
|---|---|---|
| A researcher exploring a target | Pulling and joining data from 14 upstream sources via MCP | Making a final go/no-go decision on a drug programme |
| A clinical geneticist | Quickly assembling a literature snapshot for a variant | Issuing a clinical report; ACMG calls produced here are not validated and should be re-derived from raw sources |
| A platform engineer | Studying a tested example of an MCP server with retries | Production deployment without your own validation, monitoring, and SLA work |
| A bioinformatician | Prototyping a workflow that calls 14 sources behind one API | Reproducible publication-grade analyses (upstream APIs are not pinned by us) |
Roadmap to v1.2.0 (validation)
The validation gap is the highest-priority work after v1.1.7. The planned, sequenced steps are:
- End-to-end golden examples. Three documented notebooks under `examples/` running the full pipeline against well-characterised variants (BRCA1 c.5266dupC, TP53 R175H, EGFR L858R) with expected output stored as JSON and diffed in CI.
- ACMG traceability matrix. A markdown table mapping each criterion (PVS1, PS1–4, PM1–6, PP1–5, BA1, BS1–4, BP1–7) to the line of code that implements it, the test that exercises it, and the 2015 Richards et al. section it derives from.
- External review. One clinical geneticist to review the ACMG mapping; one medicinal chemist to review the druggability heuristic. Findings published as issues, then closed by PRs.
- Benchmark calibration. Run druggability on a held-out set of approved-drug targets vs. failed-development targets and report precision/recall.
- Schema pinning. Pin upstream API schemas to specific dates, with a documented refresh policy.
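The benchmark-calibration step could report its numbers along the lines of the sketch below. This is an illustration under stated assumptions, not the planned implementation: treating HOT and WARM as the "predicted druggable" class is an assumption made here, and `precision_recall` and `evaluate_tiers` are hypothetical helper names. Only the tier labels come from this document.

```python
from typing import Iterable, Tuple

# Assumption for illustration: HOT and WARM count as "predicted druggable".
POSITIVE_TIERS = {"HOT", "WARM"}


def precision_recall(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (precision, recall) from (predicted, actual) boolean pairs."""
    tp = fp = fn = 0
    for predicted, actual in pairs:
        if predicted and actual:
            tp += 1          # predicted druggable, has an approved drug
        elif predicted:
            fp += 1          # predicted druggable, development failed
        elif actual:
            fn += 1          # predicted not druggable, but a drug exists
    # Return 0.0 instead of dividing by zero when a class is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def evaluate_tiers(results: Iterable[Tuple[str, bool]]) -> Tuple[float, float]:
    """results: (tier, has_approved_drug) for each held-out target."""
    return precision_recall(
        (tier in POSITIVE_TIERS, approved) for tier, approved in results
    )
```

Called on a held-out list such as `[("HOT", True), ("COLD", False), ...]`, `evaluate_tiers` returns the two figures the roadmap item asks for. Moving a tier in or out of `POSITIVE_TIERS` shows how sensitive the result is to where the cut-off between "druggable" and "not druggable" is drawn.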
Last updated
2026-05-17. This document is part of the repo; PRs to correct or expand it are welcome.