Structure-intelligence tools
alphafold_sovereign.tools.structure_intelligence
Structure-intelligence MCP tools.
These tools take raw AlphaFold structures (PDB text and PAE arrays from AlphaFold DB) and derive a small set of summaries from them:
- pLDDT / PAE confidence-and-domain maps,
- a 64-dimensional topological-data-analysis (TDA) fingerprint (Betti numbers β₀, β₁, β₂ from a Vietoris-Rips filtration of Cα coordinates, computed with gudhi when the optional [tda] extra is installed; a coarse fallback without persistent homology otherwise; see _compute_lightweight_tda),
- a pairwise distance matrix between TDA fingerprints (the implementation is an L2 distance on length-normalised fingerprint vectors; it is not an optimal-transport Wasserstein distance; see _fingerprint_distance),
- a geometric pocket-detection heuristic, and
- an intrinsic-disorder-region map.
All tools are read-only. Tools that require outbound HTTP raise when
ALPHAFOLD_OFFLINE=1 and the requested structure is not in the
local cache.
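The offline rule above can be pictured with a minimal sketch. The names fetch_structure, OfflineCacheMiss, cache, and download are hypothetical and are not part of this module; only the ALPHAFOLD_OFFLINE variable comes from the text, and the real tools may raise a different exception type.

```python
import os


class OfflineCacheMiss(RuntimeError):
    """Hypothetical error type; the real tools may raise something else."""


def fetch_structure(uniprot_id, cache, download):
    """Return cached PDB text, or download it unless offline mode is set.

    `cache` is any dict-like store and `download` is a callable taking a
    UniProt accession; both are stand-ins for illustration only.
    """
    if uniprot_id in cache:
        return cache[uniprot_id]  # cache hits never need the network
    if os.environ.get("ALPHAFOLD_OFFLINE") == "1":
        raise OfflineCacheMiss(
            f"{uniprot_id} is not cached and ALPHAFOLD_OFFLINE=1"
        )
    cache[uniprot_id] = download(uniprot_id)
    return cache[uniprot_id]
```

In this sketch a cached structure is always served, so offline mode only fails on a cache miss.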
Tool inventory
- analyze_structural_confidence — pLDDT + PAE domain map
- compute_topology_fingerprint — TDA Betti numbers (β₀, β₁, β₂)
- compare_proteins_topologically — Pairwise fingerprint-distance matrix
- find_evolutionary_structural_shifts — Cross-species TDA fingerprint compare
- score_binding_pocket_geometry — Heuristic pocket detection + score
- detect_intrinsically_disordered — IDR region map
- assess_structural_novelty — AlphaFold-coverage and confidence summary
- identify_allosteric_sites — PAE-based long-range coupling map
analyze_structural_confidence
async
Analyze AlphaFold structural confidence using pLDDT and PAE matrices.
Returns a multi-layered structural reliability assessment:
- pLDDT (per-residue): mean confidence, low-confidence segments (disordered/novel)
- PAE (predicted aligned error): inter-domain uncertainty, domain boundaries
- Druggability pre-screen: high-pLDDT + low-PAE regions → ordered pockets
pLDDT interpretation
- > 90: Very high confidence; likely correct at backbone and sidechain level
- 70–90: High confidence; backbone correct, some sidechain uncertainty
- 50–70: Low confidence; may be IDP or novel fold
- < 50: Very low confidence; disordered or no structure deposited
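A minimal sketch of how the pLDDT bands could be applied per residue. The function names, the band labels, the handling of scores exactly on a boundary (90, 70, 50), and the default segment threshold of 70 are assumptions for illustration, not the tool's actual behaviour.

```python
def plddt_band(score: float) -> str:
    """Map one pLDDT score to a confidence band (boundary handling assumed)."""
    if score > 90:
        return "very_high"
    if score >= 70:
        return "high"
    if score >= 50:
        return "low"
    return "very_low"


def low_confidence_segments(plddt, threshold=70.0):
    """Return 1-based inclusive (start, end) runs with pLDDT below threshold."""
    segments, start = [], None
    for i, score in enumerate(plddt, start=1):
        if score < threshold and start is None:
            start = i  # a low-confidence run begins here
        elif score >= threshold and start is not None:
            segments.append((start, i - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        segments.append((start, len(plddt)))  # run extends to the C-terminus
    return segments
```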
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params.uniprot_id | | UniProt accession. | required |
compute_topology_fingerprint
async
Compute a topological fingerprint for a protein structure.
Uses persistent homology (Vietoris-Rips filtration) over the Cα
coordinate cloud to derive a 64-dimensional fingerprint vector and
Betti numbers β₀, β₁, β₂. Requires gudhi (install with
pip install alphafold-sovereign-mcp[tda]); without gudhi, a
coarse fallback runs that does not compute persistent homology
(see _fallback_tda_fingerprint).
What the Betti numbers count, intuitively:
- β₀ — connected components of the Vietoris-Rips complex at the chosen filtration scale. Distinguishes single-domain from multi-domain or fragmented chains.
- β₁ — 1-dimensional holes / loops. Picks up ring-like topology (e.g. β-barrels, large macrocycles).
- β₂ — 2-dimensional voids. Picks up enclosed cavities.
Topological features are invariant to rigid-body rotation and translation. They are not a substitute for sequence alignment, RMSD, or functional homology assessment; they are a coarse, geometry-only summary.
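To make β₀ concrete, the sketch below counts connected components at a single filtration scale with a union-find over Cα points. It is a standard-library illustration only: it does not use gudhi, it does not compute persistent homology or β₁/β₂, and the function name and the convention that two points are joined when their distance is at most the scale are assumptions.

```python
import math


def betti0_at_scale(points, scale):
    """Count connected components of the Vietoris-Rips complex at `scale`.

    Two points are joined when their distance is <= scale. beta_0 depends
    only on these edges, so no higher-dimensional simplices are built.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= scale:
                parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(len(points))})
```

Because only pairwise distances enter the computation, rotating or translating the coordinates leaves the count unchanged, which is the invariance noted above.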
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params.uniprot_id | | UniProt accession. | required |
compare_proteins_topologically
async
Compare multiple proteins using a TDA-fingerprint distance.
Computes a pairwise distance matrix between the TDA fingerprints of
the provided proteins. Distance metric: L2 distance between
length-normalised 64-dimensional fingerprint vectors (see
_fingerprint_distance). Distance = 0 means identical
fingerprints; larger values mean more divergent fingerprints. This
is not a Wasserstein distance between persistence diagrams.
Possible uses (all of which require independent validation before any downstream use):
- Drug-repurposing triage: proteins with low fingerprint distance may share gross topology.
- Off-target screening: family members with near-zero distance.
- Cross-species comparison of the same gene's structure.
None of these are direct functional or sequence-similarity measures.
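The distance described above can be sketched as follows. "Length-normalised" is read here as scaling each fingerprint to unit Euclidean norm; that reading is an assumption, and the real _fingerprint_distance may instead normalise by chain length or differ in other details. The function names are illustrative.

```python
import math


def unit_normalise(vec):
    """Scale a vector to unit Euclidean norm (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fingerprint_distance(a, b):
    """L2 distance between two normalised fingerprint vectors."""
    return math.dist(unit_normalise(a), unit_normalise(b))


def distance_matrix(fingerprints):
    """Pairwise distances for a dict mapping accession -> fingerprint vector."""
    ids = list(fingerprints)
    return {
        p: {q: fingerprint_distance(fingerprints[p], fingerprints[q]) for q in ids}
        for p in ids
    }
```

Under this reading, two fingerprints that differ only by a scale factor have distance 0, and the largest possible distance between non-negative fingerprints is √2.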
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params.uniprot_ids | | 2–10 UniProt accessions. | required |
find_evolutionary_structural_shifts
async
Quantify structural divergence across species using TDA.
Unlike sequence-based phylogenetics, this tool measures STRUCTURAL drift: proteins that have diverged in fold even while retaining sequence motifs (a pattern that can accompany functional shift).
Use cases:
- Pandemic preparedness: quantify how much a pathogen's surface protein has drifted from the human homolog (affects cross-reactive antibodies)
- Drug safety: off-target risk in model organisms (high drift → poor model)
- Zoonotic spillover risk: structural similarity to reservoir-host proteins
- Vaccine design: identify conserved structural epitopes across strains
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params.gene_symbol | | Human gene symbol. | required |
| params.target_species | | List of species to compare. | required |
score_binding_pocket_geometry
async
Identify and score putative binding pockets from AlphaFold geometry.
Detects pockets with a geometry-only heuristic. Residues in the
inner 60 percent of the structure by distance from the centroid are
treated as buried, then grown greedily into clusters within an 8
Angstrom radius. A cluster is kept as a putative pocket when it has
at least min_pocket_residues members and a mean pLDDT of at
least 50.
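The detection steps above might look roughly like the sketch below. Several details are assumptions: "inner 60 percent" is taken to mean the 60 percent of residues closest to the centroid, a residue joins a cluster when it lies within 8 Angstrom of any residue already in it, and the function and parameter names other than min_pocket_residues are illustrative.

```python
import math


def centroid(coords):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))


def find_pockets(coords, plddt, min_pocket_residues,
                 buried_fraction=0.6, radius=8.0, min_mean_plddt=50.0):
    """Return sorted 0-based residue index lists for putative pockets."""
    center = centroid(coords)
    # Buried residues: the fraction closest to the structure centroid (assumed).
    order = sorted(range(len(coords)), key=lambda i: math.dist(coords[i], center))
    buried = order[: round(len(coords) * buried_fraction)]
    unassigned = set(buried)
    pockets = []
    for seed in buried:
        if seed not in unassigned:
            continue  # already absorbed into an earlier cluster
        unassigned.discard(seed)
        cluster, frontier = [seed], [seed]
        while frontier:  # greedy growth from every newly added residue
            current = frontier.pop()
            near = [j for j in unassigned
                    if math.dist(coords[current], coords[j]) <= radius]
            unassigned.difference_update(near)
            cluster.extend(near)
            frontier.extend(near)
        mean_plddt = sum(plddt[i] for i in cluster) / len(cluster)
        if len(cluster) >= min_pocket_residues and mean_plddt >= min_mean_plddt:
            pockets.append(sorted(cluster))
    return pockets
```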
Each pocket reports a radius of gyration (compactness of the pocket residues), a burial value (distance of the pocket centroid from the structure centroid), a mean pLDDT, and a druggability index. The druggability index runs 0 to 100 and is the sum of four equally weighted 0 to 25 sub-scores: residue count, radius of gyration, mean pLDDT, and burial.
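The two geometric quantities and the way the index is summed can be sketched as below. The source does not state how each 0 to 25 sub-score is scaled from its raw value, so the sketch takes the four sub-scores as inputs and only clamps and adds them; the function names and the clamping are assumptions.

```python
import math


def centroid(coords):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their own centroid."""
    c = centroid(coords)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in coords) / len(coords))


def burial(pocket_coords, structure_coords):
    """Distance from the pocket centroid to the structure centroid."""
    return math.dist(centroid(pocket_coords), centroid(structure_coords))


def druggability_index(size_score, rg_score, plddt_score, burial_score):
    """Sum four equally weighted sub-scores, each clamped to the range 0..25."""
    parts = (size_score, rg_score, plddt_score, burial_score)
    return sum(min(25.0, max(0.0, s)) for s in parts)
```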
This is a fast, dependency-free pre-screen, not a substitute for a validated pocket detector such as fpocket or P2Rank. It needs no ML model, is fully reproducible from AlphaFold coordinates, and runs in air-gapped deployments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params.uniprot_id | | UniProt accession. | required |
| params.min_pocket_residues | | Minimum pocket size (residues). | required |
detect_intrinsically_disordered
async
Map intrinsically disordered regions (IDRs) using pLDDT as proxy.
Residues with pLDDT < 50 are treated as predicted to be disordered in isolation. The use of low pLDDT as a disorder proxy follows Ruff & Pappu (2021); because it needs only precomputed AlphaFold DB predictions, it scales to the full human proteome.
IDR functional categories returned:
- Linkers: short (< 20 aa) disordered regions between domains
- Tails: N/C terminal IDRs
- Long IDRs: candidate intrinsically disordered protein (IDP) segments
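A minimal sketch of how pLDDT runs could be turned into these categories. The rules used here are assumptions: any run touching the first or last residue is a tail regardless of length, internal runs shorter than 20 residues are linkers (without checking that ordered domains flank them), and every other internal run is a long IDR. The function names are illustrative.

```python
def disordered_segments(plddt, threshold=50.0):
    """Return 1-based inclusive (start, end) runs with pLDDT below threshold."""
    segments, start = [], None
    for i, score in enumerate(plddt, start=1):
        if score < threshold and start is None:
            start = i
        elif score >= threshold and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(plddt)))
    return segments


def classify_idr(start, end, chain_length, linker_max=20):
    """Label one disordered run as 'tail', 'linker', or 'long_idr' (rules assumed)."""
    if start == 1 or end == chain_length:
        return "tail"  # touches the N- or C-terminus
    if end - start + 1 < linker_max:
        return "linker"  # short internal run
    return "long_idr"


def map_idrs(plddt):
    """Return (start, end, category) for every disordered run in the chain."""
    n = len(plddt)
    return [(s, e, classify_idr(s, e, n)) for s, e in disordered_segments(plddt)]
```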
Clinical relevance:
- IDRs are enriched for disease-causing mutations (40% of cancer driver mutations)
- IDRs host post-translational modification sites (phosphorylation, ubiquitination)
- Long IDRs are emerging drug targets (targeted covalent inhibitors, phase separation modulators)
Reference
Ruff KM & Pappu RV. J Mol Biol. 2021;433(20):167208.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params.uniprot_id | | UniProt accession. | required |