
Structure-intelligence tools

alphafold_sovereign.tools.structure_intelligence

Structure-intelligence MCP tools.

These tools take raw AlphaFold structures (PDB text and PAE arrays from AlphaFold DB) and derive a small set of summaries from them:

  • pLDDT / PAE confidence-and-domain maps,
  • a 64-dimensional topological-data-analysis (TDA) fingerprint (Betti numbers β₀, β₁, β₂ from a Vietoris-Rips filtration of Cα coordinates, computed with gudhi when the optional [tda] extra is installed; a coarse fallback without persistent homology otherwise — see _compute_lightweight_tda),
  • a pairwise distance matrix between TDA fingerprints (the implementation is an L2 distance on length-normalised fingerprint vectors — it is not an optimal-transport Wasserstein distance; see _fingerprint_distance),
  • a geometric pocket-detection heuristic, and
  • an intrinsic-disorder-region map.

All tools are read-only. Tools that require outbound HTTP raise when ALPHAFOLD_OFFLINE=1 and the requested structure is not in the local cache.
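The offline rule above can be pictured as a small guard that runs before any fetch. This is only a sketch: the `ALPHAFOLD_OFFLINE` variable comes from the text, while `OfflineCacheMiss`, `ensure_fetch_allowed`, and the one-file-per-accession cache layout are hypothetical names chosen for illustration, not the package's actual API.

```python
import os
from pathlib import Path


class OfflineCacheMiss(RuntimeError):
    """Hypothetical error type; the real tools may raise something else."""


def ensure_fetch_allowed(uniprot_id: str, cache_dir: Path) -> bool:
    """Return True if the structure is cached, False if a fetch is permitted.

    Raises OfflineCacheMiss when ALPHAFOLD_OFFLINE=1 and nothing is cached.
    """
    # Assumed cache layout: one PDB file per accession.
    cached = (cache_dir / f"{uniprot_id}.pdb").exists()
    if cached:
        return True
    if os.environ.get("ALPHAFOLD_OFFLINE") == "1":
        raise OfflineCacheMiss(
            f"{uniprot_id} is not in the local cache and ALPHAFOLD_OFFLINE=1"
        )
    return False
```

A caller would treat a `False` return as permission to go to the network and a `True` return as a cache hit.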

Tool inventory
  1. analyze_structural_confidence — pLDDT + PAE domain map
  2. compute_topology_fingerprint — TDA Betti numbers (β₀, β₁, β₂)
  3. compare_proteins_topologically — Pairwise fingerprint-distance matrix
  4. find_evolutionary_structural_shifts — Cross-species TDA fingerprint compare
  5. score_binding_pocket_geometry — Heuristic pocket detection + score
  6. detect_intrinsically_disordered — IDR region map
  7. assess_structural_novelty — AlphaFold-coverage and confidence summary
  8. identify_allosteric_sites — PAE-based long-range coupling map

analyze_structural_confidence async

analyze_structural_confidence(params: UniProtInput) -> dict[str, Any]

Analyze AlphaFold structural confidence using pLDDT and PAE matrices.

Returns a multi-layered structural reliability assessment:

  • pLDDT (per-residue): mean confidence and low-confidence segments (disordered or novel).
  • PAE (predicted aligned error): inter-domain uncertainty and domain boundaries.
  • Druggability pre-screen: high-pLDDT, low-PAE regions, reported as candidates for ordered pockets.

pLDDT interpretation

  • > 90: Very high confidence; likely correct at backbone and sidechain level.
  • 70–90: High confidence; backbone correct, some sidechain uncertainty.
  • 50–70: Low confidence; may be an IDP or a novel fold.
  • < 50: Very low confidence; disordered or no structure deposited.
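The bands above translate directly into a lookup, and low-confidence segments are runs of consecutive residues below a cutoff. The sketch below makes two assumptions that the text does not state: scores exactly on a boundary (90, 70, 50) fall into the lower-confidence band's upper neighbour as coded, and the default segment cutoff of 70 is illustrative. Both function names are hypothetical.

```python
def plddt_band(score: float) -> str:
    """Map one pLDDT value to the confidence band it falls in."""
    if score > 90:
        return "very_high"
    if score >= 70:
        return "high"
    if score >= 50:
        return "low"
    return "very_low"


def low_confidence_segments(plddt, threshold=70.0):
    """Return 1-based inclusive (start, end) runs where pLDDT < threshold."""
    segments = []
    start = None
    for i, score in enumerate(plddt, start=1):
        if score < threshold:
            if start is None:
                start = i  # open a new run
        elif start is not None:
            segments.append((start, i - 1))  # close the run
            start = None
    if start is not None:
        segments.append((start, len(plddt)))  # run reaches the C-terminus
    return segments
```

For example, `low_confidence_segments([95, 88, 60, 40, 45, 75, 30])` reports residues 3 to 5 and residue 7 as low-confidence runs.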

Parameters:

Name               Type  Description         Default
params.uniprot_id        UniProt accession.  required

compute_topology_fingerprint async

compute_topology_fingerprint(params: UniProtInput) -> dict[str, Any]

Compute a topological fingerprint for a protein structure.

Uses persistent homology (Vietoris-Rips filtration) over the Cα coordinate cloud to derive a 64-dimensional fingerprint vector and Betti numbers β₀, β₁, β₂. Persistent homology needs gudhi (install with pip install alphafold-sovereign-mcp[tda]); without gudhi, the tool still runs, but with a coarse fallback that does not compute persistent homology (see _fallback_tda_fingerprint).

What the Betti numbers count, intuitively:

  • β₀ — connected components of the Vietoris-Rips complex at the chosen filtration scale. Distinguishes single-domain from multi-domain or fragmented chains.
  • β₁ — 1-dimensional holes / loops. Picks up ring-like topology (e.g. β-barrels, large macrocycles).
  • β₂ — 2-dimensional voids. Picks up enclosed cavities.

Topological features are invariant to rigid-body rotation and translation. They are not a substitute for sequence alignment, RMSD, or functional homology assessment; they are a coarse, geometry-only summary.
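To make β₀ concrete, the sketch below counts connected components of a Vietoris-Rips complex at one fixed scale using union-find, with no external dependencies. It is an illustration of the definition, not the tool's implementation: the function name is hypothetical, the edge rule (join two points when their distance is at most `scale`) is one common convention, and β₁ and β₂ are not covered because they need a persistent-homology library such as gudhi.

```python
import math


def betti0_at_scale(points, scale):
    """Count connected components of the Vietoris-Rips complex at `scale`.

    Two points share an edge when their Euclidean distance is <= scale.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= scale:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj  # merge the two components
    return len({find(i) for i in range(len(points))})
```

Because only pairwise distances enter the calculation, translating or rotating every point leaves the count unchanged, which is the invariance noted above.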

Parameters:

Name               Type  Description         Default
params.uniprot_id        UniProt accession.  required

compare_proteins_topologically async

compare_proteins_topologically(params: MultiProteinInput) -> dict[str, Any]

Compare multiple proteins using a TDA-fingerprint distance.

Computes a pairwise distance matrix between the TDA fingerprints of the provided proteins. Distance metric: L2 distance between length-normalised 64-dimensional fingerprint vectors (see _fingerprint_distance). Distance = 0 means identical fingerprints; larger values mean more divergent fingerprints. This is not a Wasserstein distance between persistence diagrams.

Possible uses (all of which require independent validation before any downstream use):

  • Drug-repurposing triage: proteins with low fingerprint distance may share gross topology.
  • Off-target screening: family members with near-zero distance.
  • Cross-species comparison of the same gene's structure.

None of these are direct functional or sequence-similarity measures.
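The distance metric described above can be sketched in a few lines. One assumption matters here: "length-normalised" is read as scaling each fingerprint to unit Euclidean norm, which may differ from what `_fingerprint_distance` actually does (for instance, it could normalise by chain length instead). The function names below are hypothetical.

```python
import math


def unit_normalise(vec):
    """Scale a vector to unit Euclidean norm (assumed reading of 'length-normalised')."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fingerprint_distance(a, b):
    """L2 distance between the normalised fingerprints."""
    return math.dist(unit_normalise(a), unit_normalise(b))


def distance_matrix(fingerprints):
    """Pairwise distances for a {uniprot_id: fingerprint} mapping."""
    ids = list(fingerprints)
    return {
        p: {q: fingerprint_distance(fingerprints[p], fingerprints[q]) for q in ids}
        for p in ids
    }
```

Under this reading, two fingerprints that differ only by a scale factor have distance 0, and the largest possible distance is 2 (opposite unit vectors). The matrix is symmetric with a zero diagonal.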

Parameters:

Name                Type  Description               Default
params.uniprot_ids        2–10 UniProt accessions.  required

find_evolutionary_structural_shifts async

find_evolutionary_structural_shifts(params: EvolutionaryInput) -> dict[str, Any]

Quantify structural divergence across species using TDA.

Unlike sequence-based phylogenetics, this tool measures structural drift: differences between the TDA fingerprints of the same gene's structure in different species. It can flag proteins whose fold has diverged even though sequence motifs are retained, which may indicate a functional shift.

Possible use cases (fingerprint distance is a coarse, geometry-only signal, so each of these needs independent validation):

  • Pandemic preparedness: quantify how far a pathogen's surface protein has drifted from the human homolog, which affects cross-reactive antibodies.
  • Drug safety: off-target risk in model organisms (high drift suggests a poor model).
  • Zoonotic spillover risk: structural similarity to reservoir-host proteins.
  • Vaccine design: identify conserved structural epitopes across strains.

Parameters:

Name                   Type  Description                  Default
params.gene_symbol           Human gene symbol.           required
params.target_species        List of species to compare.  required

score_binding_pocket_geometry async

score_binding_pocket_geometry(params: BindingPocketInput) -> dict[str, Any]

Identify and score putative binding pockets from AlphaFold geometry.

Detects pockets with a geometry-only heuristic. Residues in the inner 60 percent of the structure by distance from the centroid are treated as buried, then grown greedily into clusters within an 8 Angstrom radius. A cluster is kept as a putative pocket when it has at least min_pocket_residues members and a mean pLDDT of at least 50.
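The detection steps above can be sketched as follows. Several details are assumptions, since the text does not pin them down: "inner 60 percent" is read as the 60 percent of residues closest to the centroid, a residue joins a cluster when it lies within 8 Å of any current member, and `coords` holds one representative coordinate (for example Cα) per residue. The function name and the default for `min_pocket_residues` are hypothetical.

```python
import math


def detect_pockets(coords, plddt, min_pocket_residues=5,
                   buried_fraction=0.6, radius=8.0, min_mean_plddt=50.0):
    """Return putative pockets as sorted lists of 0-based residue indices."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))

    # Buried residues: the closest `buried_fraction` of residues to the centroid.
    by_depth = sorted(range(n), key=lambda i: math.dist(coords[i], centroid))
    buried = by_depth[: int(n * buried_fraction)]

    unassigned = set(buried)
    pockets = []
    for seed in buried:
        if seed not in unassigned:
            continue
        unassigned.discard(seed)
        cluster, frontier = [seed], [seed]
        while frontier:  # greedy growth from the seed
            current = frontier.pop()
            near = [j for j in unassigned
                    if math.dist(coords[current], coords[j]) <= radius]
            for j in near:
                unassigned.discard(j)
                cluster.append(j)
                frontier.append(j)
        mean_plddt = sum(plddt[i] for i in cluster) / len(cluster)
        if len(cluster) >= min_pocket_residues and mean_plddt >= min_mean_plddt:
            pockets.append(sorted(cluster))
    return pockets
```

Seeding clusters from the most buried residue outward keeps the result deterministic for a given set of coordinates, which matches the reproducibility goal of a geometry-only pre-screen.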

Each pocket reports a radius of gyration (compactness of the pocket residues), a burial value (distance of the pocket centroid from the structure centroid), a mean pLDDT, and a druggability index. The druggability index runs 0 to 100 and is the sum of four equally weighted 0 to 25 sub-scores: residue count, radius of gyration, mean pLDDT, and burial.
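The per-pocket descriptors can be computed directly from coordinates, as in the sketch below. The text does not say how each descriptor is mapped onto its 0 to 25 sub-score, so that mapping is left to the caller here; `druggability_index` only shows the final equally weighted sum. The radius of gyration is computed without mass weighting, which is an assumption, and all function names are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid (unweighted)."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))


def pocket_descriptors(all_coords, pocket_indices, plddt):
    """Compute the four raw quantities reported for one pocket."""
    pocket = [all_coords[i] for i in pocket_indices]
    return {
        "n_residues": len(pocket),
        "radius_of_gyration": radius_of_gyration(pocket),
        "burial": math.dist(centroid(pocket), centroid(all_coords)),
        "mean_plddt": sum(plddt[i] for i in pocket_indices) / len(pocket),
    }


def druggability_index(sub_scores):
    """Sum four sub-scores, each clamped to the range 0..25, giving 0..100."""
    if len(sub_scores) != 4:
        raise ValueError("expected exactly four sub-scores")
    return sum(min(25.0, max(0.0, s)) for s in sub_scores)
```

Clamping each sub-score before summing guarantees the index stays within 0 to 100 regardless of how the individual scoring functions behave at their extremes.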

This is a fast, dependency-free pre-screen, not a substitute for a validated pocket detector such as fpocket or P2Rank. It needs no ML model, is fully reproducible from AlphaFold coordinates, and runs in air-gapped deployments.

Parameters:

Name                        Type  Description                      Default
params.uniprot_id                 UniProt accession.               required
params.min_pocket_residues        Minimum pocket size (residues).  required

detect_intrinsically_disordered async

detect_intrinsically_disordered(params: UniProtInput) -> dict[str, Any]

Map intrinsically disordered regions (IDRs) using pLDDT as proxy.

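Regions with pLDDT < 50 are treated as candidate IDRs: AlphaFold predicts them with very low confidence, which is taken as a sign that they are disordered in isolation. The use of pLDDT as a disorder proxy is discussed in Ruff & Pappu (2021). Because it needs only the pLDDT values already present in AlphaFold models, the method can be applied across the full human proteome without running a separate disorder predictor.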

IDR functional categories returned:

  • Linkers: short (< 20 aa) disordered regions between domains.
  • Tails: N- or C-terminal IDRs.
  • Long IDRs: candidate intrinsically disordered protein (IDP) segments.
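The categories above can be assigned by scanning pLDDT for runs below 50 and classifying each run by position and length. The sketch below fills in details the text leaves open, and these are assumptions: a run that touches either terminus is a tail regardless of length, an internal run of 20 residues or more is a long IDR, and no minimum run length is applied. The function name and output keys are hypothetical.

```python
def classify_idrs(plddt, threshold=50.0, linker_max_len=20):
    """Return disordered runs with 1-based inclusive bounds and a category."""
    n = len(plddt)
    regions = []
    start = None
    # Iterate one step past the end so a run reaching the C-terminus is closed.
    for i in range(n + 1):
        disordered = i < n and plddt[i] < threshold
        if disordered and start is None:
            start = i
        elif not disordered and start is not None:
            end = i - 1
            length = end - start + 1
            if start == 0 or end == n - 1:
                kind = "tail"      # touches the N- or C-terminus
            elif length < linker_max_len:
                kind = "linker"    # short internal run
            else:
                kind = "long_idr"  # long internal run
            regions.append(
                {"start": start + 1, "end": end + 1, "length": length, "kind": kind}
            )
            start = None
    return regions
```

In practice a minimum run length would probably be worth adding, since a single residue dipping below the threshold is reported as a one-residue linker by this sketch.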

Clinical relevance:

  • IDRs are enriched for disease-causing mutations (40% of cancer driver mutations).
  • IDRs host post-translational modification sites (phosphorylation, ubiquitination).
  • Long IDRs are emerging drug targets (targeted covalent inhibitors, phase separation modulators).

Reference

Ruff KM & Pappu RV. J Mol Biol. 2021;433(20):167208.

Parameters:

Name               Type  Description         Default
params.uniprot_id        UniProt accession.  required