Mathematical Theory¶
This page provides the mathematical background for GROUPOID. We assume familiarity with basic differential geometry and category theory.
Transport Groupoids¶
A groupoid is a category in which every morphism is invertible. In the federated learning context, the transport groupoid captures how model parameters are related across clients.
Given a network of \(n\) clients, define a directed graph \(G = (V, E)\) where each node \(v_i \in V\) represents a client and each edge \((v_i, v_j) \in E\) carries a transport map \(T_{ij}\) that translates parameters from client \(i\)'s space to client \(j\)'s space.
The groupoid axioms require:
- Identity: \(T_{ii} = \text{Id}\) for all \(i\)
- Inverse: \(T_{ji} = T_{ij}^{-1}\) for all \((i, j) \in E\)
- Composition: \(T_{ik} = T_{jk} \circ T_{ij}\) whenever \((i, j)\) and \((j, k)\) are in \(E\)
Karcher Mean on Riemannian Manifolds¶
When model parameters live on a Riemannian manifold \((M, g)\), the standard Euclidean average is not well-defined. Instead, we use the Karcher mean (also called the Frechet mean), defined as the point minimizing the sum of squared geodesic distances:
where \(d_g\) is the geodesic distance and \(w_i\) are non-negative weights summing to 1.
The Karcher mean is computed iteratively:
- Initialize \(\bar{x}_0\) (e.g., to \(x_1\))
- Compute \(v = \sum_i w_i \, \text{Log}_{\bar{x}_k}(x_i)\)
- Update \(\bar{x}_{k+1} = \text{Exp}_{\bar{x}_k}(\epsilon \, v)\)
- Repeat until \(\|v\| < \text{tol}\)
Here \(\text{Log}\) and \(\text{Exp}\) denote the Riemannian logarithmic and exponential maps.
First Cohomology and Global Consistency¶
The first cohomology \(H^1\) of a transport cocycle measures the obstruction to global consistency. Given transport maps \(\{T_{ij}\}\) on a graph \(G\):
- A cocycle is an assignment of transport maps to edges satisfying \(T_{ik} = T_{jk} \circ T_{ij}\) around every triangle.
- A coboundary is a cocycle that can be written as \(T_{ij} = g_j \circ g_i^{-1}\) for some gauge transformations \(\{g_i\}\) at each node.
- \(H^1 = 0\) if and only if every cocycle is a coboundary.
In practice, we compute \(H^1\) by checking the holonomy around each cycle in a cycle basis of \(G\):
If \(\text{Hol}(\gamma) = \text{Id}\) for all basis cycles \(\gamma\), then \(H^1 = 0\) and the local models are globally consistent.
The holonomy is the ordered product over every edge of the cycle, so the
cocycle must be fully specified: each cycle edge needs a transport map in one
direction or the other (the reverse direction is inverted). If an edge map is
missing, the holonomy is undefined — a product over a strict subset of the
cycle's edges is not a holonomy and must not be reported as a consistency
result. compute_h1 therefore raises IncompleteCocycleError, naming the
missing edge, instead of silently forming a partial product.
Sheaf Theory and Restriction Maps¶
A cellular sheaf \(\mathcal{F}\) on \(G\) assigns:
- A vector space \(\mathcal{F}(v)\) to each node \(v\) (the stalk)
- A linear map \(\mathcal{F}_{v \to e} : \mathcal{F}(v) \to \mathcal{F}(e)\) to each incidence (the restriction map)
A global section is an assignment \(s(v) \in \mathcal{F}(v)\) for each node such that the restriction maps agree on shared edges:
for every edge \(e = (u, v)\).
The sheaf Laplacian \(L_{\mathcal{F}}\) generalizes the graph Laplacian and its kernel equals the space of global sections. A nonzero kernel indicates that the local data can be consistently glued into a global model.
Connection to Federated Learning¶
| Mathematical Concept | Federated Learning Analogue |
|---|---|
| Node \(v_i\) | Client \(i\) |
| Stalk \(\mathcal{F}(v_i)\) | Client \(i\)'s parameter space |
| Transport map \(T_{ij}\) | Parallel transport between parameter spaces |
| Karcher mean | Geometric model aggregation |
| \(H^1 = 0\) | Local models are globally consistent |
| \(H^1 \neq 0\) | Obstruction to consistent aggregation |
| Global section | A successfully aggregated global model |