Research Project
UR–JEPA: Uniform Rectifiability as a Regularizer for Joint-Embedding Predictive Architectures
Bridging geometric measure theory and self-supervised learning.
LeJEPA (Balestriero & LeCun, 2025) recently identified the isotropic Gaussian as the optimal target distribution for JEPA embeddings. But the manifold hypothesis says real data concentrates on a low-dimensional subset of the ambient space, which is in tension with a full-dimensional isotropic target.
UR–JEPA resolves this tension by replacing the Gaussian target with a uniformly n-rectifiable measure, the canonical geometric-measure-theory notion of being “quantitatively n-dimensional at every location and scale.” We operationalize this through a Carleson-type square-function loss that builds on the characterization in Chousionis, Garnett, Le, and Tolsa, “Square functions and uniform rectifiability” (Trans. Amer. Math. Soc., 2016).
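To make the square-function idea concrete: the cited paper characterizes uniform n-rectifiability of an Ahlfors-David regular measure mu through a Carleson-type bound on the density differences mu(B(x,r))/r^n - mu(B(x,2r))/(2r)^n, squared and integrated against dmu(x) dr/r. The following is a minimal NumPy sketch of an empirical version of that quantity on a batch of embeddings. It is an illustration under stated assumptions, not the project's actual loss: the function names, the choice of scales, and the hard ball indicator are all hypothetical, and a gradient-based training loss would presumably need a smooth kernel in place of the indicator.

```python
import numpy as np

def density_ratio(dists, r, n):
    """Empirical mu(B(x, r)) / r^n for every point x in the batch.

    mu is taken to be the uniform probability measure on the batch
    (an assumption of this sketch), so mu(B(x, r)) is the fraction of
    batch points within distance r of x, including x itself.
    """
    return (dists <= r).mean(axis=1) / r**n

def square_function_loss(z, n, radii):
    """Discretized square function over a finite set of scales.

    z     : (N, d) array of embeddings
    n     : target intrinsic dimension
    radii : 1-D array of scales r; geometric spacing stands in for
            the dr/r measure up to a constant factor (assumption)
    """
    # Pairwise Euclidean distances, shape (N, N).
    diff = z[:, None, :] - z[None, :, :]
    dists = np.sqrt((diff**2).sum(axis=-1))

    total = 0.0
    for r in radii:
        # Density difference between scale r and scale 2r at each point.
        delta = density_ratio(dists, r, n) - density_ratio(dists, 2 * r, n)
        # Average of |delta|^2 over the batch approximates the dmu(x) integral.
        total += np.mean(delta**2)
    return total / len(radii)
```

As a sanity check on the design, evenly spaced points on a unit circle give a far smaller value at n = 1 than at n = 2, because the n-dimensional density is nearly scale-invariant only when n matches the dimension of the set.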
Selected results
- +18.5 pp over transfer from the IJEPA–IN22K foundation model on Galaxy10 SDSS (81.4% vs 62.9% linear-probe accuracy). UR–JEPA trained in-domain on a 21K-image astronomical dataset substantially exceeds a 630M-parameter foundation model pretrained on 22M images.
- +0.83 pp over matched-recipe LeJEPA on ImageNet-10 (3 seeds; paired t = +15.5, p << 0.001), with ~30% lower seed variance.
- Statistically tied with LeJEPA at convergence on Galaxy10 SDSS and ImageNet-100. The practical differentiators are sample efficiency and lower seed variance.
- Downstream transfer (ImageNet-100 → 5 datasets). UR–JEPA leads LeJEPA on 4 of 5 transfer datasets (Aircraft, DTD, Flowers, Food) at the 800-epoch checkpoint, with a mean Δ of +0.32 pp (single seed; 3-seed verification pending).
- Geometrically distinct representation. UR–JEPA embeddings have an effectively low-rank covariance structure, whereas LeJEPA yields near-isotropic projections. The cliff in the covariance spectrum sits at index ~20–25 across three datasets, consistent with the intrinsic-dimension estimates of Pope et al. (ICLR 2021) for natural images.
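The covariance cliff in the last result can be located with a short spectral diagnostic. The sketch below is one plausible way to do it, not necessarily the procedure used in the project: it computes the eigenvalues of the embedding covariance and reports the position of the largest drop in the log-spectrum. The function names and the largest-log-drop heuristic are assumptions made for illustration.

```python
import numpy as np

def covariance_spectrum(z):
    """Eigenvalues of the embedding covariance, sorted in descending order.

    z : (N, d) array of embeddings.
    """
    zc = z - z.mean(axis=0, keepdims=True)
    cov = zc.T @ zc / (len(z) - 1)
    # eigvalsh returns ascending eigenvalues of a symmetric matrix.
    eigvals = np.linalg.eigvalsh(cov)[::-1]
    # Clip tiny negative values caused by floating-point round-off.
    return np.clip(eigvals, 0.0, None)

def cliff_index(eigvals, eps=1e-12):
    """Number of eigenvalues that precede the largest log-spectrum drop.

    This heuristic (hypothetical, chosen for this sketch) treats the
    biggest gap between consecutive log-eigenvalues as the cliff.
    """
    logs = np.log(eigvals + eps)
    drops = logs[:-1] - logs[1:]
    return int(np.argmax(drops)) + 1
```

For embeddings that are nearly isotropic, the log-spectrum is flat and the reported index carries little meaning, so the drop size itself is worth inspecting alongside the index.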