Information Bottleneck & Representation Learning
Developing principled, information-theoretic frameworks for extracting meaningful representations from complex multimodal data.
How do we find the signal inside the noise? My core research agenda builds on the Information Bottleneck principle: given two variables, find a compressed representation of one that retains only what is relevant for predicting the other.
This deceptively simple idea underlies a unified family of machine learning methods — from standard autoencoders to variational methods and contrastive losses — and provides a rigorous lens for understanding why they work and when they fail.
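In the discrete case the trade-off can be computed exactly: given an encoder p(t|x), one minimizes the IB objective I(T;X) − β·I(T;Y), paying for bits kept about the input X while being rewarded for bits kept about the target Y. The toy sketch below illustrates this; the specific variables (X uniform on four states, Y a noisy parity of X) and the parity encoder are illustrative assumptions, not taken from any of the papers cited on this page.

```python
import numpy as np

def mutual_information(pab):
    """I(A;B) in nats from a joint probability table p(a, b)."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    mask = pab > 0  # skip zero cells to avoid log(0)
    return float(np.sum(pab[mask] * np.log(pab[mask] / (pa @ pb)[mask])))

# Toy problem: X uniform on {0,1,2,3}; Y is the parity of X,
# flipped with probability 0.1. Rows index x, columns index y.
px = np.full(4, 0.25)
p_y_given_x = np.array([[0.9, 0.1],
                        [0.1, 0.9],
                        [0.9, 0.1],
                        [0.1, 0.9]])

# Candidate encoder: T = parity(X). It compresses four states into
# two while discarding nothing that is relevant for predicting Y.
p_t_given_x = np.array([[1.0, 0.0],
                        [0.0, 1.0],
                        [1.0, 0.0],
                        [0.0, 1.0]])

p_xt = px[:, None] * p_t_given_x                      # joint p(x, t)
p_ty = p_t_given_x.T @ (px[:, None] * p_y_given_x)    # joint p(t, y), using Y–X–T

beta = 5.0
I_tx = mutual_information(p_xt)   # compression cost, ≈ ln 2 ≈ 0.693 nats
I_ty = mutual_information(p_ty)   # preserved relevant information
ib_objective = I_tx - beta * I_ty
print(f"I(T;X) = {I_tx:.3f}, I(T;Y) = {I_ty:.3f}, IB objective = {ib_objective:.3f}")
```

Because T here is a sufficient statistic of X for Y, it sits at the best possible point of the trade-off: any further compression of T would lower I(T;Y), and any richer encoder would raise I(T;X) without gaining predictive information.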
Key results: a variational framework unifying a broad class of representation learning objectives (JMLR 2025); a data-efficient method for jointly compressing two modalities while preserving their shared information (TMLR 2024); improved estimators of mutual information in high dimensions (arXiv 2025); an application of the IB principle to learning low-dimensional phase-space representations from high-dimensional experimental data (arXiv 2026).