Workshop description

Over the last decade, deep learning has brought about astonishing improvements in computer vision, signal processing, language modeling, and beyond. This unprecedented success is the product of a huge collective endeavor to build large-scale models. As a result, the remarkable performance of these models has largely outpaced our understanding of them.

An established approach to analyzing these models is to build a bottom-up theory of deep learning, in which rigorous proofs explain the behavior of deep networks. However, such analyses are usually tractable only for oversimplified models. Although valuable progress has been made on that front, the complexity of real high-dimensional data and of the deep network architectures used in practice makes these systems resistant to traditional mathematical analysis. Hence, many aspects remain mysterious, and our understanding of the success and failure modes of deep networks remains very limited.

The aim of this workshop is to promote a complementary approach to furthering our understanding of deep learning, through the lens of the scientific method. This approach uses carefully designed experiments to answer precise questions about how and why deep learning works. The scientific method has been used successfully in the past to validate or falsify hypotheses (e.g., that deep networks generalize because they cannot fit random labels [Zhang et al.]), to challenge common assumptions (e.g., that deep networks are robust to imperceptible input perturbations [Szegedy et al.]), and to reveal empirical regularities (e.g., the discovery of scaling laws [Kaplan et al.]). These well-known examples all rely crucially on controlled experiments and constitute an important part of our current understanding of deep learning.

Although these works aimed neither to improve the state of the art nor to prove theorems, they have had a profound impact, spurring many theoretical and applied follow-up works. Indeed, such results serve the theory of deep learning by grounding it in empirical observations (e.g., training occurs at the edge of stability [Cohen et al.]) or by formulating conjectures (e.g., the lottery ticket hypothesis [Frankle & Carbin]). At the same time, they lead to practical improvements, both by informing engineering decisions (e.g., compute-optimal scaling [Hoffmann et al.]) and by spurring new research directions (e.g., adversarial robustness [Gu & Rigazio]). Thus, we believe that this workshop will be of interest to both the theoretical and applied communities.

Despite their significant impact, such studies remain largely underexplored and underappreciated. While the criteria for assessing quantitative contributions, such as improving state-of-the-art performance or proving rigorous theorems, are relatively clear-cut, the criteria for assessing the significance of contributions in this category are still taking shape. Our workshop would offer a venue for studies that fall outside the standard acceptance criteria yet have high impact potential. Thus, our workshop significantly differs from, and complements, past workshops at major machine learning conferences.

The scientific study of deep learning is currently scattered across several subfields, including in-context learning in transformers, generalization properties of generative models, inductive biases of learning algorithms, (mechanistic) interpretability, and empirical studies of loss landscapes, training dynamics, and learned weights and representations. This workshop will facilitate communication and collaboration across these subfields by building a community centered around a common approach.