Thu, Apr 21, 2022 @ 11:00 AM - 12:30 PM
Thomas Lord Department of Computer Science
Title: On information captured by neural networks: connections with memorization, generalization, and learning dynamics
Abstract: Despite their enormous capacity, modern neural networks generalize well. In this thesis proposal, we use ideas from information theory to address various aspects of this phenomenon. We show that reducing label-noise information in network weights reduces memorization and improves generalization. We propose definitions for the information content of data and introduce an efficient algorithm for estimating it. These definitions allow us to quantify the amount of memorization of particular examples. Finally, we derive information-theoretic bounds on the generalization gap that depend on the average information content of a single example. We demonstrate that these bounds are non-vacuous in practical deep learning scenarios.
Committee:
Aram Galstyan (advisor, CS)
Greg Ver Steeg (advisor, CS)
Haipeng Luo (CS)
Bistra Dilkina (CS)
Mahdi Soltanolkotabi (EE)
Audiences: Everyone Is Invited
Contact: Lizsl De Leon