Investigating the Effect of Model Capacity Constraints on Belief State Representations
We examine the effect of weight decay on the fractal structure of belief state representations in a transformer’s residual stream, finding that models trained with larger weight decay coefficients learn progressively coarser-grained belief state representations.
3rd Place, Apart Research Computational Mechanics Hackathon, June 2024