We provide an alternative derivation of the asymptotic spectrum of non-linear random matrices, based on the more robust resolvent method. In particular, our approach extends previous results on random feature models to the practically important case with additive bias.
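As a rough numerical illustration of the objects involved (not the derivation itself), the following sketch builds a nonlinear random-features Gram matrix with an additive bias and probes its spectrum through the empirical resolvent trace; the dimensions, the ReLU activation, and the scaling conventions are assumptions made for this example.

```python
# A minimal numerical sketch (assumed setup, not the derivation in the text):
# the empirical spectrum of a biased nonlinear random-features Gram matrix,
# probed via the resolvent trace evaluated just above the real axis.
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 2000, 1000, 1500              # samples, input dimension, features

X = rng.standard_normal((d, n))         # Gaussian data
W = rng.standard_normal((p, d))         # random weights
b = rng.standard_normal((p, 1))         # additive bias

F = np.maximum(W @ X / np.sqrt(d) + b, 0.0)   # ReLU features sigma(W x / sqrt(d) + b)
K = F.T @ F / p                               # n x n Gram matrix

eigs = np.linalg.eigvalsh(K)

def stieltjes(z, eigs):
    """Empirical Stieltjes transform m_n(z) = (1/n) tr (K - z I)^{-1}."""
    return np.mean(1.0 / (eigs - z))

# Im m_n(x + i*eta) / pi approximates the spectral density near x.
for x in (0.5, 1.0, 2.0, 4.0):
    m = stieltjes(x + 1e-2j, eigs)
    print(f"lambda = {x:4.1f}   approximate density = {m.imag / np.pi:.4f}")
```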
We show that the generalization error of deep random feature models coincides with that of Gaussian features with matched covariance, and we derive an explicit expression for this error.
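A toy experiment can illustrate this equivalence: ridge regression on single-layer nonlinear random features with bias versus on a Gaussian equivalent feature map whose first two moments (jointly with the input) are matched, here with per-feature coefficients estimated by Monte Carlo. The teacher, activation, sample sizes, and ridge penalty below are assumptions of the sketch, not the setting of the result.

```python
# A toy check of the Gaussian equivalence (illustrative assumptions throughout).
import numpy as np

rng = np.random.default_rng(1)
d, p = 200, 300                          # input dimension, number of features
n_train, n_test = 3000, 3000
lam = 1e-2                               # ridge penalty (assumed)

relu = lambda u: np.maximum(u, 0.0)

W = rng.standard_normal((p, d))          # random weights
b = rng.standard_normal(p)               # additive bias
theta = rng.standard_normal(d)           # linear teacher direction (assumed)

def teacher(X):
    return X @ theta / np.sqrt(d) + 0.1 * rng.standard_normal(X.shape[0])

def nonlinear_features(X):
    return relu(X @ W.T / np.sqrt(d) + b)

# Per-feature coefficients of the equivalent map, estimated by Monte Carlo:
# phi_i(x) ~= mu_i + kappa1_i * (w_i . x / sqrt(d)) + kstar_i * z_i,  z ~ N(0, I).
g = rng.standard_normal(20_000)
act = relu(b[None, :] + g[:, None])
mu = act.mean(axis=0)
kappa1 = (g[:, None] * act).mean(axis=0)
kstar = np.sqrt(np.maximum(act.var(axis=0) - kappa1 ** 2, 0.0))

def gaussian_equivalent_features(X):
    Z = rng.standard_normal((X.shape[0], p))
    return mu + kappa1 * (X @ W.T / np.sqrt(d)) + kstar * Z

def ridge_test_error(F_tr, y_tr, F_te, y_te):
    beta = np.linalg.solve(F_tr.T @ F_tr + lam * np.eye(p), F_tr.T @ y_tr)
    return np.mean((F_te @ beta - y_te) ** 2)

X_tr, X_te = rng.standard_normal((n_train, d)), rng.standard_normal((n_test, d))
y_tr, y_te = teacher(X_tr), teacher(X_te)

print("nonlinear features :", ridge_test_error(nonlinear_features(X_tr), y_tr,
                                                nonlinear_features(X_te), y_te))
print("Gaussian equivalent:", ridge_test_error(gaussian_equivalent_features(X_tr), y_tr,
                                               gaussian_equivalent_features(X_te), y_te))
```

Under the equivalence stated above, the two printed errors should be close when the dimensions grow proportionally; the Monte Carlo step could also be replaced by closed-form Gaussian integrals for the chosen activation.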
We derive an approximate formula for the generalization error of deep neural networks with structured (random) features, confirming a widely believed conjecture. We also show that our results can capture feature maps learned by deep, finite-width neural networks trained with gradient descent.
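The following sketch illustrates the pipeline rather than the formula itself: a small two-layer network is trained with full-batch gradient descent, and the learned hidden-layer map is then treated as a fixed feature map on which a ridge readout is fitted, exactly as one would for random features. The architecture, target, and hyperparameters are illustrative assumptions.

```python
# A sketch of feeding a gradient-descent-trained feature map into the same
# fixed-feature ridge analysis (architecture, target, and hyperparameters are
# illustrative assumptions, not the setting of the result).
import numpy as np

rng = np.random.default_rng(2)
d, p = 100, 256                          # input dimension, hidden width
n_train, n_test = 2000, 2000
lam, lr, steps = 1e-2, 2.0, 1000         # ridge penalty, learning rate, GD steps

relu = lambda u: np.maximum(u, 0.0)
theta = rng.standard_normal(d)

def teacher(X):                          # single-index target (assumed)
    return np.tanh(2.0 * (X @ theta) / np.sqrt(d))

X_tr, X_te = rng.standard_normal((n_train, d)), rng.standard_normal((n_test, d))
y_tr, y_te = teacher(X_tr), teacher(X_te)

W = rng.standard_normal((p, d))          # first-layer weights (trained)
a = rng.standard_normal(p)               # readout weights (trained)
W0 = W.copy()                            # keep the random initialization

for _ in range(steps):                   # full-batch gradient descent on MSE
    pre = X_tr @ W.T / np.sqrt(d)
    h = relu(pre)
    f = h @ a / np.sqrt(p)
    e = (f - y_tr) / n_train             # residuals scaled by 1/n
    grad_a = h.T @ e / np.sqrt(p)
    grad_pre = (e[:, None] * a[None, :]) / np.sqrt(p) * (pre > 0)
    grad_W = grad_pre.T @ X_tr / np.sqrt(d)
    a -= lr * grad_a
    W -= lr * grad_W

def readout_test_error(W_feat):
    """Ridge readout on the fixed feature map x -> relu(W_feat x / sqrt(d))."""
    F_tr = relu(X_tr @ W_feat.T / np.sqrt(d))
    F_te = relu(X_te @ W_feat.T / np.sqrt(d))
    beta = np.linalg.solve(F_tr.T @ F_tr + lam * np.eye(p), F_tr.T @ y_tr)
    return np.mean((F_te @ beta - y_te) ** 2)

print("random feature map (init) :", readout_test_error(W0))
print("GD-trained feature map    :", readout_test_error(W))
```

The comparison of the two printed errors is only meant to show how a trained feature map slots into the fixed-feature framework; how much the two differ depends on the training regime.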