FROG: My attempt to create an efficient second-order optimizer
FROG (Fisher ROw-wise PreconditioninG) is a second-order optimizer based on row-wise Fisher preconditioning. It uses joint Conjugate Gradient (CG) solves to approximate natural-gradient updates with low computational overhead, and Fisher trace-based normalization keeps the updates scale-free. The method applies to linear and convolutional layers and requires only a small number of CG iterations in practice. An implementation is available on GitHub.
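To make the CG idea concrete, here is a minimal NumPy sketch of a generic damped natural-gradient solve, (F + λI) x = g, using only Fisher-vector products. It is not the FROG implementation: it assumes a plain empirical Fisher built from per-sample gradients, leaves out the row-wise structure and the trace-based normalization, and the names `fisher_vector_product`, `conjugate_gradient`, `G`, and `damping` are hypothetical and chosen for illustration only.

```python
import numpy as np

def fisher_vector_product(per_sample_grads, v, damping):
    """Apply (F + damping * I) to v, where F = G^T G / N is the empirical Fisher.

    F is never formed explicitly; only two matrix-vector products are needed.
    """
    n = per_sample_grads.shape[0]
    return per_sample_grads.T @ (per_sample_grads @ v) / n + damping * v

def conjugate_gradient(matvec, b, num_iters=10, tol=1e-20):
    """Approximately solve A x = b for symmetric positive-definite A given as matvec."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0 initially
    p = r.copy()          # search direction
    rs_old = r @ r
    for _ in range(num_iters):
        if rs_old < tol:  # squared residual small enough: stop early
            break
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Toy data (assumption): 32 per-sample gradients for an 8-parameter model.
rng = np.random.default_rng(0)
G = rng.normal(size=(32, 8))
g = G.mean(axis=0)        # mean gradient
damping = 1e-2

# Approximate natural-gradient direction: (F + damping * I)^{-1} g.
step = conjugate_gradient(
    lambda v: fisher_vector_product(G, v, damping), g, num_iters=50
)
```

In this toy setting the system is small enough for CG to converge fully; in a real optimizer the iteration count would be capped at a small number and the result used as an approximate preconditioned update.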
Download: frog-technical-overview.pdf
Technical Overview