Nikita Breskanu Research Blog

Welcome to my research blog!

Here I share personal projects and research notes in deep learning, with a focus on optimization methods and model compression.

Posts

Apr 21, 2026
Fisher-based Optimizers in Deep Learning
This post summarizes the natural-gradient view of deep learning optimization and reviews practical Fisher-based approximations for linear layers in chronological order. The central question is how methods such as KFAC, EKFAC, Shampoo and SOAP make structured preconditioning cheap enough to use in neural networks.
Apr 18, 2026
Properties of the Fisher Information Matrix
This post introduces the Fisher Information Matrix and develops its main statistical and geometric properties. It concludes with a short discussion of what Fisher singularity means and how it arises in overparameterized models.
Feb 9, 2026
Introducing My Pruning Library
LLM pruning research is often hindered by the engineering complexity of reproducing activation-aware methods, which usually require custom hooks and intricate layer-wise management. To lower the barrier for experimentation, I developed nn-pruning: a modular PyTorch toolkit that standardizes activation collection and benchmarking. By decoupling pruning logic from the underlying model infrastructure, the project allows researchers to implement and compare new algorithms like Wanda or SparseGPT with minimal boilerplate.
Jan 25, 2026
FROG: My attempt to create an efficient second-order optimizer
FROG (Fisher ROw-wise PreconditioninG) is a second-order optimizer based on row-wise Fisher preconditioning. It uses joint Conjugate Gradient (CG) solves to approximate natural-gradient updates with low computational overhead, and Fisher trace-based normalization keeps the updates scale-free. The method applies to linear and convolutional layers and requires only a small number of CG iterations in practice. An implementation is available on GitHub.
Dec 1, 2025
Unstructured Pruning Methods
This note provides a personal mathematical deep-dive into unstructured pruning methods. I first cover one-shot methods including Optimal Brain Surgeon, SparseGPT, and Wanda, followed by training-based approaches such as Movement Pruning and oBERT. To my knowledge, this is a unique synthesis that provides both rigorous mathematical derivations and explicit connections between these disparate frameworks.
