rohit.vision

Transformers

Transformer architecture and variants

1. The Transformer Architecture [WIP]
   Self-Attention, Multi-Head Attention, and the Encoder-Decoder structure
2. Titans (Google Research) [WIP]
   Learning to memorize at test time and deep memory architectures

© 2026 Rohit Kumar. rohit.vision