Attention Mechanisms
Self-attention, multi-head attention, flash attention, and cross-attention

1. Attention Mechanisms
   Understanding attention in neural networks
2. Advanced Attention Architectures (WIP)
   MQA, GQA, SWA, MLA, and Dynamic Sparse Attention mechanisms