Jan 30, 2026 Implementing Flash Attention: Backward Pass in Triton
Jul 17, 2025 View Transformer Layers from Online Optimization Perspective