SLA2: Sparse-Linear Attention with Learnable Routing and QAT - Explained Simply | ArXiv Explained
