Qwen Open-Sources FlashQLA, Delivering 3x Faster Linear Attention and Outperforming FlashInfer by 5x
The Qwen Team has officially open-sourced FlashQLA, a high-performance operator library designed specifically for the Gated Delta Network (GDN), the linear attention layer that powers the entire Qwen series, includ...