Apr 19, 2025 Subquadratic Distillation: What Matters in Linearizing Language Models? A Comparative Study of Architecture, Scale, and Task Adaptation