Startup Subquadratic Claims Sparse-Attention LLM Cuts Costs and Boosts Speed

June 20, 2026

Announces SubQ, a sparse-attention LLM purportedly 56 times faster than FlashAttention-based models in speed benchmarks
Scores 89.7% on LiveCodeBench, placing it alongside top coding models from OpenAI, Google DeepMind, and Anthropic
Claims dramatic cost reduction, citing $8 versus $2,600 to run the same large-dataset retrieval test against Anthropic's Opus 4.6
Offers a 12-million-token context window, roughly 12 times larger than most current frontier models
Third-party evaluator Appen validates core architectural claims, though SubQ remains unavailable for broad public testing
Founders assert transformers could become obsolete within years if sparse-attention approaches gain wider adoption