Artisan Times

Beyond the Headlines

Google STATIC Framework Boosts LLM Retrieval Speed

The Google STATIC framework is changing how large language models handle constrained decoding. Researchers from Google DeepMind and YouTube introduced the system to improve generative retrieval.
Generative retrieval replaces traditional embedding-based search with large language models. Instead of matching embeddings, the model generates each item's Semantic ID as a sequence of tokens. However, this method struggles to respect business rules such as freshness or stock limits. As a result, a model can produce items that are invalid or unavailable, which creates risk for real-world platforms.
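
To make the problem concrete, here is a minimal sketch in Python. The vocabulary size, ID length, and catalog are illustrative assumptions, not values from the STATIC work. It counts how many token sequences an unconstrained model could emit and how many of them map to a real item.

```python
from itertools import product

VOCAB_SIZE = 4   # assumed toy vocabulary, for illustration only
ID_LENGTH = 3    # assumed number of tokens per Semantic ID

# Hypothetical catalog: the only Semantic IDs that map to real items.
CATALOG = {(0, 1, 2), (0, 1, 3), (0, 2, 0), (3, 0, 1)}

# Every sequence an unconstrained decoder could produce.
all_sequences = list(product(range(VOCAB_SIZE), repeat=ID_LENGTH))
invalid = [seq for seq in all_sequences if seq not in CATALOG]

print(len(all_sequences), len(CATALOG), len(invalid))  # 64 4 60
```

Even in this toy setting, most reachable sequences point at nothing, which is why decoding needs a constraint.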

From Tries to Sparse Matrices

Developers often use prefix trees, or tries, to block invalid outputs. Yet tries run poorly on TPUs and GPUs because they rely on pointer-based memory access and data-dependent branching.
Accelerator hardware therefore cannot optimize them well. Static-graph compilers such as XLA also struggle with such structures.
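
A conventional trie-based mask might look like the sketch below. This is a generic illustration, not code from the STATIC work, and the function names and toy catalog are hypothetical. Note the per-token dictionary lookups and the early return, which are the pointer chasing and data-dependent branching described above.

```python
import math

def build_trie(semantic_ids):
    """Build a nested-dict prefix tree over valid Semantic IDs (token tuples)."""
    root = {}
    for sid in semantic_ids:
        node = root
        for token in sid:
            node = node.setdefault(token, {})
    return root

def mask_logits(logits, trie, prefix):
    """Set the logit of every token that cannot follow `prefix` to -inf."""
    node = trie
    for token in prefix:        # pointer chasing: one lookup per generated token
        node = node.get(token)
        if node is None:        # data-dependent branch: prefix left the trie
            return [-math.inf] * len(logits)
    return [x if t in node else -math.inf for t, x in enumerate(logits)]

# Hypothetical catalog with a 4-token vocabulary.
VALID_IDS = [(0, 1, 2), (0, 1, 3), (0, 2, 0), (3, 0, 1)]
trie = build_trie(VALID_IDS)
masked = mask_logits([0.5, 0.1, 0.9, 0.2], trie, prefix=(0,))
print(masked)  # [-inf, 0.1, 0.9, -inf]
```

After the prefix `(0,)`, only tokens 1 and 2 lead to a real item, so the other logits are masked out.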
The Google STATIC framework addresses this issue. It converts the trie into a Compressed Sparse Row (CSR) matrix, so trie traversal can run as fully vectorized operations on hardware accelerators.
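
The idea of flattening a trie into CSR form can be sketched as follows. This is an assumption-laden illustration rather than the STATIC implementation: the array names `row_ptr`, `tokens`, and `next_state`, the breadth-first numbering, and the toy catalog are all hypothetical. Each trie node becomes a row, and its outgoing edges sit in one contiguous slice of two flat arrays.

```python
from collections import deque

def trie_to_csr(semantic_ids):
    """Flatten a prefix tree into CSR-style arrays (state 0 is the root)."""
    root = {}
    for sid in semantic_ids:
        node = root
        for tok in sid:
            node = node.setdefault(tok, {})

    row_ptr, tokens, next_state = [0], [], []
    queue = deque([root])
    n_states = 1
    while queue:                      # breadth-first: rows are emitted in state-id order
        node = queue.popleft()
        for tok in sorted(node):
            tokens.append(tok)        # column index = token id
            next_state.append(n_states)
            n_states += 1
            queue.append(node[tok])
        row_ptr.append(len(tokens))   # end of this state's edge slice
    return row_ptr, tokens, next_state

def allowed_tokens(csr, state):
    """Tokens that may follow `state`: one contiguous slice, no pointers."""
    row_ptr, tokens, _ = csr
    return tokens[row_ptr[state]:row_ptr[state + 1]]

# Hypothetical catalog with a 4-token vocabulary.
csr = trie_to_csr([(0, 1, 2), (0, 1, 3), (0, 2, 0), (3, 0, 1)])
print(csr[0])                  # [0, 2, 4, 5, 7, 8, 9, 9, 9, 9, 9]
print(allowed_tokens(csr, 0))  # [0, 3]
print(allowed_tokens(csr, 1))  # [1, 2]
```

Plain Python lists stand in for device arrays here. On an accelerator, the same slices would be read with array gathers instead of loops over linked nodes.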
In addition, STATIC uses a hybrid decoding method. Early trie levels, which cover the first tokens of a Semantic ID, apply dense masking for fast lookups. Deeper levels use a branch-free kernel to keep computation static.
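
One way to picture the hybrid scheme is the sketch below. It is not the STATIC kernel: the constants describe a hypothetical four-item catalog, and `MAX_BRANCH`, the padding, and the function names are assumptions made for illustration. The root level uses a dense table indexed by token, while deeper levels read a fixed number of CSR slots and turn validity into a value rather than a branch.

```python
VOCAB = 4
MAX_BRANCH = 2  # assumed: largest fan-out of any non-root state, known offline

# CSR arrays for the toy catalog {(0,1,2), (0,1,3), (0,2,0), (3,0,1)}.
ROW_PTR = [0, 2, 4, 5, 7, 8, 9, 9, 9, 9, 9]
TOKENS = [0, 3, 1, 2, 0, 2, 3, 0, 1]
NEXT = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Dense table for the root level: DENSE_NEXT[token] is the next state, or -1.
DENSE_NEXT = [-1] * VOCAB
for e in range(ROW_PTR[0], ROW_PTR[1]):
    DENSE_NEXT[TOKENS[e]] = NEXT[e]
DENSE_MASK = [s >= 0 for s in DENSE_NEXT]

# Pad so a fixed-width read past the last edge stays in bounds.
PAD_TOKENS = TOKENS + [0] * MAX_BRANCH
PAD_NEXT = NEXT + [0] * MAX_BRANCH

def sparse_mask(state):
    """Allowed-token mask using a fixed trip count, whatever the state is."""
    start, end = ROW_PTR[state], ROW_PTR[state + 1]
    mask = [False] * VOCAB
    for k in range(MAX_BRANCH):
        e = start + k
        valid = e < end              # computed as a value, not used to branch
        mask[PAD_TOKENS[e]] |= valid
    return mask

def sparse_step(state, token):
    """Next state after `token`, or -1, using arithmetic select instead of `if`."""
    start, end = ROW_PTR[state], ROW_PTR[state + 1]
    nxt = -1
    for k in range(MAX_BRANCH):
        e = start + k
        hit = int((e < end) & (PAD_TOKENS[e] == token))
        nxt = hit * PAD_NEXT[e] + (1 - hit) * nxt
    return nxt

def allowed_mask(level, state):
    # `level` is known when the decode loop is unrolled, so this choice is static.
    return DENSE_MASK if level == 0 else sparse_mask(state)

print(allowed_mask(0, 0))   # [True, False, False, True]
print(allowed_mask(1, 1))   # [False, True, True, False]
print(sparse_step(1, 2))    # 4
```

Because every state does the same amount of work, the loop can be unrolled into a fixed set of gathers, which is the property a static compilation graph needs.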
This design achieves O(1) lookup complexity with respect to the number of constraints. Older methods scaled logarithmically, so performance degraded as the constraint set grew.

Real-World Impact and Results

Tests on TPU v6e showed strong gains. STATIC reduced per-step latency to just 0.033 milliseconds. That equals a 948x speedup over CPU-based tries.
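
As a back-of-the-envelope check, the two reported figures imply a per-step cost for the CPU baseline. The sketch assumes the 948x ratio compares per-step latencies on the same workload, which is not stated explicitly here.

```python
static_latency_ms = 0.033   # reported STATIC per-step latency
speedup = 948               # reported speedup over CPU-based tries

# Implied CPU trie latency per step, under the assumption above.
cpu_latency_ms = static_latency_ms * speedup
print(f"{cpu_latency_ms:.1f} ms")  # 31.3 ms
```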
Deployment on YouTube enforced a seven-day freshness rule. The system handled 20 million items with full compliance. Consequently, fresh video views increased by 5.1%.
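
A freshness rule like this one can be expressed as a filter that decides which Semantic IDs enter the constraint set. The sketch below is a hypothetical illustration: the catalog records, the field names `semantic_id` and `uploaded_at`, and the dates are assumptions, not details of the YouTube deployment.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_WINDOW = timedelta(days=7)

def fresh_semantic_ids(catalog, now):
    """Return the Semantic IDs of items uploaded within the freshness window."""
    cutoff = now - FRESHNESS_WINDOW
    return [item["semantic_id"] for item in catalog
            if item["uploaded_at"] >= cutoff]

# Hypothetical catalog records.
now = datetime(2025, 1, 15, tzinfo=timezone.utc)
catalog = [
    {"semantic_id": (0, 1, 2), "uploaded_at": datetime(2025, 1, 14, tzinfo=timezone.utc)},
    {"semantic_id": (0, 2, 0), "uploaded_at": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"semantic_id": (3, 0, 1), "uploaded_at": datetime(2025, 1, 9, tzinfo=timezone.utc)},
]

valid_ids = fresh_semantic_ids(catalog, now)
print(valid_ids)  # [(0, 1, 2), (3, 0, 1)]
```

Only the IDs that pass the filter would be compiled into the decoding constraint, so the model cannot generate an item older than the window.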
Click-through rates also rose slightly. Moreover, the system improved cold-start recommendations on Amazon Reviews datasets. Overall, the results suggest that careful architecture design can unlock large efficiency gains in AI retrieval systems.
