Liquid

LFM2-24B-A2B

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per token, it delivers high-quality generation while maintaining low inference costs. The model fits within 32 GB of RAM, making it practical to run on consumer laptops and desktops without sacrificing capability.

Input / 1M tokens: $0.030
Output / 1M tokens: $0.120
Context window: 33K tokens
Provider: Liquid

Performance

Median streaming throughput and first-token latency measured by Artificial Analysis.

Output tokens / sec: —
Time to first token: —