Researchers say they trained a foundation model from scratch for about $1,500

Sapient researchers claim they trained a foundation language model from scratch for approximately $1,500, challenging the conventional wisdom that building LLMs requires millions in compute and massive datasets.

The team developed HRM-Text, which replaces the standard Transformer architecture with a Hierarchical Reasoning Model (HRM). According to the researchers, this alternative architecture achieves its sample efficiency by splitting computation into two layers: a slow-evolving strategic layer and a fast-evolving execution layer. Rather than relying on brute-force autoregressive prediction across raw text, HRM-Text trains exclusively on structured representations.
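To make the two-timescale idea concrete, here is a minimal toy sketch in Python. It is not Sapient's code or the actual HRM update rule: the function name `run_two_timescale`, the `slow_period` parameter, the scalar states, and the fixed weights are all assumptions chosen for illustration. It only shows the general pattern of a fast state that updates every step and a slow state that updates once every few steps.

```python
import math


def run_two_timescale(inputs, slow_period=4):
    """Toy two-timescale recurrence (illustrative only, not the HRM algorithm).

    `fast` is updated on every step; `slow` is updated once every
    `slow_period` steps. Weights are arbitrary constants, not learned.
    """
    slow = 0.0  # slow-evolving "strategic" state (hypothetical scalar)
    fast = 0.0  # fast-evolving "execution" state (hypothetical scalar)
    slow_updates = 0
    trace = []

    for step, x in enumerate(inputs, start=1):
        # Fast layer: reacts to each input, conditioned on the current slow state.
        fast = math.tanh(0.5 * fast + 0.3 * slow + x)

        # Slow layer: absorbs the fast state only at the end of each cycle.
        if step % slow_period == 0:
            slow = math.tanh(0.9 * slow + 0.5 * fast)
            slow_updates += 1

        trace.append((step, fast, slow))

    return trace, slow_updates


if __name__ == "__main__":
    trace, slow_updates = run_two_timescale([0.1] * 8, slow_period=4)
    for step, fast, slow in trace:
        print(f"step={step} fast={fast:.4f} slow={slow:.4f}")
    print("slow updates:", slow_updates)
```

In this sketch the slow state stays fixed between cycle boundaries, so the fast state does several steps of work under one stable "plan" before that plan is revised. A real model would use learned vector-valued states and parameters rather than scalars and constants.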

The cost reduction matters because training foundation models typically demands internet-scale datasets and thousands of GPUs, placing the technology out of reach for most enterprises and research groups. Sapient's approach suggests this barrier may not be immovable.

HRM-Text is the latest in a growing line of research questioning whether the current scaling paradigm, which prioritizes data volume and computational power, remains the optimal path. The model builds on the original HRM, introduced last year, which reportedly reasoned 100x faster than standard LLMs after training on only 1,000 examples.

The core insight concerns architecture rather than scale. By restructuring how models process and predict text, Sapient says it achieves efficiency gains that bypass the need for massive parameter counts and enormous training runs. This aligns with emerging research suggesting that model architecture and training methodology matter as much as sheer scale.

However, the claim warrants scrutiny. Training cost alone does not indicate capability parity with models like GPT-4 or Claude. The results likely apply to narrow domains or smaller model classes. Questions remain about HRM-Text's performance on diverse benchmarks and whether the $1,500 figure includes all infrastructure costs.

If validated, the work could democratize foundation model development. Smaller teams and resource-constrained organizations
