Jensen Huang's claim that we have achieved artificial general intelligence collides hard with actual benchmark data. Frontier AI models score just 0.37% on ARC-AGI-3, a test that presents interactive environments with no predefined rules or goals. Humans solve these tasks at 100%. The gap reveals the core limitation of current AI: these systems excel at pattern matching within their training distribution but fail almost entirely when facing novel situations without instruction.
This constraint directly shapes what AI can replace in practice today. Current architectures dominate standardized exams and text-based tasks because those fit their training distribution. Real-world work that requires adaptation, improvisation, or handling unexpected scenarios remains firmly human territory.
The AI infrastructure boom reflects this reality. This week alone saw $25 billion in deals targeting data pipelines and real-world applications rather than model development. IBM acquired Confluent for $11 billion to control real-time data streaming. Eli Lilly purchased Insilico's drug discovery pipelines for $2.75 billion. Physical Intelligence raised $1 billion for robot control systems. Building better language models has become table stakes. The defensible value now sits in owning the data flow between models and the physical world.
This inversion matters strategically. Companies that control how AI systems interact with actual business processes, manufacturing lines, healthcare data, or logistics networks have built competitive moats. A superior LLM becomes commoditized quickly. A proprietary pipeline connecting models to real-time decision-making stays defensible.
The ARC-AGI test exposes the hype-reality gap that investors are now pricing in. AI Weekly's framing underscores what practitioners already know: we have powerful pattern-matching tools, not general intelligence. The next wave of AI value comes not from bigger models but from deeper integration into domain-specific workflows where data and expertise create lock-in.