Maarten Grootendorst, creator of BERTopic and a developer relations engineer at Google DeepMind, argues that foundational techniques like embeddings and topic modeling remain essential tools even as large language models dominate AI conversations.
In a discussion with O'Reilly's Ben Lorica, Grootendorst emphasizes building practitioner intuition over prompt engineering tricks. His core argument centers on agentic systems, which he frames as fundamentally simple architectures wrapped in marketing language. Rather than treating agents as revolutionary technology, he positions them as logical extensions of existing patterns in AI development.
Embeddings continue to power semantic search, retrieval-augmented generation, and clustering tasks that LLMs alone cannot efficiently perform. Grootendorst notes that topic modeling, despite predictions of its obsolescence, solves real problems in document understanding and unsupervised learning that remain relevant at scale. These classical techniques fit specific practical constraints: they are computationally efficient, interpretable, and require less data than training large models from scratch.
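To make the semantic search claim concrete, here is a minimal sketch of embedding-based retrieval using cosine similarity. The three-dimensional vectors, the document titles, and the function names are hypothetical and chosen only for illustration; in practice the vectors would come from an embedding model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings (assumption: hand-written, not model output).
DOC_VECS = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "gpu benchmarks": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a query such as "how do I get my money back".
QUERY_VEC = [0.8, 0.2, 0.0]

results = search(QUERY_VEC, DOC_VECS)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The ranking needs only a dot product and two norms per document, which is part of why this kind of retrieval stays cheap compared with sending every document through an LLM.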
The distinction matters for practitioners building production systems. Chasing the latest LLM release often means ignoring proven methods that cost less and fail more predictably. Grootendorst's perspective cuts through hype by treating AI engineering as an engineering problem rather than a technology race.
His work at Google DeepMind reflects this pragmatic approach. BERTopic, which pairs transformer-based embeddings with clustering and a class-based TF-IDF weighting to extract topics, shows how classical and modern techniques can work together. The tool remains widely used because it solves a genuine problem efficiently.
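As a rough illustration of the classical half of that combination, the sketch below scores terms per cluster with a class-based TF-IDF of the form tf(t, c) * log(1 + A / f(t)), where A is the average number of words per cluster and f(t) is the term's frequency across all clusters. It assumes the documents have already been grouped (in BERTopic that grouping comes from clustering embeddings), uses raw counts and whitespace tokenization, and is not the library's implementation; the cluster names, sample texts, and function names are hypothetical.

```python
import math
from collections import Counter


def class_tf_idf(docs_by_cluster):
    """Score each term per cluster: tf(t, c) * log(1 + A / f(t))."""
    # Treat every cluster as one large bag of words.
    term_counts = {
        cluster: Counter(word for doc in docs for word in doc.lower().split())
        for cluster, docs in docs_by_cluster.items()
    }
    # f(t): frequency of each term across all clusters.
    total_freq = Counter()
    for counts in term_counts.values():
        total_freq.update(counts)
    # A: average number of words per cluster.
    avg_words = sum(total_freq.values()) / len(term_counts)
    return {
        cluster: {
            term: tf * math.log(1 + avg_words / total_freq[term])
            for term, tf in counts.items()
        }
        for cluster, counts in term_counts.items()
    }


def top_terms(scores, n=2):
    """Highest-scoring terms per cluster; ties are broken alphabetically."""
    return {
        cluster: [
            term
            for term, _ in sorted(
                term_scores.items(), key=lambda pair: (-pair[1], pair[0])
            )[:n]
        ]
        for cluster, term_scores in scores.items()
    }


# Hypothetical pre-clustered documents (assumption: cluster labels are given).
CLUSTERS = {
    "billing": ["refund invoice error", "invoice refund", "refund payment"],
    "hardware": ["gpu driver error", "gpu memory error", "gpu crash"],
}

scores = class_tf_idf(CLUSTERS)
topics = top_terms(scores)
print(topics)
```

In the billing cluster, "error" and "payment" each occur once, but "error" also appears in the hardware cluster, so it receives the lower weight. That down-weighting of shared vocabulary is what lets a handful of top terms serve as a readable topic label.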
The interview signals a shift in how technical leaders discuss AI maturity. Rather than asking "what's the newest model," the conversation turns to "what actually works for this problem." Embeddings and topic models answer that question for entire classes of applications. Agentic systems, stripped of buzzword coating, operate on
