Amazon is racing to reduce its AI costs before Anthropic switches to token-based pricing next year. The company's engineers are actively distilling Anthropic's Claude models into smaller, cheaper versions for internal use, according to reporting on the shift.

Currently, Amazon pays Anthropic based on compute hours. The new pricing model charges per token processed, a change that could significantly increase expenses. Because token-based pricing ties costs directly to usage, a heavy user can no longer lower its bill simply by getting more work out of the compute hours it already pays for.

The distillation strategy lets Amazon compress the capabilities of Anthropic's models into leaner models that require less computational power to run. Smaller models need less compute for each token they process, which lowers per-query costs. The approach is common in the industry: companies like Meta and Google have distilled their own large models for deployment at scale.
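
For readers unfamiliar with distillation, the sketch below shows the classic soft-target form of the technique, in which a small "student" model is trained to match the output distribution of a large "teacher". It is an illustration only: the reporting does not describe Amazon's actual method, and this form assumes access to the teacher's raw scores (logits), which is not always available. The function names and the temperature value are hypothetical choices for the example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log-probabilities of the temperature-softened softmax of `logits`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's.

    Assumption: both models score the same set of candidate outputs.
    The T^2 factor is the conventional scaling used with soft targets.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


# Made-up scores for three candidate outputs.
teacher = [2.0, 1.0, 0.1]
student_near = [1.9, 1.0, 0.2]  # already close to the teacher
student_far = [0.1, 1.0, 2.0]   # ranks the candidates in reverse

loss_near = distillation_loss(teacher, student_near)
loss_far = distillation_loss(teacher, student_far)
```

Training pushes this loss toward zero, so a student that already mimics the teacher (`student_near`) scores lower than one that disagrees with it (`student_far`).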

Amazon's move reflects tension between rapid AI adoption and rising operational costs. The company invested up to $4 billion in Anthropic and relies on Claude for various applications. Yet the pricing change creates financial pressure to act now, before token costs climb.

Amazon is not limiting itself to Anthropic. The company is also exploring alternatives like OpenAI's models, which could offer different pricing structures or performance characteristics. This evaluation suggests Amazon wants flexibility in its vendor selection to manage long-term costs.

The distillation work began before the pricing announcement, but the timing amplifies its urgency. Companies with heavy model usage face a critical window to optimize their infrastructure. Amazon's scale makes even small per-token savings multiply across millions of requests.

This pattern repeats across major cloud providers and enterprises. As AI model pricing stabilizes or increases, the business case for smaller, specialized models strengthens. Distillation trades training complexity for deployment efficiency, a tradeoff that pays off at Amazon's scale.
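
The tradeoff can be made concrete with a back-of-the-envelope calculation. The sketch below is hypothetical: the function names are invented for illustration, and none of the figures come from the reporting. It assumes a one-time cost to build the distilled model and a fixed cost per request for each model.

```python
import math


def total_savings(requests, large_cost_per_request, small_cost_per_request):
    """Money saved by serving `requests` with the smaller model."""
    return requests * (large_cost_per_request - small_cost_per_request)


def break_even_requests(distillation_cost, large_cost_per_request,
                        small_cost_per_request):
    """Requests needed before per-request savings repay the one-time cost.

    Returns None when the smaller model is not cheaper per request,
    in which case the up-front cost is never recovered.
    """
    saving = large_cost_per_request - small_cost_per_request
    if saving <= 0:
        return None
    return math.ceil(distillation_cost / saving)


# Illustrative figures only (all hypothetical).
needed = break_even_requests(distillation_cost=1000.0,
                             large_cost_per_request=0.5,
                             small_cost_per_request=0.25)
saved = total_savings(requests=1_000_000,
                      large_cost_per_request=0.5,
                      small_cost_per_request=0.25)
```

Under these assumptions the break-even point falls as request volume rises relative to the up-front cost, which is why the approach tends to favor the heaviest users.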

The shift underscores how AI economics drive infrastructure decisions. Pricing models shape which technologies companies