Baidu has developed "Unlimited OCR," a document recognition system that processes dozens of pages in a single pass, roughly tripling the capacity of existing optical character recognition models. Previous systems topped out at around ten pages before hitting memory constraints.
The breakthrough centers on a modified attention mechanism that keeps memory consumption flat regardless of how many pages the model processes. The design loosely mimics human memory: it selectively retains important information and discards less relevant data, avoiding the memory growth with sequence length that typically limits context size in transformer-based models.
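To make the memory argument concrete, here is a back-of-the-envelope sketch comparing a standard key-value cache, which grows with every token, against a cache capped at a fixed token budget. All model dimensions, the tokens-per-page figure, and the budget are illustrative assumptions, not values reported for Unlimited OCR.

```python
# Illustrative arithmetic only: every constant below is an assumption,
# not a published figure for Baidu's model.

def kv_cache_bytes(num_tokens, num_layers=24, num_heads=16, head_dim=64,
                   bytes_per_value=2):
    """Size of a standard key-value cache: two tensors (K and V) per layer,
    each holding num_tokens x num_heads x head_dim values."""
    return 2 * num_layers * num_tokens * num_heads * head_dim * bytes_per_value


def bounded_cache_bytes(num_tokens, budget_tokens=4000, **model_dims):
    """Size of a cache that never keeps more than budget_tokens entries
    (hypothetical budget), so it stops growing once the budget is reached."""
    return kv_cache_bytes(min(num_tokens, budget_tokens), **model_dims)


TOKENS_PER_PAGE = 1000  # assumed average; real pages vary widely

for pages in (1, 10, 30):
    tokens = pages * TOKENS_PER_PAGE
    full_mib = kv_cache_bytes(tokens) / 2**20
    capped_mib = bounded_cache_bytes(tokens) / 2**20
    print(f"{pages:>2} pages: full cache {full_mib:8.1f} MiB, "
          f"capped cache {capped_mib:8.1f} MiB")
```

Under these assumptions the full cache scales in direct proportion to page count, while the capped cache stays at the same size once the budget is filled.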
The system currently ranks first on major OCR benchmarks, a notable result in a field where accuracy and efficiency directly affect document processing workflows across industries. For enterprises handling large document batches, the gain is practical: processing dozens of pages without reloading or chunking documents saves computational overhead and reduces latency.
The technology addresses a real bottleneck in document digitization. Most OCR systems require splitting long documents into smaller sections, processing each separately, then stitching results together. This approach introduces complexity, potential alignment errors, and redundant computations. Unlimited OCR eliminates this workflow friction.
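The split, process, and stitch workflow described above can be sketched in a few lines. The `recognize` callable and the ten-page limit are hypothetical stand-ins for whatever OCR engine and page cap a given pipeline uses; this is not any specific product's API.

```python
from typing import Callable, List, Sequence


def chunk_pages(pages: Sequence, max_pages: int) -> List[Sequence]:
    """Split a document into consecutive chunks of at most max_pages pages."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [pages[i:i + max_pages] for i in range(0, len(pages), max_pages)]


def ocr_in_chunks(pages: Sequence,
                  recognize: Callable[[Sequence], List[str]],
                  max_pages: int = 10) -> List[str]:
    """Run a page-limited OCR engine chunk by chunk and stitch the results.

    Each call to recognize() sees only its own chunk, so any context that
    spans a chunk boundary is lost; that is the friction the article describes.
    """
    stitched: List[str] = []
    for chunk in chunk_pages(pages, max_pages):
        stitched.extend(recognize(chunk))
    return stitched


def fake_recognize(chunk: Sequence) -> List[str]:
    """Stand-in recognizer used only to make the sketch runnable."""
    return [f"text of page {page}" for page in chunk]


document = list(range(1, 26))  # a 25-page document, pages numbered 1..25
texts = ocr_in_chunks(document, fake_recognize, max_pages=10)
print(len(chunk_pages(document, 10)), "chunks,", len(texts), "pages recognized")
```

A model that accepts the whole document in one pass would replace the loop with a single `recognize(pages)` call, removing both the chunk bookkeeping and the boundary effects.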
Baidu's approach treats the OCR task as a sequence problem in which the model learns which contextual information matters most. By applying a selective memory dropout that resembles human forgetting, the system stays focused on task-relevant text while pruning redundant context. This differs from standard attention mechanisms, whose memory requirements grow linearly with sequence length.
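The article does not describe how the pruning is implemented, so the following is only a generic sketch of the idea: a context store with a fixed budget that evicts the lowest-scoring entry whenever a more important one arrives. The `BoundedContext` class and the importance scores are hypothetical; in a real model the scores would come from the network itself, not from hand-assigned numbers.

```python
import heapq
from typing import List, Tuple


class BoundedContext:
    """Keeps at most `budget` entries, dropping the least important first.

    This is a generic eviction sketch, not Baidu's published mechanism.
    """

    def __init__(self, budget: int) -> None:
        if budget < 1:
            raise ValueError("budget must be at least 1")
        self.budget = budget
        # Min-heap of (score, position, token): the lowest score sits on top.
        self._heap: List[Tuple[float, int, str]] = []

    def add(self, position: int, token: str, score: float) -> None:
        entry = (score, position, token)
        if len(self._heap) < self.budget:
            heapq.heappush(self._heap, entry)
        else:
            # Push the new entry, then pop the smallest; if the new entry
            # is the least important, it is the one that gets discarded.
            heapq.heappushpop(self._heap, entry)

    def tokens(self) -> List[str]:
        """Retained tokens in their original reading order."""
        return [tok for _, _, tok in sorted(self._heap, key=lambda e: e[1])]

    def __len__(self) -> int:
        return len(self._heap)


# Hand-assigned importance scores, purely for illustration.
stream = [("Invoice", 0.9), ("the", 0.1), ("Total", 0.8),
          ("of", 0.05), ("$420", 0.95)]

context = BoundedContext(budget=3)
for position, (token, score) in enumerate(stream):
    context.add(position, token, score)

print(context.tokens())
```

Because the store never holds more than `budget` entries, its memory footprint is the same after five tokens or five million, which is the property the article attributes to the modified attention mechanism.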
The practical implications extend beyond speed. Multi-page processing in one pass allows the model to maintain cross-page context, potentially improving accuracy on documents where meaning spans multiple pages. This becomes valuable for processing contracts, technical manuals, and archival documents where information continuity matters.
Baidu has published this work, making the architecture available for research and potential deployment. The competitive landscape for OCR remains crowded, but a threef
