
kreuzberg
A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 91+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.
The Lens
Kreuzberg rips text, metadata, and structured data out of 91+ file formats. PDFs, Word docs, images, source code in 248 languages, you name it. The Rust core makes it fast, and bindings exist for Python, Node.js, Go, Ruby, Java, and C#, among others. It is source available under the Elastic License, not open source in the strict sense.
Deploying it is straightforward: Docker container, CLI binary, REST API, or even an MCP server for AI tool chains. The image is around 1.3GB because of OCR backends (Tesseract, PaddleOCR), but once it's running, it handles batch processing with configurable parallelism and streaming for large files.
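To make "configurable parallelism" concrete, here is a minimal sketch of a bounded-concurrency batch run. It does not use Kreuzberg's API: `extract_text` is a hypothetical stand-in that simply reads a file as text, and `batch_extract` and `max_workers` are names chosen for illustration. Swap in whatever extraction call your binding provides.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract_text(path: Path) -> str:
    """Hypothetical placeholder for a real extraction call.

    It only reads the file as UTF-8 text; it is NOT Kreuzberg's API.
    """
    return path.read_text(encoding="utf-8", errors="replace")


def batch_extract(paths, max_workers=4):
    """Process every path with at most `max_workers` files in flight.

    Returns a dict mapping each Path to its extracted text, or to the
    exception raised for that file.
    """
    def safe(path):
        try:
            return extract_text(path)
        except Exception as exc:  # one bad file should not abort the batch
            return exc

    paths = [Path(p) for p in paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe, paths))  # map preserves input order
    return dict(zip(paths, results))
```

Raising `max_workers` trades memory and CPU for throughput. Returning errors per file, rather than raising, lets a long batch finish and report its failures at the end.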
Solo devs building document pipelines get immediate value. Teams doing search indexing or RAG will appreciate the format coverage, since most alternatives force you to stitch together multiple libraries. One tool that handles everything from scanned receipts to source code.
The catch: the Elastic License means you can't offer it as a managed service without a commercial agreement. Building an internal tool? You're fine. Reselling document extraction? Talk to their team first.
Free vs Self-Hosted vs Paid
Source available
**Free tier:** Full framework, all 91+ formats, all language bindings. Free for internal and non-competing use.
**Self-hosted:** Docker (~1.3GB image), CLI, or REST API. No licensing fees for internal use. OCR backends (Tesseract, PaddleOCR) included. Moderate compute requirements for batch processing.
**Paid tier:** Commercial license required if you offer document extraction as a hosted service that competes with Kreuzberg. Contact the team for pricing.
Free for internal use. Commercial license needed if you're reselling extraction as a service.
License: Other
Review license manually.
Commercial use: ✗ Restricted
About
- Owner: Kreuzberg (Organization)
- Stars: 7,550
- Forks: 377