
Trino
Distributed SQL query engine for big data
Coldcast Lens
Trino is a distributed SQL engine that queries data where it lives (S3, Postgres, MySQL, Kafka, Elasticsearch) without moving it. It began as PrestoSQL, a fork started after the original Presto creators left Facebook, and was later renamed Trino. It's now the most actively developed query engine for data lakes, with 3x the development velocity of Presto.
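As a rough illustration of querying data where it lives, here is a minimal Python sketch of one federated query that joins files on S3 with a live Postgres table. The catalog names (`hive`, `postgresql`), the schemas, tables, and columns, and the host, port, and user are all hypothetical. It assumes the `trino` Python client is installed and that both catalogs are already configured on the cluster.

```python
# Hypothetical catalogs, schemas, tables, and columns; adjust to your cluster.
FEDERATED_QUERY = """
SELECT c.region, count(*) AS orders, sum(o.total) AS revenue
FROM hive.sales.orders AS o            -- files on S3 (assumed catalog "hive")
JOIN postgresql.public.customers AS c  -- live Postgres table (assumed catalog "postgresql")
  ON o.customer_id = c.id
GROUP BY c.region
ORDER BY revenue DESC
"""


def run_federated_query(host="localhost", port=8080, user="analyst"):
    """Send the query to a Trino coordinator and return all result rows."""
    # Imported lazily so this module loads even without the client installed.
    from trino.dbapi import connect  # pip install trino

    conn = connect(host=host, port=port, user=user)
    cur = conn.cursor()
    cur.execute(FEDERATED_QUERY)
    return cur.fetchall()
```

Each table is addressed as `catalog.schema.table`, so the join spans both systems in a single statement and no data is copied into a warehouse first.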
If you need interactive analytics across multiple data sources without building an ETL pipeline, Trino is the tool. Apache Spark handles batch processing and ML workloads better. PrestoDB (the original Facebook project that Trino forked from) is similar but slower-moving. DuckDB is the in-process alternative for single-machine analytics. Starburst is the commercial Trino distribution with enterprise support.
The catch: Trino is an interactive query engine, not a batch processor. Large joins can OOM your cluster because Trino keeps intermediate data in memory. Fault-tolerant execution mode exists but is newer and slower. And running a Trino cluster is real infrastructure: coordinators, workers, catalogs, and memory tuning. For most indie projects, DuckDB on a single machine is all you need.
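To make the memory risk concrete, here is a back-of-envelope Python sketch that estimates whether the in-memory side of a join fits on each worker. It is a deliberately simplified model, not Trino's actual memory accounting: it assumes rows are spread evenly with no skew, and it ignores everything else that competes for worker memory. The function names and the sizes in the example are hypothetical. The `BROADCAST` and `PARTITIONED` labels mirror Trino's join distribution types.

```python
GIB = 1024 ** 3


def build_side_bytes_per_worker(build_bytes, workers, distribution):
    """Estimate how much of the join's in-memory side each worker must hold."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    if distribution == "BROADCAST":
        # Every worker receives a full copy of the in-memory side.
        return build_bytes
    if distribution == "PARTITIONED":
        # The in-memory side is hash-partitioned across workers.
        # Assumes an even split (no skew); rounds up.
        return -(-build_bytes // workers)
    raise ValueError(f"unknown distribution: {distribution!r}")


def join_fits(build_bytes, workers, per_node_limit_bytes, distribution="PARTITIONED"):
    """Return True if the estimated per-worker share stays within the limit."""
    share = build_side_bytes_per_worker(build_bytes, workers, distribution)
    return share <= per_node_limit_bytes


# Hypothetical cluster: 10 workers, 32 GiB usable per node, 200 GiB in-memory side.
partitioned_ok = join_fits(200 * GIB, 10, 32 * GIB, "PARTITIONED")
broadcast_ok = join_fits(200 * GIB, 10, 32 * GIB, "BROADCAST")
```

In this example the partitioned join needs about 20 GiB per worker and fits, while the broadcast join needs the full 200 GiB on every worker and does not. Real queries carry additional overhead, so treat the result as a rough estimate only.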
About
- Stars: 12,663
- Forks: 3,546