
gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
The Lens
Apache Gravitino is an open-source data catalog that gives you a unified metadata layer across many sources: Hive, MySQL, S3, HDFS, Iceberg, Lance, and more. It is free under the Apache 2.0 license. The pitch is federation: instead of copying metadata into a central catalog, it reflects each source live, so what you query is what's actually there.
It plugs into Trino and Spark as a query catalog, supports geo-distributed metadata syncing, and adds access control and auditing across your estate. Self-hosting is via Docker or binary; this is a real infrastructure piece, not a small daemon, and it earns its complexity once you have multiple data systems to govern together.
For solo developers or small teams with one Postgres and one S3 bucket, this is overkill: a Hive Metastore or your DB's native catalog is enough. Larger teams with mixed engines and regions are the audience. It positions itself as an open alternative to Databricks Unity Catalog and Snowflake Polaris.
The catch: you commit to running a metadata service that becomes load-bearing for query engines. If it goes down or drifts, queries fail in confusing ways. Solid Apache project work, but a federated catalog is not a small thing to operate; budget the SRE time before you adopt.
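Because the catalog sits in the query path, a basic liveness probe is a reasonable first piece of that SRE budget. The sketch below is one minimal way to do it in Python, not an official Gravitino client. It assumes the server listens on localhost:8090 and answers a GET on /api/version; check both against your own deployment. The function name `probe` and the injectable `opener` parameter are illustrative choices, not part of the project.

```python
import urllib.error
import urllib.request

# Assumption: host, port, and path are placeholders for your own deployment.
DEFAULT_URL = "http://localhost:8090/api/version"


def probe(url=DEFAULT_URL, timeout=2.0, opener=urllib.request.urlopen):
    """Issue one HTTP GET and return (ok, detail).

    `opener` is injectable so the function can be exercised without a
    running server; by default it is urllib's urlopen.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
            # Treat any 2xx response as "service is up".
            return (200 <= status < 300, f"HTTP {status}")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return (False, f"HTTP {exc.code}")
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return (False, f"unreachable: {exc}")
```

Calling `probe()` from a cron job or an alerting hook and paging on a `False` result catches an outright outage. It does not detect metadata drift, which needs a comparison against the underlying sources.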
Free vs Self-Hosted vs Paid
Free: The full Apache project under Apache 2.0. All connectors, federation, and governance features.
Self-hosted: Docker or binary installation. You run it as a service that query engines like Trino and Spark connect to.
Paid: None from the project. Managed Gravitino offerings from third parties may emerge over time.
Completely free Apache project. No commercial tier from the project itself.
License: Apache License 2.0
Use freely. Patent grant included.
Commercial use: ✓ Yes
About
- Owner: The Apache Software Foundation (Organization)
- Stars: 3,006
- Forks: 869
Explore Further
More tools in the directory
OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
14.2k ★
starrocks
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
11.8k ★



