New
We're excited to announce the public release of Bifrost, the fastest and most scalable LLM gateway out there. We engineered Bifrost specifically for high-throughput, production-grade AI systems and optimized performance at every level:
🔸 ~0 heap allocations during live requests (configurable; see the pooling sketch after this list)
🔸 Actor pattern to avoid fetching the config at request time (sketched after this list)
🔸 Full use of Go’s concurrency primitives
🔸 Lightweight plugin system to keep the core minimal (sketched after this list)
🔸 Support for multiple transport protocols (HTTP, gRPC)
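Near-zero allocation on a hot path is usually achieved in Go through object pooling. Here's a minimal sketch using `sync.Pool` to recycle per-request scratch objects; the `RequestContext` type and its fields are our own illustration of the technique, not Bifrost's actual internals:

```go
package main

import (
	"fmt"
	"sync"
)

// RequestContext holds per-request scratch state. Pooling and reusing
// these objects keeps steady-state traffic from allocating on the heap.
// (Illustrative type, not Bifrost's real request struct.)
type RequestContext struct {
	Model string
	Body  []byte // reused backing buffer
}

// ctxPool recycles RequestContext values across requests.
var ctxPool = sync.Pool{
	New: func() any {
		return &RequestContext{Body: make([]byte, 0, 64<<10)} // 64 KiB scratch
	},
}

func handleRequest(model string, payload []byte) {
	ctx := ctxPool.Get().(*RequestContext)
	defer func() {
		// Reset and return the object instead of letting it become garbage.
		ctx.Model = ""
		ctx.Body = ctx.Body[:0] // keep capacity, drop contents
		ctxPool.Put(ctx)
	}()

	ctx.Model = model
	ctx.Body = append(ctx.Body, payload...) // reuses pooled capacity

	fmt.Printf("routing %d bytes to %s\n", len(ctx.Body), ctx.Model)
}

func main() {
	handleRequest("gpt-4o", []byte(`{"prompt":"hi"}`))
	handleRequest("claude-3", []byte(`{"prompt":"hello"}`))
}
```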
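The actor idea is that one goroutine owns the mutable configuration and publishes immutable snapshots, so request handlers never query a config store or take a lock at request time. Below is a hedged sketch of that pattern, assuming a mailbox channel plus `atomic.Pointer`; the `ConfigActor` type and its method names are hypothetical, not Bifrost's API:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Config is an immutable snapshot; handlers read it without locks.
type Config struct {
	Provider string
	APIKey   string
}

// ConfigActor owns the mutable config. All writes go through its
// mailbox (a channel), so there is exactly one writer goroutine.
type ConfigActor struct {
	mailbox chan Config
	current atomic.Pointer[Config]
}

func NewConfigActor(initial Config) *ConfigActor {
	a := &ConfigActor{mailbox: make(chan Config)}
	a.current.Store(&initial)
	go a.run() // the actor loop: sole owner of writes
	return a
}

func (a *ConfigActor) run() {
	for next := range a.mailbox {
		snapshot := next
		a.current.Store(&snapshot) // publish a fresh immutable snapshot
	}
}

// Update sends a new config to the actor's mailbox.
func (a *ConfigActor) Update(c Config) { a.mailbox <- c }

// Snapshot is what the request path calls: a single atomic load,
// no channel round-trip, no mutex, no config store lookup.
func (a *ConfigActor) Snapshot() *Config { return a.current.Load() }

func main() {
	actor := NewConfigActor(Config{Provider: "openai", APIKey: "k1"})
	fmt.Println(actor.Snapshot().Provider) // request-time read

	actor.Update(Config{Provider: "anthropic", APIKey: "k2"})
	// Once the actor processes the update, new requests see the new snapshot.
}
```

The payoff is that the request path does one atomic pointer load instead of a lookup or lock, which is also what keeps it allocation-free.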
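And a lightweight plugin system typically reduces to a small hook interface the core threads requests through, so features live outside the gateway itself. The sketch below is a guess at that shape under those assumptions, not Bifrost's published plugin API; `Plugin`, `PreHook`, and `runPlugins` are illustrative names:

```go
package main

import "fmt"

// Request is a simplified stand-in for a gateway request.
type Request struct {
	Model  string
	Prompt string
}

// Plugin is a hypothetical hook interface: plugins observe or mutate
// requests before they reach a provider, keeping the core gateway thin.
type Plugin interface {
	Name() string
	PreHook(r *Request) error
}

// loggingPlugin is a trivial example plugin.
type loggingPlugin struct{}

func (loggingPlugin) Name() string { return "logging" }
func (loggingPlugin) PreHook(r *Request) error {
	fmt.Printf("[logging] model=%s\n", r.Model)
	return nil
}

// runPlugins threads a request through each registered plugin in order,
// stopping at the first error.
func runPlugins(plugins []Plugin, r *Request) error {
	for _, p := range plugins {
		if err := p.PreHook(r); err != nil {
			return fmt.Errorf("plugin %s: %w", p.Name(), err)
		}
	}
	return nil
}

func main() {
	r := &Request{Model: "gpt-4o", Prompt: "hello"}
	if err := runPlugins([]Plugin{loggingPlugin{}}, r); err != nil {
		fmt.Println("plugin error:", err)
	}
}
```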
And it’s:
🔹 Open source
🔹 Written in pure Go (A+ code quality report)
🔹 40x lower overhead (based on LiteLLM’s published benchmarks)
🔹 9.5x faster than LiteLLM, with ~54x lower P99 latency and 68% less memory usage
🔹 Built-in Prometheus observability
🔹 Plugin store for easy extensibility
Check out our GitHub repo to get started, and read more about Bifrost's benchmarks on our blog.
