
HIKYAKU
The privacy-first AI Gateway
Your prompts, delivered.
One endpoint between your AI applications and any number of model backends — OpenAI, Anthropic, on-prem vLLM, Ollama, multimodal, speech, vision, embeddings. Routing, failover, prefix-cache-aware affinity, traffic-driven health checks. The operational machinery to run inference infrastructure in production.
Written in Go. Ships as a 7MB scratch container. Non-root, no shell, no libc. Configured via YAML, hot-reloaded via SIGHUP.
Three things make it different
Privacy as architecture
Hardened builds elide prompt-handling code paths at compile time. Capture is opt-in and bounded — never on by default.
Go-native performance
Several thousand concurrent requests on a 2-core machine. Memory bounds under sustained agentic load. Certified performance report per release.
Built to global banking privacy standards
Designed against the audit bar serious financial institutions are held to. Not retrofitted to it.
The name comes from the historical Japanese courier system — the hikyaku (飛脚) ran messages across the country, fast and reliably, without reading them. A courier delivers. A courier doesn’t make a copy for the archive on the way through.
Be notified
Drop your email. We’ll send one note when there’s something worth telling you about. Not before.
OSS core → github.com/wentbackward/hikyaku