
Use the load balancing power of Endpoint Pools for canary deploys and beyond
We just shipped dead-simple Endpoint Pools, and while load-balancing between any number of replicas of a service in minutes is pretty darn cool, it's only one of the many ways you can use the concept.
Today, let's breeze through a handful more—on the docket we have:
- Migrate from Agent to Cloud Endpoints and implement the 'front door' pattern
- Derisk changing your gateway configurations with canary tests
- Deploy canary versions of entire APIs or apps
Migrate from Agent to Cloud Endpoints and the "front door" pattern
Let's start here, because if you've been using ngrok to expose a service directly on a public URL, implementing the "front door" pattern is the natural next step in making your ingress more composable, manageable, and production-grade.
What's that pattern? A Cloud Endpoint on a public URL forwards traffic to Internal Agent Endpoints using the forward-internal Traffic Policy action.

Why would you want to do that? When you loosely couple your gateway configuration and your ngrok agents, you can:
- Apply authentication, security policy, rate limiting, and more to a cloud endpoint once and have it cover all your services (there's a sketch just after this list)
- Use different Traffic Policy rules for each ngrok agent and the services they route traffic to
- Do cool things with Endpoint Pools, which I promise to show off in the next sections!
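To make that first bullet concrete, here's a sketch of what "apply it once" can look like on the cloud endpoint: a rate-limit action ahead of the forward. The specific values are illustrative, so check the Traffic Policy docs for the exact config fields:
on_http_request:
  # One rate limit at the front door covers every service behind it.
  # Values below are illustrative, not recommendations.
  - actions:
      - type: rate-limit
        config:
          name: per-client-limit
          algorithm: sliding_window
          capacity: 30
          rate: 60s
          bucket_key:
            - conn.client_ip
  # Then hand surviving traffic to the internal agent endpoints.
  - actions:
      - type: forward-internal
        config:
          url: https://service.internal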
Make the migration
Assuming you already have an ngrok agent running, stop it, add the --pooling-enabled flag to the CLI, and start it again. That creates only a brief moment of potential downtime.
ngrok http $PORT --url $YOUR_URL --pooling-enabled
In another terminal, start a second ngrok agent pointing to the same service, but this time on an internal agent endpoint, like https://service.internal.
ngrok http $PORT --url https://service.internal
Next, jump over to the Endpoints section of your ngrok dashboard, click + New Endpoint, click Cloud Endpoint, and enter the same URL as your public agent endpoint ($YOUR_URL). Click Enable pooling, then click Create Cloud Endpoint. In the Traffic Policy editor, select everything and replace it with:
on_http_request:
  - actions:
      - type: forward-internal
        config:
          url: https://service.internal
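Prefer to script that step? You can likely create the same cloud endpoint through the ngrok API; treat this as a hedged sketch, since the exact field names here (type, url, pooling_enabled, traffic_policy) are assumptions to verify against the API reference:
# Hypothetical API call: create a pooled cloud endpoint with the same
# forward-internal policy. Double-check field names against ngrok's docs.
curl -s -X POST https://api.ngrok.com/endpoints \
  -H "Authorization: Bearer $NGROK_API_KEY" \
  -H "Ngrok-Version: 2" \
  -H "Content-Type: application/json" \
  -d "{
    \"type\": \"cloud\",
    \"url\": \"$YOUR_URL\",
    \"pooling_enabled\": true,
    \"traffic_policy\": \"on_http_request:\\n  - actions:\\n      - type: forward-internal\\n        config:\\n          url: https://service.internal\"
  }"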
And your Endpoints page should look something like this:

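Want proof before you trust it? Here's a minimal terminal check, assuming your upstream answers with an HTTP 200:
# Fire a handful of requests at the pooled URL; every one should succeed,
# whether it lands on the cloud endpoint or the public agent endpoint.
for i in $(seq 1 10); do
  curl -s -o /dev/null -w "%{http_code}\n" "$YOUR_URL"
done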
Now when you hit your URL with that loop, in the browser, or with curl, your traffic is load-balanced between the cloud endpoint and the public agent endpoint, which are in the same pool. You can now close your original agent endpoint (the one running on $YOUR_URL) without any further interruption. Migration complete, with minimal mess and tons of new potential.
Including...
Derisk your gateway configuration changes
No matter which provider you might be using for your API/app gateway (I hope it's ngrok, but I get it!), one of the scariest things you'll ever do is change its configuration. One mistake means jamming that "front door" between your users and your services, and it often doesn't just affect a single service or end-user experience.
Endpoint Pools let you derisk that process and give you a path to roll back instantly.
You will need the front door pattern to try this (see the section above!), as you need your gateway configuration loosely coupled with your ngrok agents.
Copy your existing Traffic Policy configuration, create a new cloud endpoint on the same URL as your existing one, and paste in the YAML. Because ngrok will load-balance between these two cloud endpoints, each of which can have different Traffic Policy rules, this second one becomes your canary testbed.
Add a new Traffic Policy rule, like blocking bots with the req.user_agent.is_bot variable:
on_http_request:
  - expressions:
      - "req.user_agent.is_bot"
    actions:
      - type: deny
  - actions:
      - type: forward-internal
        config:
          url: https://service.internal
Hit your endpoint a few times with curl, posing as the worst offender of unnecessary AI bot traffic: GPTBot.
curl -A "GPTBot" $YOUR_URL
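If you'd rather count than eyeball, wrap that in a loop and tally the status codes (the exact code for denied requests depends on the deny action's configuration; 404 is, I believe, the default):
# Send 20 spoofed-GPTBot requests and tally status codes; expect a rough
# 50/50 split between denials and normal upstream responses.
for i in $(seq 1 20); do
  curl -s -o /dev/null -w "%{http_code}\n" -A "GPTBot" "$YOUR_URL"
done | sort | uniq -c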
If you see that roughly half your requests are denied (remember that ngrok is distributing your requests between two cloud endpoints and two Traffic Policy rules, only one of which is actively denying bot traffic), then you know your changes are ready for primetime. All you need to do is delete the old cloud endpoint.
Deploy canary versions of your APIs and apps
The same idea applies to your upstream services. If you add different versions to the same pool, no matter where they run, ngrok load-balances between them. That lets you deploy major changes more safely than a hard cut-over, and if you start to realize something is wrong with your new deployment, you can close down the second endpoint to roll everything back to the last working state.
You have a few ways of doing these per-service canary deployments.
First, you can use public agent endpoints. Just run them on different ports on the same machine, or on different machines entirely, to test both versions in equal measure.
# v1
ngrok http 8080 --url $YOUR_URL --pooling-enabled
# v2
ngrok http 8081 --url $YOUR_URL --pooling-enabled
Second, use the front door pattern, where a cloud endpoint forwards to pooled replicas of https://service.internal.
# v1
ngrok http 8080 --url https://service.internal --pooling-enabled
# v2
ngrok http 8081 --url https://service.internal --pooling-enabled
Third—sorry, but you'll have to wait for the custom strategies drop. If you want access before it goes GA, jump over to the Early Access page in the ngrok dash and find Traffic Policy - Advanced Load Balancing.
What else are pools good for?
As summer bears down on us in the northern hemisphere (doubly so for me in Arizona), plenty.
Oh, right, endpoint pools... well, here's where you come in. Once you've had time to play around with pooling, jump into our Discord server and let us know what you're up to and how it works. We want to hear success stories, product feedback, papercuts, and everything in between.
Until then, check out our pooling resources:
- Docs: Endpoint Pools
- YouTube video: Load balance anything, anywhere with ngrok's Endpoint Pools
- Guide: Load Balancing Between Multiple Kubernetes Clusters
- Another guide: Load Balancing Between Services Deployed in Kubernetes
- And yet another: Load Balancing Between Multiple Clouds