SSRF Guard
A Server-Side Request Forgery guard is the layer that sits between
user-supplied URLs and any outbound HTTP your service makes. Without it,
customers can probe your internal network using your own API as a proxy: ask
for http://10.0.0.5/admin or http://169.254.169.254/latest/meta-data/iam/
and your service happily fetches them — from inside your VPC.
This is the non-negotiable layer in url-intel (and any future product that fetches user URLs).
The threat model
A user supplies a URL. Your service:
- Resolves the hostname → IP
- Opens a TCP connection
- Sends HTTP, follows redirects
- Returns the body (or a screenshot of it) to the user
Each step is an attack surface:
| Step | Attack |
|---|---|
| Hostname → IP | DNS rebinding: TTL=0 record resolves to 1.2.3.4 during validation, then to 127.0.0.1 on the actual dial |
| IP dial | Direct private IP (10.0.0.5), loopback (127.x), link-local (169.254.x), IPv6 ULA (fc00::/7), cloud metadata (169.254.169.254, fd00:ec2::254) |
| Redirects | Public URL → 302 → private URL; only validating the first hop misses this |
| Schemes | file:///etc/passwd, data:, gopher:// — used for in-process file reads and protocol smuggling |
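The classic prize: the cloud metadata service at 169.254.169.254 (AWS, GCP,
and Azure all use this address) can hand out credentials for whatever role the
instance has. One unguarded URL fetch can escalate to cloud account takeover,
limited only by what that role is allowed to do.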
The guard, in layers
1. Pre-flight URL validation
✓ Parse with net/url
✓ Scheme ∈ {http, https}
✓ No userinfo (no `user:pass@` smuggling)
✓ Host present, no IDN homograph weirdness
✓ Port (if present) ∈ allowlist or in normal {80,443,...}
If any check fails, return 400 without ever resolving the hostname.
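The checklist above could be sketched roughly as follows. This is a minimal illustration, not the url-intel implementation: `allowedPorts` and `ValidateString` are hypothetical names, the port allowlist is an assumption, and rejecting every non-ASCII host byte is a deliberately strict stand-in for real IDN handling (punycode `xn--` hosts still pass).

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// allowedPorts is an assumed allowlist; "" means no explicit port was given.
var allowedPorts = map[string]bool{"": true, "80": true, "443": true}

// Validate applies the URL-level checks without ever touching DNS.
func Validate(u *url.URL) error {
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("safeurl: scheme %q not allowed", u.Scheme)
	}
	if u.User != nil {
		return errors.New("safeurl: userinfo not allowed")
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("safeurl: missing host")
	}
	for i := 0; i < len(host); i++ {
		if host[i] >= 0x80 { // non-ASCII byte: possible IDN homograph
			return errors.New("safeurl: non-ASCII host not allowed")
		}
	}
	if !allowedPorts[u.Port()] {
		return fmt.Errorf("safeurl: port %q not allowed", u.Port())
	}
	return nil
}

// ValidateString parses raw and validates it (hypothetical convenience wrapper).
func ValidateString(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("safeurl: parse: %w", err)
	}
	return Validate(u)
}

func main() {
	for _, raw := range []string{
		"https://example.com/page",
		"file:///etc/passwd",
		"http://user:pass@example.com/",
		"http://example.com:6379/",
	} {
		fmt.Printf("%-32s -> %v\n", raw, ValidateString(raw))
	}
}
```

A handler would map any non-nil error from `Validate` to a 400 response before any lookup or dial happens.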
2. Custom dialer that re-resolves DNS and checks the IP
The standard http.Transport will not protect you. Replace its DialContext
with one that:
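func guardedDialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	ips, err := net.DefaultResolver.LookupIP(ctx, "ip", host) // errors if nothing resolves
	if err != nil {
		return nil, err
	}
	for _, ip := range ips {
		if isBlocked(ip) { // one blocked address rejects the whole host
			return nil, ErrBlockedAddress
		}
	}
	return (&net.Dialer{}).DialContext(ctx, network, net.JoinHostPort(ips[0].String(), port))
}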
Pin the dial to the specific IP you just validated — don’t let net.Dial
re-resolve and pick a different one (the DNS rebinding window).
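An alternative worth knowing, named plainly because it is a different technique from the dialer above: `net.Dialer` has a `Control` hook that runs after DNS resolution and before each connection attempt, so it sees the IP literal actually being dialed. The sketch below assumes that approach; `coarseBlocked`, `safeControl`, and `newGuardedDialer` are hypothetical names, and `coarseBlocked` is a simplified stand-in that does not cover the whole block list (for example, it misses 100.64.0.0/10 and 240.0.0.0/4).

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/netip"
	"syscall"
	"time"
)

var ErrBlockedAddress = errors.New("safeurl: blocked address")

// coarseBlocked is a simplified stand-in for the full block list.
func coarseBlocked(a netip.Addr) bool {
	a = a.Unmap() // treat ::ffff:a.b.c.d as plain a.b.c.d
	return !a.IsValid() || a.IsLoopback() || a.IsPrivate() || a.IsUnspecified() ||
		a.IsLinkLocalUnicast() || a.IsLinkLocalMulticast() || a.IsMulticast()
}

// safeControl runs once per connection attempt, after resolution, so
// address is an IP:port literal rather than a hostname.
func safeControl(network, address string, _ syscall.RawConn) error {
	host, _, err := net.SplitHostPort(address)
	if err != nil {
		return err
	}
	addr, err := netip.ParseAddr(host)
	if err != nil {
		return fmt.Errorf("safeurl: unexpected dial address %q: %w", address, err)
	}
	if coarseBlocked(addr) {
		return fmt.Errorf("%w: %s", ErrBlockedAddress, addr)
	}
	return nil
}

// newGuardedDialer returns a dialer that vetoes blocked addresses at dial time.
func newGuardedDialer() *net.Dialer {
	return &net.Dialer{Timeout: 5 * time.Second, Control: safeControl}
}

func main() {
	_ = newGuardedDialer() // plug its DialContext into http.Transport
	fmt.Println(safeControl("tcp4", "169.254.169.254:80", nil))
	fmt.Println(safeControl("tcp4", "93.184.216.34:443", nil))
}
```

Because the hook fires for every address the dialer tries, it also covers fallback attempts to a second resolved address, which a check on the first lookup result alone would not see.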
3. Block list (treat conservatively — allow nothing private)
| Range | What it is |
|---|---|
| 127.0.0.0/8, ::1/128 | Loopback |
| 10.0.0.0/8 | RFC1918 private |
| 172.16.0.0/12 | RFC1918 private |
| 192.168.0.0/16 | RFC1918 private |
| 100.64.0.0/10 | Carrier-grade NAT |
| 169.254.0.0/16, fe80::/10 | Link-local (includes AWS/GCP/Azure metadata) |
| fd00:ec2::/32 | AWS IPv6 metadata |
| fc00::/7 | IPv6 Unique Local Address |
| ::ffff:0:0/96 | IPv4-mapped IPv6 (catch ::ffff:10.0.0.5) |
| 0.0.0.0/8, ::/128 | "This network" / unspecified |
| 224.0.0.0/4, ff00::/8 | Multicast |
| 240.0.0.0/4 | Reserved future use |
For IPv6: also collapse to canonical form before checking. ::ffff:127.0.0.1
is loopback wearing an IPv6 hat.
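One way to encode the table is with `net/netip` prefixes, sketched below under a few assumptions. `blockedPrefixes`, `mustPrefixes`, and `isBlockedString` are hypothetical names. The IPv4-mapped range is handled by unmapping the address before matching rather than by a prefix entry, and fd00:ec2::/32 needs no entry of its own because it falls inside fc00::/7. The zone is stripped because `netip.Prefix.Contains` never matches a zoned address.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// blockedPrefixes mirrors the block list table.
var blockedPrefixes = mustPrefixes(
	"127.0.0.0/8", "::1/128", // loopback
	"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", // RFC1918
	"100.64.0.0/10",               // carrier-grade NAT
	"169.254.0.0/16", "fe80::/10", // link-local, incl. cloud metadata
	"fc00::/7",          // ULA, also covers fd00:ec2::/32
	"0.0.0.0/8", "::/128", // "this network" / unspecified
	"224.0.0.0/4", "ff00::/8", // multicast
	"240.0.0.0/4", // reserved
)

func mustPrefixes(cidrs ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		out = append(out, netip.MustParsePrefix(c))
	}
	return out
}

// isBlockedAddr canonicalizes addr, then matches it against the list.
func isBlockedAddr(addr netip.Addr) bool {
	addr = addr.Unmap().WithZone("") // ::ffff:a.b.c.d -> a.b.c.d, drop %zone
	if !addr.IsValid() {
		return true // fail closed
	}
	for _, p := range blockedPrefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// isBlocked adapts the check to the net.IP values returned by LookupIP.
func isBlocked(ip net.IP) bool {
	addr, ok := netip.AddrFromSlice(ip)
	return !ok || isBlockedAddr(addr)
}

// isBlockedString is a hypothetical helper for trying literals by hand.
func isBlockedString(s string) bool {
	addr, err := netip.ParseAddr(s)
	return err != nil || isBlockedAddr(addr)
}

func main() {
	for _, s := range []string{"::ffff:127.0.0.1", "169.254.169.254", "172.32.0.1"} {
		fmt.Println(s, isBlockedString(s))
	}
	fmt.Println(isBlocked(net.ParseIP("10.0.0.5")))
}
```

Anything that fails to parse is treated as blocked, which keeps the guard failing closed.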
4. Re-validate every redirect
net/http’s CheckRedirect runs before each follow. Re-validate the
destination URL through the same pre-flight + dialer guard. Cap total redirects
(5 is plenty). The dial-time IP check catches anything the URL-level check
misses.
5. Bounded resource limits
Per-request budget: total timeout (30s), max response bytes (cap before streaming further), max redirect count (5). SSRF often pairs with denial-of-service: a 100MB response to a tiny request is still an attack even if the target is public.
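The byte cap can be sketched with `io.LimitReader`. `readCapped`, `cappedLen`, and `ErrTooLarge` are hypothetical names; in real use the reader would be `resp.Body`, and the total timeout would come from `http.Client.Timeout`, which also covers reading the body.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// ErrTooLarge reports a body that exceeded the byte cap (hypothetical name).
var ErrTooLarge = errors.New("safeurl: response exceeds size cap")

// readCapped reads at most limit bytes from r and fails if more remain.
func readCapped(r io.Reader, limit int64) ([]byte, error) {
	// Ask for one extra byte so an over-limit body is detectable
	// without reading the rest of the stream.
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, ErrTooLarge
	}
	return data, nil
}

// cappedLen runs readCapped over an in-memory body, for demonstration only.
func cappedLen(body string, limit int64) (int, error) {
	data, err := readCapped(strings.NewReader(body), limit)
	return len(data), err
}

func main() {
	n, err := cappedLen("hello", 5)
	fmt.Println(n, err)
	n, err = cappedLen("hello world", 5)
	fmt.Println(n, err)
}
```

Reading one byte past the limit is what separates a body of exactly the allowed size from one that was truncated.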
Why “just block private ranges” isn’t enough
- DNS rebinding defeats URL-level checks if you don’t pin the dial IP.
- IPv4-in-IPv6 (::ffff:10.0.0.5) is private, so make sure your check sees it.
- CNAME tricks: evil.com CNAME → internal.svc.cluster.local. Re-resolution catches this only if you check the resolved IP, not the hostname.
- Redirect chains: public → public → public → private. If you only check the first hop, you’ve lost.
- Cloud metadata is link-local, not RFC1918. Many homemade guards miss it because they only ban 10/8 + 192.168.
Test matrix (mandatory in url-intel)
safeurl is the one module that must have thorough table-driven tests:
| Case | Input | Expected |
|---|---|---|
| loopback v4 | http://127.0.0.1/ | reject |
| loopback v6 | http://[::1]/ | reject |
| RFC1918 10/8 | http://10.0.0.5/ | reject |
| RFC1918 172.16/12 | http://172.20.1.1/ | reject |
| RFC1918 192.168/16 | http://192.168.1.1/ | reject |
| link-local v4 | http://169.254.1.1/ | reject |
| AWS metadata | http://169.254.169.254/ | reject |
| IPv6 ULA | http://[fc00::1]/ | reject |
| IPv4-in-IPv6 loopback | http://[::ffff:127.0.0.1] | reject |
| file scheme | file:///etc/passwd | reject |
| data scheme | data:text/plain,foo | reject |
| gopher scheme | gopher://example.com/ | reject |
| redirect → private | 302 to http://10.0.0.5 | reject |
| redirect chain | 5 hops, last is private | reject |
| DNS rebind | TTL=0 host flips after first lookup | reject (pin IP) |
| unicode host | http://еxample.com/ | 400 or normalize |
| huge response | 10GB stream | cap + reject |
| slow loris | 1 byte/sec | timeout |
Implementation in Go (sketch)
import (
	"errors"
	"net/http"
	"time"
)

// NewSafeTransport returns a transport whose every dial goes through the guard.
func NewSafeTransport() *http.Transport {
	return &http.Transport{
		DialContext:           guardedDialContext,
		TLSHandshakeTimeout:   5 * time.Second,
		ResponseHeaderTimeout: 10 * time.Second,
		DisableKeepAlives:     true, // simpler reasoning per request
	}
}

// NewSafeClient wraps the guarded transport and re-validates every redirect hop.
func NewSafeClient(timeout time.Duration) *http.Client {
	return &http.Client{
		Transport: NewSafeTransport(),
		Timeout:   timeout, // covers connect, redirects, and reading the body
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if len(via) >= 5 {
				return errors.New("safeurl: too many redirects")
			}
			return Validate(req.URL) // same pre-flight as the entry URL
		},
	}
}
chromedp (headless Chrome) needs its own equivalent: hook into the network
domain via network.SetRequestInterception (which the DevTools Protocol marks as
deprecated in favor of the Fetch domain) or set an HTTP proxy that itself
enforces the guard. Letting Chrome dial freely defeats the whole layer.
When this matters
Any service that fetches user-supplied URLs:
- URL preview / link-unfurl services (Slack, Discord, …)
- Screenshot / PDF / scraping APIs (this is url-intel/overview)
- OG-image fetchers for social cards
- Webhook receivers that follow callback URLs
- Importers (RSS readers, blog migrators)
- AI agents with a “fetch URL” tool — easily missed because it’s “just a tool”
None of this applies if you only fetch URLs you control. The instant a user can influence the URL, you need the guard.
Related
- url-intel/overview — the canonical implementation site
- headless-browser-pool — pairs with this: the renderer needs guarded dials too
- marketplace-distribution — when you sell URL-fetching APIs, the guard is the product’s licence to operate; one news article about your service being used for internal probing and you’re delisted