CDX-301f · Module 1
Advanced DNS & Network Debugging
Network debugging inside a Codex microVM requires understanding the layered enforcement model. When a connection fails, the failure could originate at any of four layers: DNS resolution (NXDOMAIN from the host resolver), TCP connection (iptables DROP or REJECT on the host TAP), TLS handshake (certificate pinning or SNI filtering), or HTTP response (proxy returns 403 Forbidden). Each layer produces a different error signature, and misdiagnosing the layer wastes debugging time.
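As a rough illustration of how distinct those signatures are, the sketch below maps a logged error message to the layer that most likely produced it. The function name `classify_error` and the matched strings are assumptions made for illustration: they are typical messages from nslookup, nc, curl, and openssl, not an exhaustive or official list.

```shell
# Hypothetical helper: guess the failing layer from an error message.
# The patterns are illustrative; extend them with the messages your tools emit.
classify_error() {
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $msg in
    *nxdomain*|*"could not resolve host"*|*"name or service not known"*)
      echo "DNS" ;;
    *"connection refused"*|*"timed out"*|*"no route to host"*)
      echo "TCP" ;;
    *certificate*|*handshake*)
      echo "TLS" ;;
    *403*|*"connection reset"*)
      echo "HTTP" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, `classify_error "curl: (7) Failed to connect to api.example.com port 443: Connection refused"` prints `TCP`. The DNS patterns are checked first so that a resolution failure is never reported as a higher-layer problem.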
The debugging sequence is bottom-up. First, check DNS: does the domain resolve? If the lookup returns NXDOMAIN, the domain is not on the allowlist, so add it or mock it. Second, check TCP: does the connection establish? If it is refused or times out, check the iptables rules on the host; a REJECT rule typically surfaces as a refusal and a DROP rule as a timeout. Third, check TLS: does the handshake complete? Certificate issues and SNI mismatches appear here. Fourth, check HTTP: does the proxy allow the request? A 403 from the proxy means the domain is allowlisted but the specific request is blocked by a more granular rule.
# Network debugging inside a Codex microVM task
# Layer 1: DNS resolution
nslookup api.example.com
# NXDOMAIN → domain not allowlisted
# Valid IP → DNS works, check next layer
# Layer 2: TCP connection
nc -zv -w 5 api.example.com 443
# Connection refused → iptables REJECT (or nothing listening on the port)
# Connection timed out → iptables DROP or TAP/routing issue
# Connected → TCP works, check TLS
# Layer 3: TLS handshake
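# </dev/null closes stdin so s_client exits after the handshake instead of waiting for input
openssl s_client -connect api.example.com:443 -servername api.example.com </dev/null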
# Verify return code: 0 (ok) → TLS works
# Certificate errors → proxy MITM or expired cert
# Layer 4: HTTP response
curl -v https://api.example.com/health
# 200 → fully working
# 403 → proxy blocking the request
# Connection reset → mid-stream filtering
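The manual sequence above can also be wrapped in a small script that stops at the first failing layer. This is a minimal sketch, not a definitive tool: the function names (`check_dns`, `check_tcp`, `check_tls`, `http_status`, `diagnose`) and the output strings are hypothetical, port 443 is hard-coded, and the DNS check assumes that `nslookup` exits non-zero when a name does not resolve.

```shell
# Sketch only: function names and output strings are hypothetical.
# Each check wraps one command from the walkthrough above.

# Layer 1: assumes nslookup exits non-zero when the name does not resolve.
check_dns() { nslookup "$1" >/dev/null 2>&1; }

# Layer 2: -z probes without sending data, -w 5 caps the wait at 5 seconds.
check_tcp() { nc -z -w 5 "$1" 443 >/dev/null 2>&1; }

# Layer 3: succeeds only if openssl reports a verified certificate chain.
check_tls() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
    grep -q 'Verify return code: 0 (ok)'
}

# Layer 4: print only the HTTP status code (curl prints 000 if no response arrived).
http_status() { curl -s -o /dev/null --max-time 10 -w '%{http_code}' "https://$1$2"; }

# Run the layers in order and stop at the first one that fails.
diagnose() {
  host=$1
  path=${2:-/}
  check_dns "$host" || { echo "DNS: $host does not resolve"; return 1; }
  check_tcp "$host" || { echo "TCP: cannot connect to $host:443"; return 1; }
  check_tls "$host" || { echo "TLS: handshake or certificate verification failed"; return 1; }
  status=$(http_status "$host" "$path" || true)
  case $status in
    2??) echo "OK: HTTP $status" ;;
    403) echo "HTTP: 403, request blocked"; return 1 ;;
    *)   echo "HTTP: unexpected status $status"; return 1 ;;
  esac
}
```

Calling `diagnose api.example.com /health` prints a single line that names the first failing layer, or a line starting with `OK` when every layer passes. Because each later check only runs once the earlier ones succeed, the output points at the lowest broken layer rather than at a downstream symptom.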
Do This
- Debug network failures bottom-up: DNS → TCP → TLS → HTTP
- Include diagnostic commands (nslookup, nc, openssl s_client, curl -v) in your AGENTS.md as allowed tools
- Log the exact error message — "connection refused" and "NXDOMAIN" have completely different fixes
Avoid This
- Assume all network failures are "the service is down" — most are policy enforcement
- Add broad allowlist entries to fix network errors without understanding which layer failed
- Debug TLS or HTTP issues when the domain does not even resolve — start at DNS