DNS Performance
Optimize DNS resolution in Go, implement caching strategies, and reduce lookup latency in production services
Understanding DNS Resolution Overhead
DNS lookups convert domain names (such as example.com) into IP addresses. A lookup that is not answered from a nearby cache has to query one or more nameservers, which typically adds 10-200ms of latency. In a service making thousands of outbound requests, that overhead compounds quickly.
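To see how much of a real HTTP request is spent in DNS, the standard net/http/httptrace package exposes DNSStart and DNSDone hooks. The following is a minimal sketch under stated assumptions, not a definitive implementation: dnsTimer, attach, simulateLookup, and demoDNSTimer are illustrative names that do not come from the original text, and the simulated lookup only stands in for the transport calling the hooks.

```go
package main

import (
	"net/http"
	"net/http/httptrace"
	"sync"
	"time"
)

// dnsTimer accumulates the time between the DNSStart and DNSDone hooks.
// Illustrative type; use one dnsTimer per request, because a single
// start timestamp cannot track overlapping lookups.
type dnsTimer struct {
	mu      sync.Mutex
	started time.Time
	total   time.Duration
	lookups int
}

// trace returns hooks that the HTTP transport calls around each DNS lookup.
func (t *dnsTimer) trace() *httptrace.ClientTrace {
	return &httptrace.ClientTrace{
		DNSStart: func(httptrace.DNSStartInfo) {
			t.mu.Lock()
			t.started = time.Now()
			t.mu.Unlock()
		},
		DNSDone: func(httptrace.DNSDoneInfo) {
			t.mu.Lock()
			t.total += time.Since(t.started)
			t.lookups++
			t.mu.Unlock()
		},
	}
}

// attach returns a copy of req whose context carries the DNS hooks.
func (t *dnsTimer) attach(req *http.Request) *http.Request {
	return req.WithContext(httptrace.WithClientTrace(req.Context(), t.trace()))
}

// simulateLookup drives the hooks by hand so the sketch can be exercised
// without network access; in a real request the transport calls them.
func simulateLookup(t *dnsTimer, d time.Duration) {
	tr := t.trace()
	tr.DNSStart(httptrace.DNSStartInfo{Host: "example.com"})
	time.Sleep(d)
	tr.DNSDone(httptrace.DNSDoneInfo{})
}

// demoDNSTimer records one simulated 5ms lookup and reports what was measured.
func demoDNSTimer() (lookups int, atLeast5ms bool) {
	t := &dnsTimer{}
	simulateLookup(t, 5*time.Millisecond)
	return t.lookups, t.total >= 5*time.Millisecond
}
```

In a real service, a request would be wrapped with timer.attach(req) before being passed to an http.Client. The hooks do not fire when a pooled connection is reused, which is itself a useful signal that no lookup was paid for.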
DNS Lookup Latency
package main
import (
	"context"
	"fmt"
	"net"
	"testing"
	"time"
)
func measureDNSLatency() {
	hosts := []string{
		"example.com",
		"google.com",
		"cloudflare.com",
	}
	resolver := &net.Resolver{}
	for _, host := range hosts {
		start := time.Now()
		ips, err := resolver.LookupIPAddr(context.Background(), host)
		latency := time.Since(start)
		if err != nil {
			fmt.Printf("%s: lookup failed after %v: %v\n", host, latency, err)
			continue
		}
		fmt.Printf("%s: %v (%v)\n", host, ips, latency)
	}
	// Typical results:
	// example.com: 25ms
	// google.com: 15ms
	// cloudflare.com: 30ms
	// Average: 20-50ms per lookup
}
// Benchmark: DNS lookup cost
func BenchmarkDNSLookup(b *testing.B) {
	for i := 0; i < b.N; i++ {
		net.LookupIP("example.com")
	}
	// Illustrative: around 20 microseconds per op (~50k ops/sec) is only
	// reachable when a local caching resolver answers the query.
	// A lookup that has to cross the network costs 20-50ms.
}
Go's DNS Resolution: Pure Go vs CGO
Go offers two DNS resolver implementations:
Pure Go Resolver (Default)
The pure Go resolver runs in the same process, using UDP queries directly.
import (
	"net"
	"os"
)
func usePureGoResolver() {
	// The net package reads GODEBUG when the resolver is first used, so this
	// only works before the first lookup. Exporting GODEBUG=netdns=go in the
	// process environment is more reliable.
	os.Setenv("GODEBUG", "netdns=go")
	// Or prefer it in code
	net.DefaultResolver = &net.Resolver{
		PreferGo: true,
	}
	// Pure Go resolver characteristics:
	// Pros:
	// - No CGO overhead (faster, no cgo locks)
	// - A lookup blocks only its goroutine, not an OS thread
	// - Honors Resolver.Dial, so DNS connections can be customized
	//
	// Cons:
	// - Implements only part of what libc supports in /etc/resolv.conf
	//   and /etc/nsswitch.conf
	// - No NSS plugins (for example mDNS or LDAP host sources)
	// - Does not cache responses
}
CGO Resolver (System)
The CGO resolver uses the system's libc resolver.
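func useCGOResolver() {
	// Request the cgo resolver (needs a cgo-enabled build). As with
	// netdns=go, set this before the first lookup or in the process environment.
	os.Setenv("GODEBUG", "netdns=cgo")
	// Note: PreferGo: false does not force cgo. It leaves the choice to
	// Go's default selection logic, which may still pick the pure Go resolver.
	net.DefaultResolver = &net.Resolver{
		PreferGo: false,
	}
	// CGO resolver characteristics:
	// Pros:
	// - Uses the system resolver and its full configuration
	// - Respects /etc/hosts and nsswitch.conf exactly as libc does
	// - Behaves like other programs on the same host
	//
	// Cons:
	// - CGO overhead (function call marshalling)
	// - Each in-flight lookup occupies an OS thread
	// - Slower for high-concurrency workloads
}
Benchmark: Pure Go vs CGO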
import (
	"context"
	"net"
	"testing"
	"time"
)
func BenchmarkDNSResolvers(b *testing.B) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	b.Run("PureGo", func(b *testing.B) {
		resolver := &net.Resolver{PreferGo: true}
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			resolver.LookupIPAddr(ctx, "example.com")
		}
		// Illustrative: ~100k ops/sec (lookups do not pin OS threads)
	})
	b.Run("CGO", func(b *testing.B) {
		// PreferGo: false leaves the choice to Go's defaults; run with
		// GODEBUG=netdns=cgo to make sure the cgo resolver is used.
		resolver := &net.Resolver{PreferGo: false}
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			resolver.LookupIPAddr(ctx, "example.com")
		}
		// Illustrative: ~50k ops/sec (each lookup occupies an OS thread)
	})
}
Use the pure Go resolver (PreferGo: true) for high-concurrency services. It avoids CGO overhead and allows goroutines to continue running during DNS lookups.
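One practical consequence of choosing the pure Go resolver is that its Dial hook is honored, so DNS traffic can be pointed at a specific server instead of whatever the system configuration lists. The following is a minimal sketch under that assumption; newResolverFor and demoDialTarget are illustrative names, and the demo uses only a loopback UDP socket rather than a real DNS server.

```go
package main

import (
	"context"
	"net"
	"time"
)

// newResolverFor returns a resolver whose DNS traffic always goes to
// dnsServer (a "host:port" string), regardless of the system configuration.
// The Dial hook is only consulted by the pure Go resolver, hence PreferGo.
func newResolverFor(dnsServer string) *net.Resolver {
	dialer := &net.Dialer{Timeout: 2 * time.Second}
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			// Ignore the address Go picked from the system configuration
			// and connect to the chosen server instead.
			return dialer.DialContext(ctx, network, dnsServer)
		},
	}
}

// demoDialTarget checks, using only a loopback UDP socket, that the hook
// redirects connections to the configured server.
func demoDialTarget() bool {
	pc, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return false
	}
	defer pc.Close()

	r := newResolverFor(pc.LocalAddr().String())
	// 192.0.2.53:53 is a documentation address standing in for the
	// server the system configuration would have suggested.
	conn, err := r.Dial(context.Background(), "udp", "192.0.2.53:53")
	if err != nil {
		return false
	}
	defer conn.Close()
	return conn.RemoteAddr().String() == pc.LocalAddr().String()
}
```

A service would call something like newResolverFor("10.0.0.2:53"), where the address is a hypothetical internal DNS server, and pass the result to net.Dialer.Resolver.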
net.Resolver Configuration
Customize DNS behavior with net.Resolver:
import (
	"context"
	"fmt"
	"net"
	"time"
)
func createCustomResolver() *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // Use pure Go resolver; Dial is only honored by it
		// Dial controls how connections to the DNS server are opened.
		// address is the server Go chose from the system configuration.
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			dialer := &net.Dialer{
				Timeout: 5 * time.Second,
				// KeepAlive only affects TCP connections to the DNS server
				KeepAlive: 30 * time.Second,
			}
			return dialer.DialContext(ctx, network, address)
		},
	}
}
func resolveWithCustomSettings() error {
	resolver := createCustomResolver()
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	ips, err := resolver.LookupIPAddr(ctx, "example.com")
	if err != nil {
		return err
	}
	// Use ips...
	fmt.Println(ips)
	return nil
}
DNS Caching: Go Doesn't Do It By Default
Go does not cache DNS responses by default. Each lookup queries the system resolver (or DNS server). Implementing application-level caching is essential for high-throughput services.
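One detail that matters once application-level caching is in place: when an entry is missing or expired, many goroutines can miss at the same moment and each issue its own lookup. The following is a hedged sketch of a cache that coalesces concurrent misses for the same host into a single upstream lookup. All names (coalescingCache, lookupFunc, demoCoalescing) are illustrative, the fixed TTL is an assumption, and the lookup function is injectable so the example runs without real DNS.

```go
package main

import (
	"sync"
	"sync/atomic"
	"time"
)

// lookupFunc abstracts the real resolver (for example a wrapper around
// net.Resolver.LookupHost) so the cache can be exercised offline.
type lookupFunc func(host string) ([]string, error)

type cacheEntry struct {
	addrs   []string
	expires time.Time
}

// inflight represents a lookup that is currently running.
type inflight struct {
	done  chan struct{} // closed when the lookup finishes
	addrs []string
	err   error
}

type coalescingCache struct {
	lookup  lookupFunc
	ttl     time.Duration
	mu      sync.Mutex
	entries map[string]cacheEntry
	pending map[string]*inflight
}

func newCoalescingCache(lookup lookupFunc, ttl time.Duration) *coalescingCache {
	return &coalescingCache{
		lookup:  lookup,
		ttl:     ttl,
		entries: make(map[string]cacheEntry),
		pending: make(map[string]*inflight),
	}
}

// Resolve returns cached addresses, joins a lookup that is already in
// flight, or starts a new one. Callers must not modify the returned slice.
func (c *coalescingCache) Resolve(host string) ([]string, error) {
	c.mu.Lock()
	if e, ok := c.entries[host]; ok && time.Now().Before(e.expires) {
		c.mu.Unlock()
		return e.addrs, nil // fresh cache hit
	}
	if call, ok := c.pending[host]; ok {
		c.mu.Unlock()
		<-call.done // another goroutine is already resolving this host
		return call.addrs, call.err
	}
	call := &inflight{done: make(chan struct{})}
	c.pending[host] = call
	c.mu.Unlock()

	call.addrs, call.err = c.lookup(host) // performed without holding the lock

	c.mu.Lock()
	delete(c.pending, host)
	if call.err == nil {
		c.entries[host] = cacheEntry{addrs: call.addrs, expires: time.Now().Add(c.ttl)}
	}
	c.mu.Unlock()
	close(call.done)
	return call.addrs, call.err
}

// demoCoalescing issues 50 concurrent lookups for one host against a slow
// fake resolver and returns how many upstream lookups were made.
func demoCoalescing() int64 {
	var upstream int64
	cache := newCoalescingCache(func(string) ([]string, error) {
		atomic.AddInt64(&upstream, 1)
		time.Sleep(20 * time.Millisecond) // simulated network latency
		return []string{"192.0.2.10"}, nil
	}, time.Minute)

	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Resolve("api.example.com")
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&upstream)
}
```

The pending entry is removed and the cache entry is stored under the same lock, so a caller always sees either the in-flight lookup or its cached result. Errors are deliberately not cached here; whether to cache failures briefly is a separate design choice.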
Simple In-Memory DNS Cache
import (
	"fmt"
	"net"
	"sync"
	"time"
)
type DNSCache struct {
	cache map[string]*DNSEntry
	mu    sync.RWMutex
}
type DNSEntry struct {
	IPs       []net.IP
	ExpiresAt time.Time
}
func (c *DNSCache) Resolve(host string) ([]net.IP, error) {
	c.mu.RLock()
	entry, ok := c.cache[host]
	c.mu.RUnlock()
	// Check if cached entry is still valid
	if ok && entry.ExpiresAt.After(time.Now()) {
		return entry.IPs, nil
	}
	// Perform actual DNS lookup
	ips, err := net.LookupIP(host)
	if err != nil {
		return nil, err
	}
	// Cache with a fixed 5-minute lifetime (the record's real TTL is not available here)
	c.mu.Lock()
	c.cache[host] = &DNSEntry{
		IPs:       ips,
		ExpiresAt: time.Now().Add(5 * time.Minute),
	}
	c.mu.Unlock()
	return ips, nil
}
// Usage
func main() {
	cache := &DNSCache{cache: make(map[string]*DNSEntry)}
	// First call: DNS lookup (~50ms)
	ips, _ := cache.Resolve("example.com")
	// Second call: Cache hit (<1ms)
	ips, _ = cache.Resolve("example.com")
	fmt.Println(ips)
}
TTL-Aware Caching
DNS responses include TTL (Time To Live), indicating how long to cache:
import (
	"context"
	"net"
	"sync"
	"time"
)
type TTLAwareDNSCache struct {
	resolver *net.Resolver
	cache    map[string]*CachedDNS
	mu       sync.RWMutex
}
type CachedDNS struct {
	IPs      []string
	TTL      time.Duration
	CachedAt time.Time
}
func (c *TTLAwareDNSCache) LookupIP(ctx context.Context, host string) ([]string, error) {
	c.mu.RLock()
	entry, ok := c.cache[host]
	c.mu.RUnlock()
	if ok && time.Since(entry.CachedAt) < entry.TTL {
		return entry.IPs, nil
	}
	// Resolve the host's address records
	ips, err := c.resolver.LookupHost(ctx, host)
	if err != nil {
		return nil, err
	}
	// net.Resolver does not expose the TTL of the records it returns, so
	// use a conservative fixed value. Honoring the real TTL requires a DNS
	// client that returns raw records, such as github.com/miekg/dns.
	ttl := 5 * time.Minute
	c.mu.Lock()
	c.cache[host] = &CachedDNS{
		IPs:      ips,
		TTL:      ttl,
		CachedAt: time.Now(),
	}
	c.mu.Unlock()
	return ips, nil
}
DNS Prefetching
Resolve DNS before you need it:
import (
	"context"
	"net"
	"time"
)
func prefetchDNS(hosts []string) map[string][]net.IP {
	resolver := &net.Resolver{PreferGo: true}
	results := make(map[string][]net.IP)
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	for _, host := range hosts {
		// LookupIP returns []net.IP (LookupIPAddr returns []net.IPAddr)
		ips, err := resolver.LookupIP(ctx, "ip", host)
		if err != nil {
			continue // skip hosts that fail to resolve
		}
		results[host] = ips
	}
	return results
}
// prefetchedIPs holds the startup results for later requests
var prefetchedIPs map[string][]net.IP
func init() {
	// Prefetch critical hosts on startup
	criticalHosts := []string{
		"api.example.com",
		"db.example.com",
		"cache.example.com",
	}
	prefetchedIPs = prefetchDNS(criticalHosts)
	// Use prefetchedIPs in requests
}
dnscache Libraries
Popular libraries provide production-ready DNS caching:
Using Dgraph's ristretto (Fast cache)
import (
	"context"
	"net"
	"time"
	"github.com/dgraph-io/ristretto"
)
func setupRistrettoCache() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // keys to track access frequency for (10M)
		MaxCost:     1 << 30, // 1GB if cost is measured in bytes
		BufferItems: 64,
	})
	if err != nil {
		panic(err)
	}
	// Cache DNS lookups
	resolver := &net.Resolver{}
	cachedLookup := func(host string) ([]net.IP, error) {
		if val, found := cache.Get(host); found {
			return val.([]net.IP), nil
		}
		ips, err := resolver.LookupIP(context.Background(), "ip", host)
		if err == nil {
			// Cost: 100; expire after 5 minutes so stale IPs age out
			cache.SetWithTTL(host, ips, 100, 5*time.Minute)
		}
		return ips, err
	}
	_ = cachedLookup
}
Third-party DNS Cache Libraries
import (
	"github.com/miekg/dns"
)
// miekg/dns: low-level DNS library that exposes raw records and their TTLs
// dnscache: Simple DNS caching library
// coredns: Full-featured DNS server with a caching plugin
// Can be embedded or run as separate service
DNS-over-HTTPS (DoH) Impact
DoH encrypts DNS queries over HTTPS, adding overhead but improving privacy:
import (
	"context"
	"errors"
	"net"
)
func createDoHResolver() *net.Resolver {
	// Placeholder: net.Resolver has no built-in DoH support
	return &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			// Dial must return a net.Conn that speaks the DNS wire protocol,
			// so DoH needs an adapter that tunnels queries over HTTPS
			// (for example to Cloudflare's DoH endpoint).
			// The first request pays for a TLS handshake, which makes
			// caching even more critical.
			return nil, errors.New("DoH transport not implemented")
		},
	}
}
// DoH Performance Impact:
// Traditional DNS (UDP): 10-50ms
// DNS over HTTPS: 50-200ms (includes TLS)
//
// DoH is slower but:
// - Privacy from ISP
// - Works through restrictive firewalls
// - Benefits from HTTP/2 connection reuse after the first query
//
// Recommendation: Use DoH for privacy-critical services,
// but cache aggressively to avoid latency
Measuring DNS Resolution Time
import (
	"fmt"
	"sync"
	"time"
)
type DNSMetrics struct {
	lookups      int64
	totalLatency time.Duration
	maxLatency   time.Duration
	cacheHits    int64
	cacheMisses  int64
	mu           sync.Mutex
}
// recordLookup adds one lookup (cache hit or miss) to the totals
func (m *DNSMetrics) recordLookup(latency time.Duration, hit bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.lookups++
	m.totalLatency += latency
	if latency > m.maxLatency {
		m.maxLatency = latency
	}
	if hit {
		m.cacheHits++
	} else {
		m.cacheMisses++
	}
}
func (m *DNSMetrics) String() string {
	m.mu.Lock()
	defer m.mu.Unlock()
	avgLatency := time.Duration(0)
	if m.lookups > 0 {
		avgLatency = m.totalLatency / time.Duration(m.lookups)
	}
	hitRate := float64(0)
	if m.lookups > 0 {
		hitRate = float64(m.cacheHits) / float64(m.lookups) * 100
	}
	return fmt.Sprintf(
		"Lookups: %d, Avg: %v, Max: %v, Hit Rate: %.1f%%",
		m.lookups, avgLatency, m.maxLatency, hitRate,
	)
}
Reducing DNS Lookups in High-Throughput Services
Strategy 1: Batch Operations
import (
	"context"
	"net"
	"sync"
	"time"
)
func batchLookups(hosts []string) map[string][]net.IP {
	resolver := &net.Resolver{PreferGo: true}
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	results := make(map[string][]net.IP)
	var mu sync.Mutex // guards results; unsynchronized concurrent map writes crash the program
	var wg sync.WaitGroup
	for _, host := range hosts {
		wg.Add(1)
		go func(h string) {
			defer wg.Done()
			if ips, err := resolver.LookupIP(ctx, "ip", h); err == nil {
				mu.Lock()
				results[h] = ips
				mu.Unlock()
			}
		}(host)
	}
	wg.Wait()
	return results
}
Strategy 2: Use IP Addresses Directly
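import (
	"context"
	"net"
	"net/http"
)
// Instead of relying on DNS for every new connection:
//   http.Get("https://api.example.com/endpoint")
// dial a known IP directly. TLS still verifies the certificate against
// the host name in the URL. newPinnedClient is an illustrative name.
func newPinnedClient() *http.Client {
	dialer := &net.Dialer{}
	return &http.Client{
		Transport: &http.Transport{
			// Custom dialer with predefined IPs
			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
				// Map known hosts to IPs
				if addr == "api.example.com:443" {
					addr = "192.0.2.1:443"
				}
				return dialer.DialContext(ctx, network, addr)
			},
		},
	}
}
Strategy 3: Connection Pool Per Endpoint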
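type EndpointPool struct {
	hosts []string
	pools map[string]*net.TCPConn
	mu    sync.RWMutex
	cache *DNSCache
}
func (p *EndpointPool) GetConnection(host, port string) (net.Conn, error) {
	// Resolve through the cache so repeated calls skip the DNS lookup
	ips, err := p.cache.Resolve(host)
	if err != nil {
		return nil, err
	}
	if len(ips) == 0 {
		return nil, &net.AddrError{Err: "no addresses found", Addr: host}
	}
	// Use first IP from cache; net.Dial needs an "ip:port" address
	return net.Dial("tcp", net.JoinHostPort(ips[0].String(), port))
}
Production Configuration Example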
import (
	"context"
	"net"
	"net/http"
	"time"
)
func setupProductionDNS() *http.Client {
	resolver := &net.Resolver{
		PreferGo: true, // Pure Go for concurrency
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			dialer := &net.Dialer{
				Timeout:   5 * time.Second,
				KeepAlive: 30 * time.Second,
			}
			return dialer.DialContext(ctx, network, address)
		},
	}
	return &http.Client{
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			MaxIdleConns:        100,
			MaxIdleConnsPerHost: 100,
			DialContext: (&net.Dialer{
				Timeout:   10 * time.Second,
				KeepAlive: 30 * time.Second,
				Resolver:  resolver,
			}).DialContext,
		},
	}
}
Performance Tuning Checklist
- Use Pure Go Resolver: Set PreferGo: true for concurrent services
- Implement Caching: Cache DNS responses with appropriate TTLs (5-10 minutes typical)
- Prefetch Critical Hosts: Resolve essential endpoints on startup
- Monitor Metrics: Track lookup latency, cache hit rate, slowest hosts
- Batch Lookups: Use concurrent resolution for multiple hosts
- Set Timeouts: 5-10 second timeout prevents hanging on broken DNS
- Connection Pooling: Reuse connections to eliminate repeated lookups
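The caching and connection pooling items above can be combined by resolving through a cache inside the HTTP transport's DialContext, so that new connections skip the DNS lookup while pooled connections skip dialing entirely. The following is a minimal sketch under stated assumptions, not a definitive implementation: hostCache, dialContext, newCachedClient, and demoCachedDial are illustrative names, the TTL is fixed, and the lookup function is injectable so the demo can run against a loopback listener without real DNS.

```go
package main

import (
	"context"
	"net"
	"net/http"
	"sync"
	"time"
)

// hostCache is a deliberately small TTL cache keyed by host name.
type hostCache struct {
	lookup func(ctx context.Context, host string) ([]string, error)
	ttl    time.Duration
	mu     sync.Mutex
	addrs  map[string][]string
	expiry map[string]time.Time
}

func (c *hostCache) resolve(ctx context.Context, host string) ([]string, error) {
	c.mu.Lock()
	if a, ok := c.addrs[host]; ok && time.Now().Before(c.expiry[host]) {
		c.mu.Unlock()
		return a, nil
	}
	c.mu.Unlock()
	a, err := c.lookup(ctx, host)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.addrs[host], c.expiry[host] = a, time.Now().Add(c.ttl)
	c.mu.Unlock()
	return a, nil
}

// dialContext resolves the host through the cache, then tries each
// cached address in order until one connects.
func (c *hostCache) dialContext(d *net.Dialer) func(context.Context, string, string) (net.Conn, error) {
	return func(ctx context.Context, network, addr string) (net.Conn, error) {
		host, port, err := net.SplitHostPort(addr)
		if err != nil {
			return nil, err
		}
		ips, err := c.resolve(ctx, host)
		if err != nil {
			return nil, err
		}
		var lastErr error = &net.AddrError{Err: "no addresses", Addr: host}
		for _, ip := range ips {
			conn, err := d.DialContext(ctx, network, net.JoinHostPort(ip, port))
			if err == nil {
				return conn, nil
			}
			lastErr = err
		}
		return nil, lastErr
	}
}

// newCachedClient wires the cache into an HTTP client.
func newCachedClient(c *hostCache) *http.Client {
	d := &net.Dialer{Timeout: 5 * time.Second, KeepAlive: 30 * time.Second}
	return &http.Client{
		Timeout:   30 * time.Second,
		Transport: &http.Transport{DialContext: c.dialContext(d)},
	}
}

// demoCachedDial dials a loopback listener twice through a fake host name
// and returns how many lookups the cache performed.
func demoCachedDial() int {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return -1
	}
	defer ln.Close()
	_, port, _ := net.SplitHostPort(ln.Addr().String())

	lookups := 0
	c := &hostCache{
		ttl:    time.Minute,
		addrs:  map[string][]string{},
		expiry: map[string]time.Time{},
		lookup: func(context.Context, string) ([]string, error) {
			lookups++
			return []string{"127.0.0.1"}, nil
		},
	}
	dial := c.dialContext(&net.Dialer{Timeout: time.Second})
	for i := 0; i < 2; i++ {
		conn, err := dial(context.Background(), "tcp", net.JoinHostPort("api.internal.test", port))
		if err != nil {
			return -1
		}
		conn.Close()
	}
	return lookups
}
```

In a real service the lookup field could be set to net.DefaultResolver.LookupHost, which has the matching signature. Because only the dial step is replaced, HTTPS requests still validate the certificate against the host name in the URL.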
Summary
DNS resolution adds 10-200ms of latency per lookup. Go's pure Go resolver is optimal for concurrent services, avoiding CGO overhead. Application-level caching is essential since Go doesn't cache by default. Implement TTL-aware caching with 5-10 minute lifetimes, prefetch critical hosts, and batch resolution operations for best performance. Monitoring DNS metrics helps identify and optimize slow resolvers. For services making thousands of requests, DNS optimization can reduce latency by 10-50%.