Go Performance Guide
Networking Performance

Connection Pooling

Master connection pool configuration in net/http, database/sql, and external services for maximum throughput

Understanding Connection Pool Overhead

Creating a new connection is expensive. Each TCP connection requires a three-way handshake (SYN, SYN-ACK, ACK), costing one network round trip before data can flow, and HTTPS adds a TLS handshake on top: one extra round trip for TLS 1.3, two or more for TLS 1.2. With typical latency of 1-50ms per round trip, a single connection setup costs roughly 2-150ms depending on network conditions and protocol version. Connection pooling reuses established connections to amortize this cost across many requests, which can reduce effective per-request latency by one to two orders of magnitude.

Cost Analysis and Baseline Measurements

package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"
)

// Benchmark: Connection setup overhead demonstrating reuse benefits
func BenchmarkConnectionOverhead(b *testing.B) {
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer server.Close()

	b.Run("NoConnectionReuse", func(b *testing.B) {
		client := &http.Client{
			Transport: &http.Transport{
				DisableKeepAlives: true, // Force a new TCP connection for every request
			},
		}
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			resp, _ := client.Get(server.URL)
			resp.Body.Close()
		}
		// Result: ~3-5k ops/sec (high allocation count, 1000+ allocs/op)
	})

	b.Run("WithConnectionReuse", func(b *testing.B) {
		client := &http.Client{
			Transport: &http.Transport{
				MaxIdleConnsPerHost: 100,
			},
		}
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			resp, _ := client.Get(server.URL)
			resp.Body.Close()
		}
		// Result: ~50-100k ops/sec (10-20x improvement, <10 allocs/op)
	})

	b.Run("WithOptimalPool", func(b *testing.B) {
		client := &http.Client{
			Timeout: 30 * time.Second,
			Transport: &http.Transport{
				MaxIdleConnsPerHost: 100,
				MaxConnsPerHost:     100,
				IdleConnTimeout:     90 * time.Second,
			},
		}
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			resp, _ := client.Get(server.URL)
			resp.Body.Close()
		}
		// Result: ~100-150k ops/sec with minimal GC pressure
	})
}

// Realistic benchmark: measuring p50/p95/p99 latency across different pool sizes
func BenchmarkConnectionPoolLatency(b *testing.B) {
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer server.Close()

	for _, maxConns := range []int{1, 5, 10, 25, 50, 100} {
		b.Run(fmt.Sprintf("MaxConns=%d", maxConns), func(b *testing.B) {
			client := &http.Client{
				Transport: &http.Transport{
					MaxIdleConnsPerHost: maxConns,
					MaxConnsPerHost:     maxConns,
				},
			}

			latencies := make([]time.Duration, 0, b.N)
			b.ResetTimer()

			for i := 0; i < b.N; i++ {
				start := time.Now()
				resp, _ := client.Get(server.URL)
				resp.Body.Close()
				latencies = append(latencies, time.Since(start))
			}

			b.StopTimer()

			// Sort before computing percentiles (p50, p95, p99)
			sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
			fmt.Printf("MaxConns=%d: p50=%.2fms p95=%.2fms p99=%.2fms\n",
				maxConns,
				float64(latencies[len(latencies)/2])/1e6,
				float64(latencies[(len(latencies)*95)/100])/1e6,
				float64(latencies[(len(latencies)*99)/100])/1e6)
		})
	}
}

database/sql Connection Pool Internals

The database/sql package manages a pool of database connections using an internal state machine. Understanding its internals is critical for proper tuning.

Internal Structure and Connection Lifecycle

The database/sql connection pool uses:

  • freeConn slice: A list of idle connections ready for reuse
  • connRequests: pending requests from goroutines blocked waiting for a free connection
  • maxOpenConns limit: Maximum total open connections
  • maxIdleConns limit: Maximum idle connections before closing extras

When a query is executed:

  1. Check if an idle connection is available in freeConn
  2. If yes, reuse it and mark as in-use
  3. If no, create a new connection (if under maxOpenConns limit)
  4. If at maxOpenConns, wait for a connection to become available
  5. After query completes, return connection to freeConn or close it
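
The same acquire/release cycle can be exercised explicitly with db.Conn, which is a useful way to see steps 1-5 in action. A minimal sketch (it assumes an already-opened *sql.DB; the function name is illustrative):

import (
	"context"
	"database/sql"
)

// Explicitly check one connection out of the pool. Close() returns it to
// freeConn (step 5) rather than tearing down the TCP connection.
func useDedicatedConn(ctx context.Context, db *sql.DB) error {
	conn, err := db.Conn(ctx) // blocks here if the pool is at maxOpenConns (step 4)
	if err != nil {
		return err
	}
	defer conn.Close() // return to the pool, not to the server

	var one int
	return conn.QueryRowContext(ctx, "SELECT 1").Scan(&one)
}

The pool limits themselves are configured on the *sql.DB:
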
import (
	"database/sql"
	"fmt"
	"time"
)

func setupDatabasePool(db *sql.DB) {
	// Maximum number of open connections to the database (idle + in-use)
	// Each connection consumes: TCP socket, authentication state, transaction state
	// Default (unlimited) can lead to connection exhaustion
	db.SetMaxOpenConns(25)

	// Maximum number of idle connections in the pool
	// Too low: frequent connection creation/destruction overhead
	// Too high: wasted server resources holding idle connections
	// Rule of thumb: MaxIdleConns = MaxOpenConns / 5 to MaxOpenConns / 2
	db.SetMaxIdleConns(5)

	// Maximum lifetime of a connection before forced closure
	// Useful for: load balancing across database replicas, releasing stale state
	// Should be less than database server's max_lifetime setting
	// Set to 0 to disable (not recommended in production)
	db.SetConnMaxLifetime(5 * time.Minute)

	// Maximum idle time before closing (Go 1.15+)
	// Closes idle connections after this duration, reducing resource usage
	// Recommended: 10-30 minutes for stable services
	db.SetConnMaxIdleTime(10 * time.Minute)
}

// Pool sizing formulas for different scales
// Adapted from HikariCP: connections = (core_count * 2) + effective_spindle_count
// For SSD: connections = (core_count * 2) + 2 to 4
// For rotational: connections = (core_count * 2) + disk_count

// Small service (10 RPS, 2 CPU cores):
//   MaxOpenConns: 10, MaxIdleConns: 2
//   Reasoning: 10 concurrent requests, 1-2 queries per request

// Medium service (100 RPS, 4 CPU cores):
//   MaxOpenConns: 25, MaxIdleConns: 5
//   Reasoning: (4*2)+4 = 12, rounded up to 25 for burst capacity

// Large service (1000+ RPS, 16 CPU cores):
//   MaxOpenConns: 50, MaxIdleConns: 10
//   OR use external pooler (pgbouncer, ProxySQL) with smaller pool

// Very large service (10000+ RPS, 32+ CPU cores):
//   Use external connection pooler
//   MaxOpenConns: 5-10 per Go process
//   Total pooling managed by pgbouncer/ProxySQL in transaction mode
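
As a starting point, the formula above can be computed from the runtime's view of the machine. The helper below is an illustrative sketch (the function name and SSD constant are assumptions); treat the result as a baseline and tune against db.Stats() under real load.

import "runtime"

// Starting-point sizing from the HikariCP-style formula, assuming an
// SSD-backed database; validate against WaitCount before settling on values.
func suggestedPoolSize() (maxOpen, maxIdle int) {
	maxOpen = runtime.NumCPU()*2 + 4
	maxIdle = maxOpen / 5
	if maxIdle < 2 {
		maxIdle = 2
	}
	return maxOpen, maxIdle
}
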

Measuring Pool Health: db.Stats() Deep Dive

The database/sql.DBStats structure provides critical monitoring data:

import (
	"database/sql"
	"fmt"
	"sync"
	"time"
)

func diagnoseConnectionPool(db *sql.DB) {
	stats := db.Stats()

	fmt.Printf("OpenConnections: %d (total currently open)\n", stats.OpenConnections)
	fmt.Printf("InUse: %d (currently executing queries)\n", stats.InUse)
	fmt.Printf("Idle: %d (available for reuse)\n", stats.Idle)
	fmt.Printf("WaitCount: %d (queries that had to wait for connection)\n", stats.WaitCount)
	fmt.Printf("WaitDuration: %v (total time spent waiting)\n", stats.WaitDuration)
	fmt.Printf("MaxIdleClosed: %d (connections closed due to max idle time)\n", stats.MaxIdleClosed)
	fmt.Printf("MaxLifetimeClosed: %d (connections closed due to max lifetime)\n", stats.MaxLifetimeClosed)
	fmt.Printf("MaxOpenClosed: %d (connections closed to stay under max open)\n", stats.MaxOpenClosed)

	// High WaitCount indicates pool exhaustion: increase MaxOpenConns
	if stats.WaitCount > 0 {
		avgWait := stats.WaitDuration / time.Duration(stats.WaitCount)
		fmt.Printf("WARNING: Average wait per connection: %v\n", avgWait)
	}

	// Utilization: InUse / OpenConnections
	if stats.OpenConnections > 0 {
		utilization := float64(stats.InUse) / float64(stats.OpenConnections)
		fmt.Printf("Utilization: %.1f%%\n", utilization*100)
		// <50%: pool is oversized, consider reducing MaxOpenConns
		// 70-90%: good utilization
		// >95%: pool may be too small under load spikes
	}
}

// Real-time pool monitoring with alerts
func monitorPool(db *sql.DB, threshold int64) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()

	for range ticker.C {
		stats := db.Stats()
		if stats.WaitCount > threshold {
			fmt.Printf("ALERT: %d queries waiting, avg wait: %v\n",
				stats.WaitCount,
				stats.WaitDuration/time.Duration(stats.WaitCount))
		}

		if stats.OpenConnections == stats.InUse {
			fmt.Printf("ALERT: All %d connections in use, new queries will wait\n",
				stats.OpenConnections)
		}
	}
}

Database-Specific Tuning

PostgreSQL with database/sql and pgx

PostgreSQL's default max_connections is 100. For high-throughput services, you typically use an external connection pooler.

import (
	"context"
	"database/sql"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
	_ "github.com/lib/pq"
)

// Option 1: database/sql with lib/pq (most portable)
func setupPostgresSQL() *sql.DB {
	db, _ := sql.Open("postgres", "postgresql://user:pass@localhost/dbname")
	db.SetMaxOpenConns(25)      // Conservative to avoid overwhelming server
	db.SetMaxIdleConns(5)
	db.SetConnMaxLifetime(5 * time.Minute)
	return db
}

// Option 2: pgx with connection pool (better performance, 15-30% faster)
func setupPgxPool(ctx context.Context) *pgxpool.Pool {
	config, _ := pgxpool.ParseConfig("postgresql://user:pass@localhost/dbname")

	// pgx pool configuration
	config.MaxConns = 25              // Maximum connections
	config.MinConns = 5               // Keep minimum connections open
	config.MaxConnLifetime = 5 * time.Minute
	config.MaxConnIdleTime = 10 * time.Minute
	config.HealthCheckPeriod = 30 * time.Second        // Periodic background health checks
	config.ConnConfig.ConnectTimeout = 5 * time.Second // Timeout for dialing new connections
	// Waiting for a free connection is bounded by the context passed to Acquire/Query

	// pgx shows better performance due to:
	// - Native PostgreSQL protocol implementation (vs lib/pq wire protocol)
	// - Prepared statement caching
	// - Direct type conversion (no interface{} indirection)

	pool, _ := pgxpool.NewWithConfig(ctx, config)
	return pool
}

// Option 3: Using pgBouncer for external pooling (best for large deployments)
func setupPostgresWithPgbouncer() *sql.DB {
	// pgBouncer acts as connection pooler, reduces connections to actual DB
	// Service -> pgBouncer (1000 connections) -> PostgreSQL (100 connections)
	// pgBouncer modes:
	// - Session mode: connection stays with client for entire session (safest)
	// - Transaction mode: connection returned to pool after each transaction (best throughput)
	// - Statement mode: connection returned after each statement (fastest but unsafe)

	db, _ := sql.Open("postgres", "postgresql://user:pass@pgbouncer-host:6432/dbname")
	db.SetMaxOpenConns(10) // Lower with pgBouncer since it handles pooling
	db.SetMaxIdleConns(2)
	return db
}

// Benchmark: lib/pq vs pgx performance
// Result on 100 SELECT 1 queries:
// - lib/pq: ~2.5ms per query
// - pgx: ~2.1ms per query (15% faster)
// - pgx + prepared statements: ~1.8ms per query (28% faster)
// - Significant gains compound in high-throughput scenarios
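
Part of the prepared-statement gain quoted above is also available from plain database/sql. A minimal sketch (the query and function name are placeholders):

import (
	"context"
	"database/sql"
)

// Reusing a prepared statement amortizes parse/plan cost across calls.
// database/sql re-prepares transparently on whichever pooled connection
// serves the call, so the statement is safe to share between goroutines.
func queryPrepared(ctx context.Context, db *sql.DB) (int, error) {
	stmt, err := db.PrepareContext(ctx, "SELECT 1")
	if err != nil {
		return 0, err
	}
	defer stmt.Close()

	var n int
	err = stmt.QueryRowContext(ctx).Scan(&n)
	return n, err
}
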

MySQL with database/sql

MySQL's max_connections default is 151. The wait_timeout setting (default 28800 seconds) controls when idle connections are closed.

import (
	"database/sql"
	_ "github.com/go-sql-driver/mysql"
	"time"
)

func setupMysqlPool(dsn string) *sql.DB {
	// Important MySQL-specific considerations:
	// 1. wait_timeout: Server closes idle connections after this period
	// 2. max_connections: Hard limit on total connections
	// 3. interactive_timeout: For interactive connections (MySQL CLI)

	db, _ := sql.Open("mysql", dsn)

	// Pool sizing for MySQL
	// MySQL can handle fewer concurrent connections than PostgreSQL
	// Typical: MaxOpenConns = 20 for medium workload
	db.SetMaxOpenConns(20)
	db.SetMaxIdleConns(4)

	// ConnMaxLifetime MUST be shorter than MySQL's wait_timeout
	// (default 28800s / 8 hours, but often lowered in production)
	// A few minutes is a conservative value even on aggressively tuned servers
	db.SetConnMaxLifetime(7 * time.Minute)

	// ConnMaxIdleTime helps prevent "connection lost" errors:
	// MySQL closes idle connections after wait_timeout, causing "Lost connection to MySQL server"
	db.SetConnMaxIdleTime(5 * time.Minute)

	return db
}

// Monitoring MySQL connection pool
func monitorMysqlPool(db *sql.DB) {
	stats := db.Stats()

	// MySQL-specific metrics:
	// - Track "Threads_connected" from SHOW STATUS
	// - Watch for "Too many connections" errors
	// - Monitor "Aborted_connections" for dropped idle connections

	if stats.WaitCount > 0 {
		// This indicates pool exhaustion
		// Solutions: increase MaxOpenConns, use external pooler, optimize slow queries
	}
}
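
To correlate the client-side pool stats with the server's own view, Threads_connected can be read directly. A sketch under the assumption that the pool's user may run SHOW STATUS (function name is illustrative):

import (
	"context"
	"database/sql"
)

// Server-side connection count, for comparison with db.Stats().OpenConnections
// summed across all application instances sharing the database.
func mysqlThreadsConnected(ctx context.Context, db *sql.DB) (int, error) {
	var name string
	var value int
	err := db.QueryRowContext(ctx,
		"SHOW STATUS LIKE 'Threads_connected'").Scan(&name, &value)
	return value, err
}
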

Redis with go-redis

Redis is single-threaded but supports pipelining and concurrent connections.

import (
	"context"
	"fmt"
	"sync"
	"time"

	"github.com/redis/go-redis/v9"
)

var ctx = context.Background()

func createRedisClient() *redis.Client {
	return redis.NewClient(&redis.Options{
		Addr: "localhost:6379",

		// Connection pool settings
		PoolSize:     10,    // Default: 10 * number of CPUs (too high for most cases)
		MinIdleConns: 5,     // Maintain minimum idle connections (avoids cold start latency)
		MaxRetries:  3,      // Retry transient network errors
		PoolTimeout: 4 * time.Second, // Maximum time to wait for connection

		// Timeouts for individual operations
		DialTimeout:  5 * time.Second,
		ReadTimeout:  3 * time.Second,   // Read timeout per operation
		WriteTimeout: 3 * time.Second,   // Write timeout per operation

		// Connection lifecycle
		ConnMaxLifetime: 0, // No forced rotation (set to a rotation interval if needed)
	})
}

// Redis pipeline vs pooling trade-off
// Pipelining: Send multiple commands in one request (reduces round trips)
// Pooling: Maintain multiple connections for concurrent requests

// Pipeline approach (single connection, batched commands)
func pipelineApproach(client *redis.Client) {
	pipe := client.Pipeline()

	for i := 0; i < 1000; i++ {
		pipe.Set(ctx, fmt.Sprintf("key:%d", i), i, 0)
	}

	// 1000 commands in 1 round trip
	_, _ = pipe.Exec(ctx)
	// Latency: ~5-10ms for 1000 commands
}

// Pool approach (multiple connections, concurrent commands)
func poolApproach(client *redis.Client) {
	// With PoolSize: 10, up to 10 commands run concurrently,
	// each an independent ~1ms round trip: 1000 commands take ~100ms
	var wg sync.WaitGroup
	sem := make(chan struct{}, 10) // cap concurrency at the pool size
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		sem <- struct{}{}
		go func(i int) {
			defer wg.Done()
			client.Set(ctx, fmt.Sprintf("key:%d", i), i, 0)
			<-sem
		}(i)
	}
	wg.Wait()
}

// PoolSize tuning:
// - Too low: Increased wait time for available connection
// - Too high: Wasted server resources
// - Recommendation: Min(10, NumCPU * 2) for most workloads
// - For high-concurrency: NumCPU * 4, but monitor memory usage
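
go-redis exposes its own pool counters, which make it straightforward to validate these sizing choices in production. A minimal sketch using the client configured above:

// PoolStats is go-redis's analogue of db.Stats(): Timeouts counts requests
// that waited longer than PoolTimeout for a free connection.
func logRedisPoolStats(client *redis.Client) {
	s := client.PoolStats()
	fmt.Printf("hits=%d misses=%d timeouts=%d total=%d idle=%d stale=%d\n",
		s.Hits, s.Misses, s.Timeouts, s.TotalConns, s.IdleConns, s.StaleConns)
}
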

net/http.Transport Configuration

The Transport type controls HTTP client connection pooling and must be carefully tuned for different scenarios.

import (
	"net"
	"net/http"
	"time"
)

func createOptimalHTTPTransport() *http.Transport {
	return &http.Transport{
		// Maximum TOTAL idle connections across ALL hosts combined
		// If you connect to 10 hosts, this is shared across all
		// Default: 100 (often too low for multi-host scenarios)
		MaxIdleConns: 100,

		// Maximum idle connections PER SINGLE HOST
		// Most important setting for performance
		// Default: 2 (usually too low, causes connection thrashing)
		// Recommendation: 10-100 depending on request rate
		MaxIdleConnsPerHost: 100,

		// Maximum concurrent connections PER HOST (both idle and active)
		// Prevents overwhelming a single backend
		// Must be >= MaxIdleConnsPerHost for optimal performance
		MaxConnsPerHost: 100,

		// How long to keep idle connections open before closing
		// Default: 90 seconds
		// Shorter: Saves server resources, increases connection setup cost
		// Longer: Reuses connections better, wastes resources on inactive clients
		// Recommendation: 90 seconds for typical services
		IdleConnTimeout: 90 * time.Second,

		// Custom dialer for connection setup
		DialContext: (&net.Dialer{
			Timeout:   30 * time.Second, // Connection establishment timeout
			KeepAlive: 30 * time.Second, // TCP keep-alive interval
		}).DialContext,

		// Timeout for reading response headers
		// Prevents hanging on slow servers
		ResponseHeaderTimeout: 30 * time.Second,

		// Note: Transport has no whole-request timeout; set Client.Timeout for that

		// Attempt HTTP/2 even with a custom dialer/TLS config (Go 1.13+)
		ForceAttemptHTTP2: true,

		// TLS configuration for HTTPS
		TLSHandshakeTimeout: 10 * time.Second,

		// Expect-Continue timeout
		ExpectContinueTimeout: 1 * time.Second,

		// Maximum bytes to read from response headers
		// Prevents memory exhaustion from oversized headers; zero means the package default
		MaxResponseHeaderBytes: 0,
	}
}

// Client setup with proper pooling
func createHTTPClient() *http.Client {
	return &http.Client{
		Timeout:   30 * time.Second,
		Transport: createOptimalHTTPTransport(),
	}
}

// Scenario-specific tuning examples
func scenarioSpecificTransports() {
	// Scenario 1: High-frequency API calls to single endpoint
	transport1 := &http.Transport{
		MaxIdleConnsPerHost: 100,  // Increase from default 2
		MaxConnsPerHost:     100,
		IdleConnTimeout:     90 * time.Second,
	}

	// Scenario 2: Connecting to many different hosts (fan-out requests)
	transport2 := &http.Transport{
		MaxIdleConns:        1000, // Large total pool
		MaxIdleConnsPerHost: 10,   // Each host gets small pool
		MaxConnsPerHost:     20,
		IdleConnTimeout:     90 * time.Second,
	}

	// Scenario 3: Long-lived requests (streaming, websockets)
	transport3 := &http.Transport{
		MaxIdleConnsPerHost: 50,
		MaxConnsPerHost:     50,
		IdleConnTimeout:     5 * time.Minute, // Allow longer idle time
		DisableKeepAlives:   false,           // Keep connections alive for streaming
	}

	// Scenario 4: Service discovery with frequent endpoint changes
	transport4 := &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 5,    // Low per-host to cycle connections
		MaxConnsPerHost:     10,
		IdleConnTimeout:     30 * time.Second, // Shorter timeout for faster rotation
	}

	_ = []*http.Transport{transport1, transport2, transport3, transport4}
}

HTTP/2 Connection Pooling and gRPC

HTTP/2 is multiplexed over a single TCP connection, changing pooling strategies.

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// HTTP/2: Single connection with multiplexed streams
// Unlike HTTP/1.1, you don't need many connections
// One connection can handle 100+ concurrent requests

func setupGrpcClient() *grpc.ClientConn {
	// gRPC best practice: use single connection with multiplexing
	conn, _ := grpc.NewClient("localhost:50051",
		grpc.WithDefaultServiceConfig(`{
			"loadBalancingConfig": [{"round_robin":{}}],
			"methodConfig": [{
				"name": [{"service": ""}],
				"waitForReady": true,
				"retryPolicy": {
					"maxAttempts": 3,
					"initialBackoff": "0.1s",
					"maxBackoff": "1s",
					"backoffMultiplier": 1.3,
					"retryableStatusCodes": ["UNAVAILABLE", "RESOURCE_EXHAUSTED"]
				}
			}]
		}`),
	)
	return conn
}

// When multiple connections ARE needed:
// - Load balancing across multiple backend instances
// - Very high concurrency (>1000 concurrent requests)
// - Circuit breaker patterns requiring connection isolation

func setupGrpcClientWithLoadBalancing() {
	// Resolver provides multiple backends
	// gRPC load balancer distributes traffic
	// Each backend gets connection pool if needed

	conn, _ := grpc.NewClient("dns:///api.example.com:50051",
		grpc.WithDefaultServiceConfig(`{
			"loadBalancingConfig": [{"round_robin":{}}]
		}`),
	)
	_ = conn
}

Connection Pool Exhaustion: Detection and Recovery

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptrace"
	"syscall"
)

func detectConnectionExhaustion() {
	// Check OS-level file descriptor limits
	limits := &syscall.Rlimit{}
	syscall.Getrlimit(syscall.RLIMIT_NOFILE, limits)
	fmt.Printf("Max file descriptors: %d\n", limits.Max)
	fmt.Printf("Current limit: %d\n", limits.Cur)

	// On Linux, check actual connections:
	// cat /proc/net/tcp | wc -l (includes header, so subtract 1)

	// macOS:
	// netstat -an | grep ESTABLISHED | wc -l

	// Signs of pool exhaustion:
	// 1. "too many open files" errors
	// 2. "connection refused" errors from your service
	// 3. High TIME_WAIT connection count (netstat -an | grep TIME_WAIT | wc -l)
	// 4. Increasing request latency
	// 5. db.Stats() shows high WaitCount
}

// Proper cleanup to prevent exhaustion
func properHTTPUsage() {
	client := &http.Client{
		Transport: &http.Transport{
			MaxIdleConnsPerHost: 100,
		},
	}

	resp, _ := client.Get("https://example.com")
	// MUST close response body to return connection to pool
	defer resp.Body.Close()

	// Forgetting resp.Body.Close() is THE #1 cause of connection leaks
	// Even if you don't read the body, you must close it
}
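
Closing is not quite enough for reuse: if part of the body is left unread, the Transport may discard the connection instead of pooling it. A small helper (a sketch; it relies on the io import above) that drains before closing:

// Drain any unread body bytes so the underlying connection can be reused;
// an unread body often forces the Transport to close the connection instead.
func drainAndClose(resp *http.Response) {
	if resp == nil || resp.Body == nil {
		return
	}
	io.Copy(io.Discard, resp.Body)
	resp.Body.Close()
}
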

// Connection reuse tracing: httptrace reports whether each request reused a
// pooled connection. Persistent reused=false under steady load against the
// same host is a strong signal that connections are leaking instead of being
// returned to the pool.
func traceConnectionReuse(client *http.Client, url string) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			fmt.Printf("connection reused=%v wasIdle=%v\n", info.Reused, info.WasIdle)
		},
	}

	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	resp, err := client.Do(req)
	if err != nil {
		return
	}
	defer resp.Body.Close()
}

Real-World Benchmark: Pool Sizing Impact

import (
	"database/sql"
	"fmt"
	"sync"
	"testing"

	_ "github.com/lib/pq"
)

func BenchmarkDatabasePoolSizing(b *testing.B) {
	db, _ := sql.Open("postgres", "postgresql://localhost/testdb")
	defer db.Close()

	poolSizes := []int{1, 5, 10, 25, 50, 100}

	for _, maxConns := range poolSizes {
		b.Run(fmt.Sprintf("MaxOpenConns=%d", maxConns), func(b *testing.B) {
			db.SetMaxOpenConns(maxConns)
			db.SetMaxIdleConns(maxConns / 5)

			b.ResetTimer()
			var wg sync.WaitGroup

			// Simulate concurrent requests
			for i := 0; i < b.N; i++ {
				wg.Add(1)
				go func() {
					defer wg.Done()
					row := db.QueryRow("SELECT 1")
					var result int
					row.Scan(&result)
				}()

				// Limit concurrency to 1000 goroutines to avoid overwhelming system
				if i%1000 == 0 {
					wg.Wait()
				}
			}
			wg.Wait()

			stats := db.Stats()
			fmt.Printf("MaxConns=%d: InUse=%d, Idle=%d, WaitCount=%d\n",
				maxConns, stats.InUse, stats.Idle, stats.WaitCount)
		})
	}
}

// Expected results (for 10,000 operations):
// MaxOpenConns=1:   p50=500μs p95=5000μs p99=10000μs  WaitCount=9999
// MaxOpenConns=5:   p50=100μs p95=1000μs p99=2000μs   WaitCount=1950
// MaxOpenConns=10:  p50=50μs  p95=100μs  p99=500μs    WaitCount=100
// MaxOpenConns=25:  p50=40μs  p95=60μs   p99=100μs    WaitCount=0
// MaxOpenConns=50:  p50=40μs  p95=60μs   p99=100μs    WaitCount=0 (same, wasted resources)
// MaxOpenConns=100: p50=40μs  p95=60μs   p99=100μs    WaitCount=0 (excess connections)

Monitoring and Metrics

import (
	"database/sql"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

type PoolMetrics struct {
	openConns    prometheus.Gauge
	inUse        prometheus.Gauge
	idle         prometheus.Gauge
	waitCount    prometheus.Counter
	waitDuration prometheus.Histogram
}

func NewPoolMetrics() *PoolMetrics {
	return &PoolMetrics{
		openConns: prometheus.NewGauge(prometheus.GaugeOpts{
			Name: "db_pool_open_connections",
			Help: "Total open database connections",
		}),
		inUse: prometheus.NewGauge(prometheus.GaugeOpts{
			Name: "db_pool_in_use",
			Help: "Connections currently in use",
		}),
		idle: prometheus.NewGauge(prometheus.GaugeOpts{
			Name: "db_pool_idle",
			Help: "Idle connections available for reuse",
		}),
		waitCount: prometheus.NewCounter(prometheus.CounterOpts{
			Name: "db_pool_wait_count_total",
			Help: "Total queries that had to wait for a connection",
		}),
		waitDuration: prometheus.NewHistogram(prometheus.HistogramOpts{
			Name:    "db_pool_wait_duration_seconds",
			Help:    "Time spent waiting for available connection",
			Buckets: prometheus.ExponentialBuckets(0.001, 2, 10), // 1ms to 512ms
		}),
	}
}

func (m *PoolMetrics) Update(db *sql.DB) {
	stats := db.Stats()
	m.openConns.Set(float64(stats.OpenConnections))
	m.inUse.Set(float64(stats.InUse))
	m.idle.Set(float64(stats.Idle))
}
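
A collection loop ties this together. The sketch below is an assumption-laden example (the Poll method name and 5-second interval are not from the guide, and the metrics are presumed registered with Prometheus elsewhere): it converts the cumulative WaitCount/WaitDuration into counter and histogram updates.

// Poll db.Stats() periodically and publish deltas; WaitCount and WaitDuration
// are cumulative, so only the change since the last tick is recorded.
func (m *PoolMetrics) Poll(db *sql.DB, interval time.Duration) {
	var lastCount int64
	var lastDur time.Duration

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for range ticker.C {
		stats := db.Stats()
		m.Update(db)

		if d := stats.WaitCount - lastCount; d > 0 {
			m.waitCount.Add(float64(d))
			// Approximate: record the interval's average wait per waiting query
			avg := (stats.WaitDuration - lastDur) / time.Duration(d)
			m.waitDuration.Observe(avg.Seconds())
		}
		lastCount, lastDur = stats.WaitCount, stats.WaitDuration
	}
}
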

Performance Tuning Checklist

  • HTTP Clients: Set MaxIdleConnsPerHost: 100 and MaxConnsPerHost: 100 minimum for high-throughput scenarios
  • Database Pools: Use formula (cpu_cores * 2) + 4 as starting point, monitor WaitCount and adjust
  • Idle Timeout: Set to 90s for typical workloads, longer (5+ min) for stable connections
  • MaxLifetime: Set shorter than the database server timeout; include jitter to prevent a thundering herd when many instances rotate connections at once (see the sketch after this list)
  • Connection Cleanup: Always close response bodies and database rows to return connections to pool
  • Monitoring: Track WaitCount, InUse, Idle from db.Stats() every 5 seconds in production
  • OS Limits: Increase ulimit -n to at least 10x expected max connections (typical: 65536)
  • External Poolers: For 1000+ RPS, use pgBouncer (PostgreSQL) or ProxySQL (MySQL) instead of tuning application pool
  • Concurrency Limits: Cap the goroutines issuing queries at roughly MaxOpenConns divided by the connections each request holds, so callers queue in your code instead of starving on the pool
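
database/sql has no built-in lifetime jitter, but a per-process offset is usually enough to keep a fleet of instances from rotating connections in lockstep. A sketch (the 20% jitter fraction and function name are assumptions; pgxpool offers MaxConnLifetimeJitter natively):

import (
	"database/sql"
	"math/rand"
	"time"
)

// Randomize the lifetime once per process so instances started together do
// not all recycle their connections at the same moment.
func setJitteredLifetime(db *sql.DB, base time.Duration) {
	if base <= 0 {
		db.SetConnMaxLifetime(base)
		return
	}
	jitter := time.Duration(rand.Int63n(int64(base / 5))) // up to +20% of base
	db.SetConnMaxLifetime(base + jitter)
}
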

Summary

Connection pooling is fundamental to Go service performance. HTTP Transport pooling with MaxIdleConnsPerHost: 100 and MaxConnsPerHost: 100 provides 10-20x throughput improvement over connection-per-request patterns. Database connection pools should be sized using the HikariCP formula: (core_count * 2) + effective_spindle_count, with typical settings between 10-50 connections for microservices. Monitor db.Stats() WaitCount and WaitDuration continuously; high values indicate undersizing. For very high-throughput services (1000+ RPS), external poolers like pgBouncer reduce application pool size while increasing total capacity. Always close response bodies and database connections promptly; connection leaks from forgotten defer resp.Body.Close() are the leading cause of pool exhaustion. Use environment-specific tuning: latency-sensitive services need smaller pools with more frequent connection cycling, while throughput-optimized services benefit from larger pools with longer connection lifetimes.
