Go Performance Guide
I/O & Data Handling

Buffered I/O

Master syscalls, bufio package, optimal buffer sizes, and efficient file I/O patterns

Buffered I/O in Go

Buffered I/O is one of the most impactful optimizations for I/O-heavy applications. The performance difference between unbuffered and buffered I/O can be orders of magnitude. This article dives deep into why, and how to use Go's buffered I/O effectively.

Why Syscalls Are Expensive

Every unbuffered read or write operation triggers a system call, a round trip between user space and kernel space. Each transition involves:

  1. Mode transitions - The CPU switches from user mode to kernel mode and back
  2. State saving - Registers and other CPU state are saved and restored
  3. TLB and cache effects - On kernels with page-table isolation (KPTI), syscall entry adds TLB costs
  4. Instruction pipeline disruption - Branch predictors and caches lose their warmed-up state

A single syscall typically costs on the order of hundreds to a few thousand CPU cycles. When you're reading a file byte-by-byte without buffering, you're making a syscall for each byte—absolutely devastating for performance.

Buffering solves this by accumulating many bytes in memory before making a single syscall. Instead of 1,000,000 syscalls to read 1MB, you might make just 1-2 syscalls.

bufio.Reader and bufio.Writer

The bufio package provides buffered wrappers around io.Reader and io.Writer interfaces. These are the primary tools for reducing syscalls.

Basic Usage

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	// os.Open returns an unbuffered *os.File: every Read on it is a syscall
	file, err := os.Open("data.txt")
	if err != nil {
		panic(err)
	}
	defer file.Close()

	// Wrapping it in a buffered reader makes most reads hit memory instead
	reader := bufio.NewReader(file)
	line, _ := reader.ReadString('\n')
	fmt.Println(line)
}

bufio.NewReader() wraps an io.Reader with a default 4096-byte buffer. bufio.NewWriter() does the same for io.Writer.

Optimal Buffer Sizes

The default 4KB buffer works well for most scenarios, but optimal buffer size depends on your I/O pattern and hardware:

  • SSD sequential reads: 64KB-512KB buffers can be beneficial
  • HDD access: Larger buffers (256KB+) reduce seeks
  • Network I/O: 32KB-64KB often works well
  • Small objects: 4KB may be inefficient due to overhead

You can customize buffer size:

// Create a reader with 64KB buffer
reader := bufio.NewReaderSize(file, 64*1024)

// Create a writer with 256KB buffer
writer := bufio.NewWriterSize(file, 256*1024)

Tip: Benchmark your specific workload. There's no universal "optimal" size—it depends on your access patterns and hardware characteristics.

bufio.Scanner for Line-by-Line Reading

bufio.Scanner is convenient for reading line-by-line and handles many edge cases for you:

package main

import (
	"bufio"
	"os"
)

func main() {
	file, err := os.Open("large_file.txt")
	if err != nil {
		panic(err)
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		line := scanner.Bytes() // no copy; reused on the next Scan
		processLine(line)
	}

	if err := scanner.Err(); err != nil {
		panic(err)
	}
}

func processLine(line []byte) {
	// ... handle a single line ...
}

Key advantages:

  • Handles different line endings automatically
  • Provides scanner.Bytes() for zero-copy line access (the returned slice is only valid until the next Scan)
  • Automatically manages buffers
  • Has built-in size limits to prevent memory exhaustion

However, Scanner has limitations: by default it stops with bufio.ErrTooLong on any line longer than 64KB (bufio.MaxScanTokenSize). For larger lines, raise the limit with scanner.Buffer():

// Allow up to 1MB lines
scanner := bufio.NewScanner(file)
buf := make([]byte, 0, 64*1024)
scanner.Buffer(buf, 1024*1024)

Buffered vs Unbuffered: Benchmarks

Here's a realistic benchmark comparing different approaches:

package main

import (
	"bufio"
	"bytes"
	"io"
	"os"
	"testing"
)

const testFile = "testdata.bin"

func init() {
	// Create a ~100MB test file (25-byte line x 4,194,304 repetitions)
	data := bytes.Repeat([]byte("benchmark test data line\n"), 4194304)
	if err := os.WriteFile(testFile, data, 0644); err != nil {
		panic(err)
	}
}

// Unbuffered: read byte-by-byte
func readUnbuffered(path string) error {
	file, err := os.Open(path)
	if err != nil {
		return err
	}
	defer file.Close()

	var buf [1]byte
	for {
		_, err := file.Read(buf[:])
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
	}
	return nil
}

// Buffered with default size
func readBufferedDefault(path string) error {
	file, err := os.Open(path)
	if err != nil {
		return err
	}
	defer file.Close()

	reader := bufio.NewReader(file)
	_, err := io.Copy(io.Discard, reader)
	return err
}

// Buffered with large buffer
func readBufferedLarge(path string) error {
	file, err := os.Open(path)
	if err != nil {
		return err
	}
	defer file.Close()

	reader := bufio.NewReaderSize(file, 256*1024)
	_, err := io.Copy(io.Discard, reader)
	return err
}

// Line-by-line with Scanner
func readScanner(path string) error {
	file, err := os.Open(path)
	if err != nil {
		return err
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		_ = scanner.Bytes()
	}
	return scanner.Err()
}

func BenchmarkReadUnbuffered(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		readUnbuffered(testFile)
	}
}

func BenchmarkReadBufferedDefault(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		readBufferedDefault(testFile)
	}
}

func BenchmarkReadBufferedLarge(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		readBufferedLarge(testFile)
	}
}

func BenchmarkReadScanner(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		readScanner(testFile)
	}
}

Expected results (approximate, varies by hardware):

  • Unbuffered: ~50-100x slower than buffered
  • Buffered (4KB): Baseline
  • Buffered (256KB): ~1.2-1.5x faster for sequential reads
  • Scanner: Similar to buffered, slightly more overhead

Flushing Strategies

Writers buffer data in memory. You must call Flush() to ensure data actually reaches the underlying writer:

package main

import (
	"bufio"
	"os"
)

func main() {
	file, err := os.Create("output.txt")
	if err != nil {
		panic(err)
	}
	defer file.Close()

	writer := bufio.NewWriter(file)

	// Write lots of data
	for i := 0; i < 1000; i++ {
		writer.WriteString("data")
	}

	// Critical: flush before closing!
	if err := writer.Flush(); err != nil {
		panic(err)
	}
}

Forgetting to flush is a common bug. Best practice: defer the flush (keeping in mind that a deferred Flush discards its error):

writer := bufio.NewWriter(file)
defer writer.Flush()

// All writes will be flushed on return

For long-running operations, flush periodically to bound how much buffered data can be lost if the program crashes:

writer := bufio.NewWriter(file)
defer writer.Flush()

for i := 0; i < 1000000; i++ {
	writer.WriteString(getData(i))

	// Flush every 10k writes
	if i%10000 == 0 {
		writer.Flush()
	}
}

Real Example: Processing a Large Log File

Here's a practical example processing a multi-gigabyte log file efficiently:

package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os"
	"strings"
	"sync/atomic"
)

type LogStats struct {
	totalLines   int64
	errorCount   int64
	warningCount int64
	otherCount   int64
}

func (ls *LogStats) Record(level string) {
	switch level {
	case "ERROR":
		atomic.AddInt64(&ls.errorCount, 1)
	case "WARN":
		atomic.AddInt64(&ls.warningCount, 1)
	default:
		atomic.AddInt64(&ls.otherCount, 1)
	}
	atomic.AddInt64(&ls.totalLines, 1)
}

func processLargeLogFile(path string) (*LogStats, error) {
	file, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer file.Close()

	stats := &LogStats{}

	// Start with a 64KB buffer and allow lines up to 256KB
	scanner := bufio.NewScanner(file)
	buf := make([]byte, 0, 64*1024)
	scanner.Buffer(buf, 256*1024)

	for scanner.Scan() {
		line := scanner.Bytes()

		// Fast path: check for level prefix
		if bytes.HasPrefix(line, []byte("ERROR")) {
			stats.Record("ERROR")
		} else if bytes.HasPrefix(line, []byte("WARN")) {
			stats.Record("WARN")
		} else {
			stats.Record("OTHER")
		}
	}

	if err := scanner.Err(); err != nil {
		return nil, err
	}

	return stats, nil
}

func main() {
	stats, err := processLargeLogFile("application.log")
	if err != nil {
		panic(err)
	}

	fmt.Printf("Total lines: %d\n", stats.totalLines)
	fmt.Printf("Errors: %d\n", stats.errorCount)
	fmt.Printf("Warnings: %d\n", stats.warningCount)
	fmt.Printf("Other: %d\n", stats.otherCount)
}

This approach:

  • Raises Scanner's line limit to 256KB so long log lines don't abort the scan
  • Avoids copying log lines with scanner.Bytes()
  • Uses atomic operations for thread-safe stats updates
  • Processes multi-GB files efficiently

io.Copy with Buffered Readers

For copying data between I/O sources, io.Copy automatically uses efficient buffering:

package main

import (
	"fmt"
	"io"
	"os"
)

func main() {
	source, err := os.Open("large_input.bin")
	if err != nil {
		panic(err)
	}
	defer source.Close()

	dest, err := os.Create("large_output.bin")
	if err != nil {
		panic(err)
	}
	defer dest.Close()

	// io.Copy uses an internal 32KB buffer, or delegates to the
	// destination's ReadFrom / source's WriteTo when available
	n, err := io.Copy(dest, source)
	if err != nil {
		panic(err)
	}
	fmt.Println("Copied", n, "bytes")
}

Wrapping the endpoints with bufio helps when one side only supports small reads or writes (for plain file-to-file copies, io.Copy can already take an OS-level fast path through the destination's ReadFrom):

source, _ := os.Open("input.bin")
bufferedSrc := bufio.NewReaderSize(source, 256*1024)

dest, _ := os.Create("output.bin")
bufferedDest := bufio.NewWriterSize(dest, 256*1024)

io.Copy(bufferedDest, bufferedSrc)
bufferedDest.Flush()

This keeps large buffers in play throughout the copy operation.

Key Takeaways

  • Always buffer I/O in production code—the performance difference is dramatic
  • Choose buffer sizes based on benchmarks of your specific workload
  • Remember to flush buffered writers before program exit
  • Use Scanner for line-by-line processing—it's convenient and efficient
  • Use io.Copy for efficient data transfer between sources
  • Profile first—measure your I/O patterns to find bottlenecks

Buffering is often the single most impactful optimization for I/O-bound Go applications.
