Read-Heavy Workloads

Read-heavy workloads depend on key layout, cache behavior, segment shape, and iterator lifetime.

Key Layout

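Design keys around the reads you need. Ordered storage pays off when records that are read together sort next to each other, so that one seek followed by a short scan can replace many independent point reads.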

Examples:

userId:timestamp
tenantId:entityType:entityId
indexName:term:documentId
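
The layouts above only help if the encoded keys sort the way the reads expect. The following is a minimal, library-agnostic sketch in Python: a sorted list stands in for ordered storage, and the helper names (user_event_key, prefix_scan) and the 13-digit padding width are assumptions made for illustration, not part of any API. It shows why a numeric component such as a timestamp should be zero-padded when keys are compared as strings, and how a prefix scan then returns one user's records in time order.

```python
from bisect import bisect_left


def user_event_key(user_id: str, timestamp_ms: int) -> str:
    # Zero-pad the timestamp so that string order matches numeric order.
    # Without padding, "999" would sort after "1700000000100".
    return f"{user_id}:{timestamp_ms:013d}"


def prefix_scan(sorted_keys, prefix):
    # Seek to the first key >= prefix, then read forward while it matches.
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


# A sorted list stands in for an ordered key-value store.
keys = sorted([
    user_event_key("alice", 1700000000500),
    user_event_key("bob", 1700000000100),
    user_event_key("alice", 1700000000100),
    user_event_key("alice", 999),
])

alice_events = prefix_scan(keys, "alice:")
```

Including the delimiter in the prefix ("alice:" rather than "alice") keeps the scan from also matching a different user whose id merely starts with the same characters. This assumes ids never contain the delimiter themselves.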

Cache Behavior

The disk block cache is the main cache for reads from compressed disk segments. It stores blocks after decompression, and the maintainer cleans it up.

It is most effective when the working set is smaller than available memory or when reads repeatedly touch nearby key ranges.

Random reads over a huge keyspace rely more heavily on disk and sparse index efficiency.

See read-path caching.

Symptom Guide

Symptom | Likely pressure | First actions
Point reads slow down | too many segments, sparse index density, cold block cache | keep maintenance active; tune DefaultSparseArrayStepSize; review BlockCacheLifeTime
Range scans disturb hot reads | one-off scans contribute to cache pressure | keep iterator contributeToTheBlockCache disabled for one-off scans
Repeated hot-key reads hit disk too often | key/value circular caches too small or short-lived | increase KeyCacheSize, ValueCacheSize, or cache lifetimes
Latest-first reads are awkward | key layout or iterator direction is mismatched | use CreateReverseIterator or encode descending keys intentionally
Scans keep old files alive | long-lived iterators pin segments | dispose iterators promptly and keep scan scopes short
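
One of the actions above is to encode descending keys intentionally. A common way to do that is to store an inverted timestamp, so that a plain forward scan returns the newest record first. The sketch below is a hedged illustration in Python; the function names, the 13-digit width, and MAX_TIMESTAMP_MS are assumptions chosen for the example, not library settings.

```python
# Largest value that fits in the 13-digit field used below (an assumption).
MAX_TIMESTAMP_MS = 10**13 - 1


def latest_first_key(user_id: str, timestamp_ms: int) -> str:
    # Larger timestamps map to smaller encoded values, so newer records
    # sort earlier in ascending key order.
    inverted = MAX_TIMESTAMP_MS - timestamp_ms
    return f"{user_id}:{inverted:013d}"


def decode_timestamp(key: str) -> int:
    # Reverse the inversion to recover the original timestamp.
    inverted = int(key.rsplit(":", 1)[1])
    return MAX_TIMESTAMP_MS - inverted


timestamps = [1700000000100, 1700000000500, 1700000000300]
encoded = sorted(latest_first_key("alice", t) for t in timestamps)

# Reading the sorted keys front to back yields the newest timestamp first.
newest_first = [decode_timestamp(k) for k in encoded]
```

The trade-off is that the stored key is no longer human-readable as a time, and every reader has to apply the same inversion. When the store offers a reverse iterator, using it avoids that cost.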

Segment Count

Too many segments can increase read amplification, because a lookup may have to consult several segments before it finds the key. Maintenance and merge behavior help keep the read path efficient.
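
To make the read amplification effect concrete, here is a toy model in Python. It is not the library's segment structure: plain dictionaries stand in for segments, and point_lookup and merge_segments are hypothetical helpers. It only illustrates that a lookup which walks segments from newest to oldest does more work as the segment count grows, and that merging reduces the number of probes.

```python
def point_lookup(segments, key):
    """Search segments from newest to oldest; return (value, probes)."""
    probes = 0
    for segment in segments:
        probes += 1
        if key in segment:
            return segment[key], probes
    return None, probes


def merge_segments(segments):
    """Collapse all segments into one, keeping the newest value per key."""
    merged = {}
    # Apply the oldest segment first so newer segments overwrite it.
    for segment in reversed(segments):
        merged.update(segment)
    return [merged]


segments = [
    {"k3": "new"},               # newest
    {"k2": "b", "k3": "old"},
    {"k1": "a"},                 # oldest
]

value_before, probes_before = point_lookup(segments, "k1")
merged = merge_segments(segments)
value_after, probes_after = point_lookup(merged, "k1")
```

In this model the lookup for k1 touches all three segments before the merge and a single segment afterwards, while the value returned stays the same.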

Disk segment sparse arrays and cache settings can also affect point lookup and seek performance.

For compressed disk reads, tune the block cache lifetime before increasing the circular key/value caches. The circular caches help repeated reads of the same record indexes; the block cache helps repeated reads of nearby compressed blocks.

Iterators

Use iterators for range scans instead of many independent point reads.

Dispose iterators promptly so they do not pin segments longer than needed.
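
The pinning behavior can be pictured with a small reference-counting model. The Python sketch below is an assumption-laden illustration, not the library's implementation: SegmentStore, open_iterator, and retire are hypothetical names. It shows a segment that was replaced by a merge staying alive until the last iterator that pinned it is closed.

```python
from contextlib import contextmanager


class SegmentStore:
    """Toy model: segments are deleted only when retired and unpinned."""

    def __init__(self):
        self.pin_counts = {}   # segment name -> number of open iterators
        self.retired = set()   # segments replaced by a merge
        self.deleted = []      # segments whose files have been removed

    def add_segment(self, name):
        self.pin_counts[name] = 0

    @contextmanager
    def open_iterator(self):
        # An iterator pins every live segment it may read from.
        pinned = [n for n in self.pin_counts if n not in self.retired]
        for name in pinned:
            self.pin_counts[name] += 1
        try:
            yield pinned
        finally:
            # Closing the iterator releases the pins.
            for name in pinned:
                self.pin_counts[name] -= 1
            self._collect()

    def retire(self, name):
        self.retired.add(name)
        self._collect()

    def _collect(self):
        # Delete retired segments that no iterator is pinning.
        for name in sorted(self.retired):
            if self.pin_counts[name] == 0:
                self.retired.discard(name)
                del self.pin_counts[name]
                self.deleted.append(name)


store = SegmentStore()
store.add_segment("seg-1")

with store.open_iterator():
    store.retire("seg-1")  # a merge replaces seg-1 while the scan runs
    deleted_during_scan = list(store.deleted)

deleted_after_scan = list(store.deleted)
```

Scoping the iterator with a with block, or the equivalent scoped-disposal construct in your language, keeps the pin window as short as the scan itself.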

Compression

Compression can help read-heavy workloads if IO is the bottleneck and data compresses well. It can hurt if CPU is the bottleneck.

Benchmark with your real data shape.
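
A quick way to get a first impression is to measure the compression ratio and decompression time on a sample of your own values. The Python sketch below uses the standard library's zlib purely as a stand-in codec; the codec and level your store uses may differ, and the sample payloads here are made up. It contrasts highly repetitive data with random bytes, which do not compress.

```python
import os
import time
import zlib


def measure(data: bytes, level: int = 6) -> dict:
    """Return the compression ratio and decompression time for one payload."""
    compressed = zlib.compress(data, level)
    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    elapsed = time.perf_counter() - start
    if restored != data:
        raise ValueError("round trip changed the data")
    return {
        "ratio": len(data) / len(compressed),
        "decompress_seconds": elapsed,
    }


# Made-up payloads: replace these with a sample of your real values.
repetitive = b"tenant-42:order:status=shipped;" * 4096
random_like = os.urandom(len(repetitive))

repetitive_result = measure(repetitive)
random_result = measure(random_like)
```

A high ratio means fewer bytes read from disk per record, which helps when IO is the bottleneck. A ratio near 1 means the decompression time is spent for no IO saving. Timings from a single run are noisy, so repeat the measurement before drawing conclusions.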