Tracing Span Design: How Many Is Too Many
Balancing trace granularity against overhead, storage, and the ability to actually read trace waterfalls.
- File type: PDF
- Pages: 31
- File size: 1.5 MB
The intuition is that more spans give more visibility. But every span has a cost: CPU to create it, memory to hold its attributes, network bandwidth to export it, storage in your backend, and cognitive load for whoever reads the trace. One team instrumented every function call, database query, cache lookup, and external request, generating 200+ spans per request. The trace waterfall became a solid block of color with no visible hierarchy, debugging meant scrolling through hundreds of spans, and storage costs tripled in a month. After they refactored to 15-20 spans per request, traces became readable, critical paths were obvious, and storage costs dropped 90%.
Granularity without readability is noise, not observability.
This complete guide teaches you:
- Span fundamentals: identity, naming, timing, status, attributes, and events
- Context propagation: automatic parent-child relationships and trace continuity
- Span hierarchy: designing readable waterfalls with meaningful nesting
- Overhead analysis: CPU, memory, network, and storage costs by span count
- Sampling strategies: probabilistic and dynamic sampling to control costs
- Semantic conventions: following OpenTelemetry standards for consistency
- Common instrumentation patterns: HTTP, database, cache, and messaging
- Debugging traces: finding slow operations in readable waterfalls
- Tail-based sampling: collecting traces based on error or duration
Download Your OpenTelemetry Span Design Guide now to instrument your system with the right granularity without creating noise.
Fill out the form below to receive your PDF instantly.