Tracing Span Design: How Many Is Too Many

[Figure: two city maps compared — one over-detailed with every feature marked, the other showing only major roads and landmarks for easy navigation]

A team I worked with instrumented a new service with spans for every function call, database query, cache lookup, and external request. Thorough, right? A typical request generated 200+ spans. The trace backend showed a waterfall of solid color—no gaps, no obvious structure. Finding the slow operation meant scrolling through hundreds of spans, mentally filtering out the noise. Storage costs tripled in a month.

They refactored to instrument only service boundaries, significant I/O operations, and error paths. Span count dropped to 15-20 per request. Traces became readable: the critical path was obvious at a glance. Storage costs dropped 90%, and debugging time went from minutes to seconds.

The lesson: granularity without readability is noise, not observability. More spans don’t mean better visibility—they often mean worse. The goal is enough spans to debug problems, not so many that you create new ones.

What to Instrument

The question “should this operation have a span?” comes up constantly. Here’s how I think about it.

Always instrument service boundaries. Incoming HTTP/gRPC requests, outgoing HTTP/gRPC calls, message queue publish/consume—these are the spans that stitch your distributed trace together. Without them, your trace stops at service boundaries and you lose visibility into cross-service latency.

Always instrument I/O operations. Database queries, cache reads/writes, file system operations. I/O is where latency lives. A trace that doesn’t show database time is a trace that can’t explain why a request was slow.

Always instrument external dependencies. Third-party APIs, payment processors, cloud service SDKs. External dependencies are a leading cause of incidents. When Stripe is slow or an S3 call times out, you want that visible in your traces.

Instrument significant business operations. Order processing, payment validation, user authentication—operations that matter to your business and that you might need to debug. The keyword is “significant.” Not every function, just the ones you’d want to see in a trace when something goes wrong.

Some operations need judgment. For loops with external calls, wrap the loop, not each iteration—a batch that processes 100 items should create one span for the batch, not 100 spans. For retry logic, one span per attempt can be useful for debugging retry storms, but link them together so you can see the full retry sequence. And on the other end of the spectrum, there are operations you should actively avoid instrumenting.
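The batch pattern can be sketched as follows. A minimal stand-in span type replaces the real OpenTelemetry API so the example is self-contained and runnable; processBatch and the batch.* attribute names are illustrative choices, not standard conventions:

```typescript
// Minimal stand-in for an OpenTelemetry-style span, used here only
// to keep the sketch self-contained.
interface SketchSpan {
  name: string;
  attributes: Record<string, string | number | boolean>;
  events: string[];
  setAttribute(key: string, value: string | number | boolean): void;
  addEvent(name: string): void;
  end(): void;
}

function makeSpan(name: string): SketchSpan {
  return {
    name,
    attributes: {},
    events: [],
    setAttribute(key, value) { this.attributes[key] = value; },
    addEvent(eventName) { this.events.push(eventName); },
    end() {},
  };
}

const createdSpans: SketchSpan[] = [];

// One span wraps the whole batch; per-item failures become events,
// not child spans, so a 100-item batch still produces a single span.
function processBatch(items: number[], work: (item: number) => boolean): SketchSpan {
  const span = makeSpan('batch.process');
  createdSpans.push(span);
  span.setAttribute('batch.size', items.length);
  let failures = 0;
  for (const item of items) {
    if (!work(item)) {
      failures++;
      span.addEvent(`item.failed:${item}`);
    }
  }
  span.setAttribute('batch.failures', failures);
  span.end();
  return span;
}
```

With the real API, the same shape applies: startActiveSpan once around the loop, addEvent or setAttribute per item as needed.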

Avoid instrumenting pure computation. JSON parsing, field validation, data transformation—these are CPU-bound operations that rarely need their own spans. If you need timing, add it as an attribute on the parent span.

Avoid instrumenting every function call. This is the trap that creates unreadable traces. Instrument boundaries and I/O, not internal implementation details.

| Operation | Span? | Alternative | Why |
|---|---|---|---|
| HTTP request handler | Yes | - | Service boundary; stitches distributed trace together |
| Database query | Yes | - | I/O operation; where latency typically lives |
| JSON parsing | No | Attribute | Pure computation; no I/O involved |
| Each loop iteration | No | Wrap loop | Creates overhead; one span for the batch is sufficient |
| External API call | Yes | - | Dependency visibility; external services are a frequent incident source |
| Field validation | No | Attribute | Too granular; timing rarely needed |
| Error handling | No | Event | Within parent span; events capture error context |

Span decision matrix.

Span vs Event vs Attribute

Once you’ve decided something deserves visibility in a trace, you have three options: span, event, or attribute. Each has different costs and use cases.

Spans represent operations with duration. They have start and end times, can have child spans, and appear as bars in your waterfall. Use them for I/O operations and significant boundaries. Cost: highest (memory allocation, context propagation, serialization).

Events are timestamped points within a span’s lifetime. They don’t have duration—they mark moments. Use them for milestones: “validation started,” “cache miss,” “retry attempted.” Cost: moderate (stored with parent span, but no separate context).

Attributes are key-value metadata attached to a span. They describe the operation: IDs, counts, flags, outcomes. Use them for context that helps you understand the span. Cost: lowest (just map entries).

Here’s the pattern in practice:

import { trace, SpanKind, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service');

async function processOrder(order: Order) {
  return tracer.startActiveSpan('processOrder', async (span) => {
    span.setAttribute('order.id', order.id);
    span.setAttribute('order.item_count', order.items.length);

    span.addEvent('validation.started');
    const validationResult = validateOrder(order);
    span.addEvent('validation.completed', { 'validation.passed': validationResult.valid });

    if (!validationResult.valid) {
      span.setAttribute('validation.error', validationResult.error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      span.end();
      return;
    }

    await tracer.startActiveSpan('db.insertOrder', async (dbSpan) => {
      dbSpan.setAttribute('db.system', 'postgresql');
      await db.insert(order);
      dbSpan.end();
    });

    span.addEvent('order.persisted');
    span.setStatus({ code: SpanStatusCode.OK });
    span.end();
  });
}
Using spans for I/O, events for milestones, attributes for metadata.

Notice that validateOrder() doesn’t get a span—it’s pure computation. The validation timing is captured as events if you need it. The database insert does get a span because it’s I/O where latency matters.

Rule of thumb: if it involves I/O (network, disk, database), make it a span. If it’s a milestone within an operation, make it an event. If it’s metadata about the operation, make it an attribute. Spans are expensive; events and attributes are cheap.
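That rule of thumb is simple enough to express as a tiny decision helper. This is purely illustrative; the function and flag names are made up for the sketch:

```typescript
// Which telemetry signal should an operation get?
type Signal = 'span' | 'event' | 'attribute';

function chooseSignal(opts: { isIO: boolean; isMilestone: boolean }): Signal {
  if (opts.isIO) return 'span';          // network, disk, database: has duration worth seeing
  if (opts.isMilestone) return 'event';  // a moment within an operation, no duration
  return 'attribute';                    // metadata describing the operation
}
```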

Trace Readability

A trace is only useful if you can read it. I’ve seen traces that technically contain all the information needed to debug a problem, but the information is buried in noise.

Most readability problems fall into a few common anti-patterns:

Wall of spans. The waterfall is solid color—no gaps, no white space. You can’t see the structure because there’s a span for everything. Every function call, every loop iteration, every trivial operation. The fix is reducing span count and using events for milestones instead of child spans.

Flat hierarchy. All spans at the same level, no nesting. You can’t tell which operations are contained within others, or whether operations ran sequentially or in parallel. The fix is ensuring child spans inherit from their parents using startActiveSpan.

Missing gaps. Spans account for 100% of request time—no uninstrumented periods visible. This sounds good but actually hides information. The gaps in a trace show where time went to uninstrumented code. If there are no gaps, you can’t distinguish “the database was slow” from “the instrumentation overhead was high.”
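To make the value of gaps concrete: uninstrumented time is the parent span's duration minus the time covered by its children, with overlapping children (parallel work) merged so they aren't double-counted. A sketch with hypothetical timing fields:

```typescript
interface TimedSpan {
  name: string;
  startMs: number;
  endMs: number;
}

// Time inside the parent span not covered by any child span.
// Children are sorted by start and merged via a moving cursor,
// so overlapping (parallel) children are counted once.
function uninstrumentedMs(parent: TimedSpan, children: TimedSpan[]): number {
  const sorted = [...children].sort((a, b) => a.startMs - b.startMs);
  let covered = 0;
  let cursor = parent.startMs;
  for (const child of sorted) {
    const start = Math.max(child.startMs, cursor);
    const end = Math.min(child.endMs, parent.endMs);
    if (end > start) {
      covered += end - start;
      cursor = end;
    }
  }
  return (parent.endMs - parent.startMs) - covered;
}
```

A large result points at uninstrumented code worth investigating; a result near zero on every trace suggests over-instrumentation.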

Cryptic names. Span names like “span,” “operation,” “handler,” or “process” that don’t explain what’s happening. Auto-instrumentation often produces these. The fix is customizing span names to follow the pattern “operation resource” (HTTP GET /api/orders, db.query orders).

Attribute explosion. Hundreds of attributes per span because someone dumped all available context. The trace viewer loads slowly, important attributes are buried in noise, and storage costs balloon. The fix is selecting relevant attributes deliberately.


A good trace has 10-30 spans per request, 3-5 levels of nesting, clear names that explain each operation, and visible gaps that show where uninstrumented time went. You should be able to identify the critical path at a glance.

Naming is particularly important for quick scanning. Span names should answer “what operation on what resource?” without requiring you to read the code. The OpenTelemetry semantic conventions provide a good starting point, and the table below shows common patterns and pitfalls.

| Context | Good Name | Bad Name | Problem with Bad Name |
|---|---|---|---|
| HTTP handler | HTTP GET /api/orders/{id} | handler | No indication of method or resource |
| Database | db.query orders | SELECT * FROM... | SQL syntax noise, potential sensitive data |
| Cache | cache.get | redis | Technology, not operation—what did you do? |
| Business logic | order.validate | validateOrder | Function name leaks implementation detail |
| External API | HTTP POST | API call | No method, no way to distinguish calls |

Span naming patterns.
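Patterns like these are easy to centralize in small helpers so every service produces consistent names. A sketch, with illustrative helper names:

```typescript
// "operation resource" naming helpers. The exact conventions come from
// the OpenTelemetry semantic conventions; these helpers are illustrative.
function httpSpanName(method: string, route: string): string {
  // Use the route template, never the concrete URL, to keep cardinality low.
  return `HTTP ${method.toUpperCase()} ${route}`;
}

function dbSpanName(operation: string, table: string): string {
  // Name the operation and target, not the raw SQL text.
  return `db.${operation} ${table}`;
}
```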

Conclusion

Span design is an engineering tradeoff: visibility versus overhead, granularity versus readability, detail versus cost. The goal isn’t maximum spans—it’s enough spans to debug problems efficiently.


Instrument service boundaries, I/O operations, and significant business logic. Use events for milestones within spans. Use attributes for metadata that helps filtering and debugging. The best traces answer three questions: what happened, where time was spent, and what failed.

Audit your current traces: pick a typical request, count the spans, and ask whether you could identify the slow operation in under 10 seconds. If not, you’ve found your first refactoring target.
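That audit can even be scripted. The sketch below checks a trace against the rough guidelines above (10-30 spans, 3-5 levels of nesting); the types and thresholds are illustrative rules of thumb, not a standard API:

```typescript
interface AuditSpan {
  id: string;
  parentId?: string;
}

interface AuditResult {
  spanCount: number;
  maxDepth: number;
  withinGuidelines: boolean;
}

// Walks each span up its parent chain to find the deepest nesting level,
// then compares count and depth against the article's rough guidelines.
function auditTrace(spans: AuditSpan[]): AuditResult {
  if (spans.length === 0) {
    return { spanCount: 0, maxDepth: 0, withinGuidelines: false };
  }
  const depth = (span: AuditSpan): number => {
    let d = 1;
    let current = span;
    while (current.parentId) {
      const parent = spans.find((s) => s.id === current.parentId);
      if (!parent) break; // orphaned span; stop climbing
      d++;
      current = parent;
    }
    return d;
  };
  const maxDepth = Math.max(...spans.map(depth));
  const spanCount = spans.length;
  return {
    spanCount,
    maxDepth,
    withinGuidelines: spanCount >= 10 && spanCount <= 30 && maxDepth >= 3 && maxDepth <= 5,
  };
}
```

Running something like this over a sample of production traces turns "are our traces readable?" from a gut feeling into a number you can track.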

Start with minimal instrumentation—auto-instrumentation plus key business operations—then add spans only when you can’t debug a specific problem. You can always add granularity; removing it requires code changes. Let debugging needs drive instrumentation, not the quest for “complete” visibility.
