K8s by Example: Distributed Tracing
Distributed tracing follows a request as it flows through microservices: each service adds spans to a trace, recording timing and parent/child relationships. Tools such as Jaeger and Zipkin, both inspired by Google's Dapper paper, collect and visualize these traces. Use tracing to debug latency, understand service dependencies, and find bottlenecks.
terminal

Quick start: deploy Jaeger for trace collection and visualization. The all-in-one image is suitable for development; for production, use the Jaeger Operator or Helm chart with persistent storage.
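A minimal development-only quick start might look like the following; the namespace name and image tag are illustrative assumptions, not fixed by this guide:

```shell
# Development-only quick start; namespace and image tag are illustrative.
kubectl create namespace observability
kubectl create deployment jaeger \
  --image=jaegertracing/all-in-one:1.57 \
  --namespace=observability
kubectl get pods -n observability   # wait for the pod to become Ready
```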
otel-sidecar.yaml

The OpenTelemetry Collector sidecar receives traces from the application and exports them to backends (Jaeger, Zipkin, Tempo). The app sends traces to localhost; the sidecar handles batching, retries, and export.
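A sketch of the sidecar pattern is below. The app image, pod name, and ConfigMap name are placeholders; the `OTEL_EXPORTER_OTLP_ENDPOINT` variable is the standard OpenTelemetry SDK setting for the export target:

```yaml
# Pod with an OpenTelemetry Collector sidecar (names are illustrative).
# The app exports OTLP traces to localhost; the sidecar forwards them.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-otel-sidecar
spec:
  containers:
    - name: app
      image: example.com/my-app:1.0        # placeholder application image
      env:
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://localhost:4317"   # sidecar shares the pod network
    - name: otel-collector
      image: otel/opentelemetry-collector-contrib:0.98.0  # assumed tag
      args: ["--config=/etc/otel/config.yaml"]
      ports:
        - containerPort: 4317   # OTLP gRPC
        - containerPort: 4318   # OTLP HTTP
      volumeMounts:
        - name: otel-config
          mountPath: /etc/otel
  volumes:
    - name: otel-config
      configMap:
        name: otel-collector-config   # defined in the next section
```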
otel-collector-config.yaml

The Collector config defines receivers (how traces come in), processors (batching, filtering), and exporters (where traces go). This config receives OTLP traces and exports them to Jaeger.
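One possible minimal pipeline, assuming a Jaeger Service named `jaeger-collector` in an `observability` namespace. Note that current Collector releases export to Jaeger over OTLP (Jaeger accepts OTLP natively since v1.35); the dedicated `jaeger` exporter has been removed from newer versions:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
        timeout: 5s          # flush batches at least every 5 seconds
    exporters:
      otlp/jaeger:
        endpoint: jaeger-collector.observability.svc:4317  # assumed Service
        tls:
          insecure: true     # plaintext inside the cluster; tighten for prod
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp/jaeger]
```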
trace-propagation.yaml

Trace context propagation passes trace IDs between services via HTTP headers. The W3C Trace Context standard uses the traceparent and tracestate headers; the older B3 (Zipkin) and uber-trace-id (Jaeger) header formats are still common.
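Propagation is configured in each service's OpenTelemetry SDK, and W3C Trace Context is the default in most SDKs. The standard `OTEL_PROPAGATORS` environment variable makes the choice explicit; the pod and image below are illustrative:

```yaml
# Illustrative pod; OTEL_PROPAGATORS and OTEL_SERVICE_NAME are standard
# OpenTelemetry SDK environment variables.
apiVersion: v1
kind: Pod
metadata:
  name: traced-app
spec:
  containers:
    - name: app
      image: example.com/my-app:1.0   # placeholder image
      env:
        - name: OTEL_PROPAGATORS
          # W3C traceparent/tracestate headers plus W3C baggage
          value: "tracecontext,baggage"
        - name: OTEL_SERVICE_NAME
          value: "checkout"
```

On the wire, the traceparent header carries `version-traceid-spanid-flags`, e.g. `traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01`.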
jaeger-deployment.yaml

Jaeger all-in-one deployment for development: it bundles the collector, query UI, and in-memory storage in a single process. For production, deploy the components separately with persistent storage (Elasticsearch or Cassandra).
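A development sketch of the all-in-one Deployment plus a Service, under the same assumed namespace and image tag as above. `COLLECTOR_OTLP_ENABLED` turns on Jaeger's native OTLP ingest (the default in recent releases):

```yaml
# Development only: single replica, in-memory storage, no persistence.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger
  namespace: observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jaeger
  template:
    metadata:
      labels:
        app: jaeger
    spec:
      containers:
        - name: jaeger
          image: jaegertracing/all-in-one:1.57   # assumed tag
          env:
            - name: COLLECTOR_OTLP_ENABLED
              value: "true"
          ports:
            - containerPort: 16686  # query UI
            - containerPort: 4317   # OTLP gRPC
            - containerPort: 4318   # OTLP HTTP
---
apiVersion: v1
kind: Service
metadata:
  name: jaeger-collector
  namespace: observability
spec:
  selector:
    app: jaeger
  ports:
    - name: otlp-grpc
      port: 4317
    - name: ui
      port: 16686
```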
sampling-config.yaml

Sampling reduces trace volume in high-traffic systems. Head-based sampling decides at the start of a trace; tail-based sampling decides after seeing the complete trace (for example, keep every trace that contains an error). Configure a fixed ratio (0.1 = 10%) or use adaptive sampling that adjusts to traffic.
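Both styles can be expressed as Collector processors (the `tail_sampling` processor ships in the contrib distribution). The fragment below is a sketch; the 10% rate and policy name are illustrative:

```yaml
# Collector config fragment: head-based vs tail-based sampling.
processors:
  # Head-based: keep ~10% of traces, chosen by trace ID.
  probabilistic_sampler:
    sampling_percentage: 10        # 0.1 ratio = 10% of traces
  # Tail-based: buffer spans, then keep traces that contain an error.
  tail_sampling:
    decision_wait: 10s             # how long to wait for a trace to complete
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [probabilistic_sampler, batch]   # pick one sampling style
      exporters: [otlp/jaeger]
```

Tail-based sampling must see every span of a trace on one Collector instance, so it is usually run in a central gateway tier rather than in sidecars.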
terminal

Access the Jaeger UI to search traces by service, operation, tags, or duration. View trace timelines showing span relationships, identify slow services, and analyze error paths.
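With the development deployment above (assumed names), the UI can be reached via a port-forward:

```shell
# Forward the Jaeger query UI to localhost (deployment name is assumed).
kubectl port-forward -n observability deploy/jaeger 16686:16686
# Then open http://localhost:16686, pick a service, and filter by
# operation, tags (e.g. error=true), or min/max duration.
```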