Sampling¶
Sampling controls which traces are captured and exported. In production systems with high traffic, sampling every trace can be prohibitively expensive. Prela provides multiple sampling strategies to balance observability with performance.
Overview¶
Sampling decisions are made at the trace level (not individual spans). Once a trace is sampled, all spans in that trace are captured. This ensures complete trace data for sampled requests.
graph TD
A[Incoming Request] --> B{Should Sample?}
B -->|Yes| C[Create Trace]
B -->|No| D[Skip Tracing]
C --> E[Capture All Spans]
C --> F[Export Trace]
D --> G[No Overhead]
Sampling Strategies¶
Prela provides four built-in samplers:
| Sampler | Use Case | Overhead |
|---|---|---|
AlwaysOnSampler |
Development, debugging | Highest |
AlwaysOffSampler |
Disable tracing | None |
ProbabilitySampler |
Production (probabilistic) | Medium |
RateLimitingSampler |
Production (rate limiting) | Low |
AlwaysOnSampler¶
Samples every trace. Useful for development and low-traffic environments.
from prela import init
from prela.core.sampler import AlwaysOnSampler
tracer = init(
service_name="my-app",
sampler=AlwaysOnSampler()
)
# Or use sample_rate=1.0
tracer = init(service_name="my-app", sample_rate=1.0)
Characteristics: - ✅ Complete observability - ✅ Simple and predictable - ❌ Not suitable for high-traffic production - ❌ Can generate massive trace volumes
AlwaysOffSampler¶
Never samples traces. Useful for temporarily disabling tracing.
from prela import init
from prela.core.sampler import AlwaysOffSampler
tracer = init(
service_name="my-app",
sampler=AlwaysOffSampler()
)
# Or use sample_rate=0.0
tracer = init(service_name="my-app", sample_rate=0.0)
Characteristics: - ✅ Zero overhead - ✅ Instant disable without code changes - ❌ No observability
Environment Variable:
ProbabilitySampler¶
Samples traces with a given probability (0.0 to 1.0). Uses deterministic hashing to ensure consistent sampling decisions for the same trace ID.
from prela import init
from prela.core.sampler import ProbabilitySampler
# Sample 10% of traces
tracer = init(
service_name="my-app",
sampler=ProbabilitySampler(rate=0.1)
)
# Or use sample_rate parameter
tracer = init(service_name="my-app", sample_rate=0.1)
How It Works¶
- Compute MD5 hash of trace ID
- Convert hash to float (0.0 to 1.0)
- Sample if hash value < sampling rate
Key Property: The same trace ID always produces the same sampling decision across all services in a distributed system.
from prela.core.sampler import ProbabilitySampler
sampler = ProbabilitySampler(rate=0.1)
trace_id = "550e8400-e29b-41d4-a716-446655440000"
result1 = sampler.should_sample(trace_id) # True or False
result2 = sampler.should_sample(trace_id) # Same as result1
# Deterministic: same trace_id = same decision
assert result1 == result2
Configuration¶
# Sample 1% of traces (recommended for high-traffic production)
ProbabilitySampler(rate=0.01)
# Sample 25% of traces (moderate traffic)
ProbabilitySampler(rate=0.25)
# Sample 50% of traces (development/staging)
ProbabilitySampler(rate=0.5)
Characteristics: - ✅ Predictable sampling rate - ✅ Deterministic (consistent across services) - ✅ Low computational overhead - ❌ Can miss rare events - ❌ Not aware of system load
RateLimitingSampler¶
Limits traces to a maximum rate (traces per second). Uses a token bucket algorithm.
from prela import init
from prela.core.sampler import RateLimitingSampler
# Allow max 10 traces per second
tracer = init(
service_name="my-app",
sampler=RateLimitingSampler(traces_per_second=10.0)
)
How It Works¶
Token bucket algorithm:
1. Bucket holds tokens (max = traces_per_second)
2. Tokens regenerate at traces_per_second rate
3. Each trace consumes 1 token
4. If bucket empty, trace is rejected
graph LR
A[Token Bucket] -->|Has tokens?| B{Sample}
B -->|Yes| C[Consume token]
B -->|No| D[Reject]
E[Token Regeneration] -->|traces_per_second| A
Configuration¶
# Low rate limit (high-traffic production)
RateLimitingSampler(traces_per_second=1.0)
# Medium rate limit
RateLimitingSampler(traces_per_second=10.0)
# High rate limit (staging)
RateLimitingSampler(traces_per_second=100.0)
Characteristics: - ✅ Guarantees max trace volume - ✅ Load-aware (rejects when busy) - ✅ Thread-safe - ❌ May reject important traces during bursts - ❌ Not deterministic (depends on timing)
Choosing a Sampler¶
Development¶
Use AlwaysOnSampler for complete visibility:
Staging/QA¶
Use ProbabilitySampler with moderate rate:
Production (Low Traffic)¶
Use ProbabilitySampler with 10-25% sampling:
Production (High Traffic)¶
Use RateLimitingSampler to cap trace volume:
from prela.core.sampler import RateLimitingSampler
tracer = init(
service_name="prod-app",
sampler=RateLimitingSampler(traces_per_second=10.0)
)
Production (Very High Traffic)¶
Combine low probability with rate limiting:
from prela.core.sampler import ProbabilitySampler
# Sample 1% of traces (typical for large-scale systems)
tracer = init(service_name="prod-app", sample_rate=0.01)
Environment-Based Configuration¶
Use environment variables for runtime configuration:
# Set sampling rate
export PRELA_SAMPLE_RATE=0.1
# Disable tracing
export PRELA_SAMPLE_RATE=0.0
# Enable full tracing
export PRELA_SAMPLE_RATE=1.0
Python code:
import os
from prela import init
# Reads from PRELA_SAMPLE_RATE environment variable
tracer = init(service_name="my-app")
# Override environment variable
tracer = init(service_name="my-app", sample_rate=0.5)
Custom Samplers¶
Implement custom sampling logic by extending BaseSampler:
from prela.core.sampler import BaseSampler
class UserBasedSampler(BaseSampler):
"""Sample based on user ID."""
def __init__(self, vip_users: set[str]):
self.vip_users = vip_users
def should_sample(self, trace_id: str) -> bool:
# Always sample VIP users
from prela.core.context import get_trace_context
ctx = get_trace_context()
user_id = ctx.baggage.get("user_id")
if user_id in self.vip_users:
return True
# Sample 1% of other users
import hashlib
hash_value = int.from_bytes(
hashlib.md5(trace_id.encode()).digest()[:8],
byteorder="big"
)
return (hash_value / (2**64 - 1)) < 0.01
# Use custom sampler
tracer = init(
service_name="my-app",
sampler=UserBasedSampler(vip_users={"user123", "user456"})
)
Head-Based vs Tail-Based Sampling¶
Head-Based Sampling (Prela's Approach)¶
Sampling decision made before trace execution:
# Decision made here (before span creation)
with get_tracer().span("operation") as span:
# If sampled, span is captured
# If not sampled, span is not created
do_work()
Advantages: - ✅ Low overhead (unsampled traces skipped) - ✅ Simple implementation - ✅ Predictable resource usage
Disadvantages: - ❌ May miss important traces (errors, slow requests) - ❌ Decision made without runtime context
Tail-Based Sampling (Not Currently Supported)¶
Sampling decision made after trace execution based on properties:
Advantages: - ✅ Can sample errors and slow requests - ✅ Context-aware decisions
Disadvantages: - ❌ Higher overhead (all traces captured initially) - ❌ Requires buffering - ❌ Complex implementation
Sampling Best Practices¶
1. Start with High Sampling, Reduce Gradually¶
# Development
tracer = init(service_name="app", sample_rate=1.0)
# Staging
tracer = init(service_name="app", sample_rate=0.5)
# Production (initial)
tracer = init(service_name="app", sample_rate=0.25)
# Production (stable)
tracer = init(service_name="app", sample_rate=0.05)
2. Monitor Sampling Rate Effectiveness¶
# Log sampling decisions for analysis
import logging
class LoggingSampler(BaseSampler):
def __init__(self, base_sampler: BaseSampler):
self.base_sampler = base_sampler
self.logger = logging.getLogger(__name__)
def should_sample(self, trace_id: str) -> bool:
result = self.base_sampler.should_sample(trace_id)
self.logger.debug(f"Trace {trace_id}: sampled={result}")
return result
3. Use Different Rates for Different Services¶
# Critical user-facing service: higher sampling
user_api_tracer = init(service_name="user-api", sample_rate=0.5)
# Background worker: lower sampling
worker_tracer = init(service_name="worker", sample_rate=0.01)
4. Consider Business Impact¶
class BusinessAwareSampler(BaseSampler):
def should_sample(self, trace_id: str) -> bool:
ctx = get_trace_context()
# Always sample premium customers
if ctx.baggage.get("customer_tier") == "premium":
return True
# Always sample payment operations
if ctx.baggage.get("operation_type") == "payment":
return True
# Sample 1% of others
return ProbabilitySampler(0.01).should_sample(trace_id)
5. Document Your Sampling Strategy¶
# Production sampling configuration
# - 1% baseline sampling for cost control
# - Premium customers: 100% sampling for SLA monitoring
# - Payment flows: 100% sampling for compliance
# - Expected volume: ~100 traces/sec at peak (from 10K req/sec)
tracer = init(
service_name="production-api",
sampler=BusinessAwareSampler()
)
Performance Impact¶
Sampling Overhead¶
| Sampler | CPU Overhead | Memory Overhead |
|---|---|---|
| AlwaysOnSampler | Negligible | Negligible |
| AlwaysOffSampler | None | None |
| ProbabilitySampler | ~10μs per decision | Negligible |
| RateLimitingSampler | ~5μs per decision | Negligible |
Unsampled Traces¶
When a trace is not sampled, Prela: - ❌ Does not create span objects - ❌ Does not capture attributes/events - ❌ Does not export data - ✅ Overhead: ~10μs for sampling decision
# With 0.01 sampling rate:
# - 99% of requests: ~10μs overhead (sampling decision only)
# - 1% of requests: Full tracing overhead
Troubleshooting¶
Problem: Too few traces¶
Symptom: Not seeing enough data in production.
Solution: Increase sampling rate temporarily:
Problem: Too many traces¶
Symptom: Overwhelming trace volume, high storage costs.
Solution: Decrease sampling rate or use rate limiting:
from prela.core.sampler import RateLimitingSampler
# Cap at 10 traces/second
tracer = init(sampler=RateLimitingSampler(traces_per_second=10.0))
Problem: Missing important traces¶
Symptom: Errors or slow requests not captured.
Solution: Implement custom sampler to always sample errors:
class ErrorAwareSampler(BaseSampler):
def should_sample(self, trace_id: str) -> bool:
# Start with low baseline
if ProbabilitySampler(0.01).should_sample(trace_id):
return True
# Check if error occurred (set via baggage)
ctx = get_trace_context()
if ctx.baggage.get("has_error"):
return True
return False
Next Steps¶
- Learn about Exporters for sending traces to backends
- Explore Context Propagation for distributed tracing
- See Production Setup for deployment patterns