In modern Kubernetes environments, the hardest part of observability is often not collecting telemetry but rolling it out consistently and cost-effectively across all services.
Adding an SDK to every service is powerful, but in polyglot, multi-team, and multi-service environments, it introduces development, maintenance, and release overhead. This is where Grafana Beyla comes in with its eBPF-based zero-code instrumentation approach.
In this article, we will first look at what eBPF is, then how Grafana Beyla fits into application observability, and finally how it compares with SDK-based instrumentation.
What Is eBPF?
eBPF, short for extended Berkeley Packet Filter, is a technology that allows small programs to run safely and in a controlled way inside the Linux kernel.
Normally, the operating system kernel manages low-level events such as network packets, file access, process creation, system calls, and socket operations. Running code at this level is powerful, but also risky. Poorly written kernel code can make the entire system unstable.
eBPF solves this with a safer model. It allows small programs to be loaded into the kernel, but these programs must pass the kernel's verifier, which applies strict safety checks, before they are executed. This makes it possible to observe, filter, or in some cases control kernel-level events in a safer and more flexible way than writing traditional kernel modules.
eBPF programs attach to specific hook points in the kernel. For example, when a network packet arrives, a system call is executed, a process starts, or a socket connection is opened, an eBPF program can be triggered to read relevant information or take a specific action.
eBPF is commonly used in four main areas:
| Area | What eBPF can do |
|---|---|
| Observability | Traces, metrics, latency, and network visibility |
| Security | Detect suspicious syscalls, file access, and process behavior |
| Networking | Packet filtering, load balancing, and traffic routing |
| Performance | Analyze CPU, I/O, syscalls, and kernel functions |
For example, with eBPF you can:
- Observe which system calls a process is making.
- See which container is connecting to which IP address.
- Filter network packets with high performance.
- Measure application latency from the kernel level.
- Detect suspicious security behavior.
- Map service-to-service traffic in Kubernetes environments.
The main reason eBPF is powerful for monitoring and observability is this:
It provides kernel-level visibility without changing application code or recompiling the system.
This is why eBPF has become popular in the Kubernetes ecosystem. It can observe traffic at the pod, container, service, and node level without touching application code. Many modern tools such as Cilium, Pixie, Parca, Falco, Tetragon, and Grafana Beyla use eBPF.
What Is Grafana Beyla?
Grafana Beyla is an eBPF-based application auto-instrumentation tool that helps teams get started with application observability quickly and with minimal effort.
Using eBPF, Beyla automatically inspects application executables and the operating system networking layer. This allows it to collect trace spans and RED metrics for web transactions from HTTP, HTTPS, and gRPC services running on Linux.
RED metrics answer three basic questions:
- Rate: How much traffic is the service receiving?
- Errors: How many errors is it producing?
- Duration: How long do requests take?
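As a rough illustration of what these three numbers mean, the sketch below computes them from a window of request records. The `Request` shape, the choice to count only 5xx responses as errors, and the nearest-rank 95th percentile are assumptions made for this example. They are not a description of how Beyla computes or exports its metrics.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Request:
    """One observed request (hypothetical record shape for this sketch)."""
    status: int          # HTTP status code
    duration_ms: float   # time from request start to response end


def red_metrics(requests: list[Request], window_s: float) -> dict[str, float]:
    """Summarize a window of requests as Rate, Errors, and Duration."""
    if not requests:
        return {"rate_rps": 0.0, "error_ratio": 0.0, "p95_ms": 0.0}
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_ms for r in requests)
    # Nearest-rank p95: smallest sample with at least 95% of samples at or below it.
    rank = ceil(len(durations) * 95 / 100)
    return {
        "rate_rps": len(requests) / window_s,
        "error_ratio": errors / len(requests),
        "p95_ms": durations[rank - 1],
    }


# 20 requests observed over a 10-second window: 18 fast, one failed, one slow.
sample = [Request(200, 12.0)] * 18 + [Request(500, 40.0), Request(200, 250.0)]
print(red_metrics(sample, window_s=10.0))
```

In practice these values are usually derived from counters and histograms in the metrics backend rather than from raw request lists, but the definitions are the same.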
The most important advantage of Beyla is that it can generate this telemetry without requiring any change to the application code or configuration. Developers do not need to add an SDK, write manual instrumentation code, or reconfigure services.
This makes Beyla a strong zero-code observability layer, especially in Kubernetes environments.
The Operational Cost of Adding SDKs
The traditional approach to application observability usually starts with adding an SDK or instrumentation library to the application code.
For example, when using the OpenTelemetry SDK, each service needs the relevant language-specific dependency added to its codebase. Framework integrations need to be configured, exporter settings need to be defined, and in some cases developers need to write manual spans, attributes, or context propagation logic.
This approach can produce powerful and detailed telemetry. However, in multi-service environments, it also introduces significant operational and development cost.
Each programming language, framework, and runtime may require different SDKs, configuration patterns, and version management. In environments where Go, Java, Node.js, Python, and .NET services coexist, this process becomes even more complex.
SDK-based instrumentation is also tied to the application release lifecycle. A telemetry-related change often requires the application to be rebuilt, tested, and deployed. This turns observability improvements into work that depends on each application team’s development schedule, rather than something the platform team can manage independently.
In addition, every SDK added to an application becomes a new dependency. Its version compatibility, security vulnerabilities, performance impact, and behavior with the application framework must be managed.
A misconfigured or incompatible instrumentation library can directly affect the application’s memory usage, CPU consumption, latency, or error behavior.
Manual instrumentation is another cost factor. Developers need to decide which endpoints, operations, custom attributes, and error scenarios should be instrumented. This makes standardization harder and can lead to inconsistent telemetry across services.
One team may produce detailed spans, while another may only emit basic metrics. As a result, the overall observability experience becomes fragmented.
Beyla’s Operational Advantage
Beyla’s advantage is not only that it is technically zero-code. It also offers an observability model with a low operational cost.
With SDK-based instrumentation, each service requires separate work. Each language SDK, framework integration, service team release process, and application dependency lifecycle adds operational overhead.
In large Kubernetes environments with tens or hundreds of services, this becomes a serious coordination cost.
Beyla centralizes this cost. Instead of requiring application teams to change code one service at a time, a platform or observability team can deploy Beyla into the Kubernetes environment and make many services observable at once.
This is especially useful in environments where services are written in different languages.
Another operational advantage of Beyla is that it works independently of the application release cycle. Improving observability does not require every service to be rebuilt, tested, and redeployed.
Beyla’s configuration can be updated, a new version can be rolled out, or telemetry export settings can be changed centrally without touching application code.
This also speeds up onboarding. When a new service is deployed to the cluster, Beyla can automatically observe it based on the configured discovery rules.
This means teams do not need to repeatedly ask questions such as:
- Has the SDK been added?
- Is the exporter configured correctly?
- Is trace context propagation working?
- Are metric names standardized?
- Is the service using the correct resource attributes?
Since Beyla runs outside the application process, it also reduces operational risk. With SDK or agent-based instrumentation, a performance issue in the instrumentation layer can directly affect the application’s CPU, memory, or latency behavior.
Beyla observes applications externally through eBPF and does not add an extra dependency inside the application. Because it does not run inside the application process, it does not add to the application’s memory footprint, thread count, or lock contention, although the eBPF probes themselves add a small per-event cost on the instrumented code paths.
Of course, the Beyla agent itself consumes resources on the node. However, this resource usage is managed as part of the observability layer, not inside the application process.
This creates a simpler model from both a security and maintenance perspective.
Beyla’s Difference in Latency Measurement
One of Beyla’s strengths is that it measures latency from the kernel networking layer, not from inside the application.
SDK or agent-based instrumentation often measures only the service time. In other words, it measures how long the request handler takes to process the request.
However, a request may wait inside a framework queue or thread pool before it reaches the handler. For example, if all worker threads are busy, a new incoming request may have to wait before it can be processed. This waiting time is often not captured by library-level instrumentation.
This can produce misleading results, especially under heavy traffic.
From inside the application, request duration may look low. But from the client’s perspective, the experienced latency may be much higher. This difference becomes critical when SLOs are based on response time.
Because Beyla observes traffic externally from the kernel networking layer, its latency metrics are closer to what the client actually experiences. These measurements include not only request processing time, but also queueing delays that may occur when the application is overloaded.
This makes Beyla’s latency metrics especially useful in overload scenarios, where they can better reflect the actual user experience.
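The gap between handler time and client-observed latency can be shown with a toy model. The sketch below assumes a single worker with a FIFO queue and a fixed handling time. The function name and the numbers are invented for illustration, and the model ignores network transit time entirely.

```python
def simulate(arrivals_ms: list[float], service_ms: float) -> list[tuple[float, float]]:
    """Single worker, FIFO queue. Returns (handler_ms, client_ms) per request."""
    results = []
    worker_free_at = 0.0
    for arrival in arrivals_ms:
        start = max(arrival, worker_free_at)   # wait while the worker is busy
        finish = start + service_ms
        worker_free_at = finish
        handler_ms = finish - start            # what a timer inside the handler sees
        client_ms = finish - arrival           # queue wait plus handler time
        results.append((handler_ms, client_ms))
    return results


# Five requests arrive 10 ms apart, but each one takes 50 ms to handle.
for handler_ms, client_ms in simulate([0, 10, 20, 30, 40], service_ms=50):
    print(f"handler={handler_ms:.0f} ms  client-observed={client_ms:.0f} ms")
```

Every request reports the same handler time, while the latency seen from outside grows with each request that has to wait. A measurement taken at the socket level sits much closer to the second number than to the first.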
Does Beyla Solve Everything?
It is important to position Beyla correctly.
Beyla is especially strong for baseline observability. It is a good starting point for collecting rate, error, and duration metrics, building basic service relationships, and providing transaction-level visibility for supported protocols.
However, Beyla is not a complete replacement for SDK-based distributed tracing.
Beyla observes the application from the outside. It can see request-response behavior at the kernel and network level. But it cannot understand business logic, function calls, framework lifecycle events, or domain-specific context as well as instrumentation inside the application code.
For example, with an SDK you can create internal spans like this:
POST /checkout
├─ validate-cart
├─ calculate-discount
├─ reserve-inventory
├─ call-payment-provider
└─ create-order
Beyla, in most scenarios, sees the transaction or request level:
POST /checkout
This means Beyla can observe that a request reached the application and received a response. But it cannot automatically know that internal business steps such as validate-cart, reserve-inventory, or create-order happened inside the application.
Similarly, with an SDK you can add domain-specific information to spans:
customer.id
order.id
payment.provider
tenant.id
cart.item_count
subscription.plan
Beyla has no access to this business context, because it lives in application memory. What Beyla sees at the network level is more general: method, route, status code, latency, and protocol.
Therefore, SDK or manual instrumentation is still the right solution for questions such as:
- Which tenant is experiencing higher latency?
- Which payment provider has the highest error rate?
- Which order type is slower?
- Which business step triggered a retry?
- Which cache operation is slow?
- Which internal function is increasing request duration?
Beyla’s goal is not to extract all of these details from inside the application. Its strength is providing broad baseline visibility without touching application code.
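The difference between the two views can be modeled with a small data structure. This is a conceptual sketch only: the `Span` class, the `external_view` function, the attribute keys, and the sample values are all hypothetical, and none of them correspond to the OpenTelemetry SDK API or to Beyla's internal data model.

```python
from dataclasses import dataclass, field

# Assumption: keys an external observer could plausibly read off the wire.
NETWORK_VISIBLE = {"http.method", "http.route", "http.status_code"}


@dataclass
class Span:
    name: str
    attributes: dict[str, object] = field(default_factory=dict)
    children: list["Span"] = field(default_factory=list)


def external_view(span: Span) -> Span:
    """Keep only what is visible from outside the process: the request-level
    span with network-level attributes, without internal child spans."""
    visible = {k: v for k, v in span.attributes.items() if k in NETWORK_VISIBLE}
    return Span(name=span.name, attributes=visible)


# What SDK instrumentation inside the service might record (invented values).
sdk_trace = Span(
    name="POST /checkout",
    attributes={"http.method": "POST", "http.route": "/checkout",
                "http.status_code": 200, "tenant.id": "acme", "cart.item_count": 3},
    children=[Span("validate-cart"), Span("reserve-inventory"), Span("create-order")],
)

outside = external_view(sdk_trace)
print(outside.name, outside.attributes, len(outside.children))
```

The external view keeps the request itself and drops `tenant.id`, `cart.item_count`, and the internal steps, which is why questions about tenants or business steps still need instrumentation inside the application.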
When to Use Beyla and When to Use an SDK
In practice, it is better to think of Beyla and SDKs not as direct replacements for each other, but as two approaches that provide visibility at different levels.
Beyla is a good choice when:
- You want to start observability without changing application code.
- You want fast baseline visibility in Kubernetes.
- You want to collect RED metrics from all services.
- You want to see basic service-to-service relationships.
- You run a polyglot service environment.
- You want to observe legacy or third-party services.
- You want to reduce SDK rollout cost.
An SDK is a better choice when:
- You need detailed and reliable distributed tracing.
- You want to create internal application spans.
- You need framework or library-level details.
- You want to collect runtime metrics.
- You need to add business context to traces.
- You want to create custom span attributes or events.
- You need domain-specific telemetry.
A balanced model usually looks like this:
Beyla -> baseline metrics + service graph + basic transaction visibility
SDK -> deep distributed tracing + business context + internal spans
In other words, Beyla can be used as the default observability layer across the cluster, while SDKs can be reserved for critical services that require deeper application-level visibility.
With this model, teams can start quickly with a low operational cost and enrich the system with more detailed tracing only where it is actually needed.
Conclusion
Grafana Beyla is a powerful tool for starting application observability in Kubernetes environments without touching application code.
Thanks to eBPF, it can generate RED metrics, basic transaction spans, and service relationships from HTTP, HTTPS, and gRPC services without running inside the application process. This approach can significantly reduce SDK rollout cost, especially in multi-service and polyglot environments.
However, Beyla should not be treated as a complete replacement for SDK-based distributed tracing. It is better understood as a zero-code baseline observability layer. When application-level business context, custom spans, framework details, or domain-specific telemetry are required, SDKs are still the right tool.
My practical takeaway is this:
Use Beyla as the default observability layer, and introduce SDKs only for critical services that require deep tracing.