DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • Observability for Browsers
  • Azure Observability
  • The Human Side of Logs: What Unstructured Data Is Trying to Tell You
  • Observability Agent Architecture

Trending

  • Introducing Graph Concepts in Java With Eclipse JNoSQL, Part 3: Understanding Janus
  • IoT and Cybersecurity: Addressing Data Privacy and Security Challenges
  • Introduction to Retrieval Augmented Generation (RAG)
  • Intro to RAG: Foundations of Retrieval Augmented Generation, Part 1
  1. DZone
  2. Testing, Deployment, and Maintenance
  3. Monitoring and Observability
  4. Observability With eBPF

Observability With eBPF

Take an in-depth look at eBPF, a technology that allows you to run sandboxed programs within the Linux kernel.

By 
Kranthi Kiran Erusu user avatar
Kranthi Kiran Erusu
·
Sep. 19, 24 · Tutorial
Likes (7)
Comment
Save
Tweet
Share
7.9K Views

Join the DZone community and get the full member experience.

Join For Free

eBPF is a groundbreaking technology that allows you to run sandboxed programs within the Linux kernel. This provides a safe and efficient way to extend the kernel's capabilities without modifying its source code.

High-Level Overview

Below is an overview of the stack:

High-level overview of eBPF

Interception Points and Probes

eBPF programs are triggered by events and execute when the kernel or an application reaches a specified interception point. Standard interception points include system calls, function starts/ends, kernel tracepoints, network events, among others.

Should there be no existing interception point for a specific requirement, one has the option to establish a kernel probe (kprobe) or user probe (uprobe), enabling the attachment of eBPF programs virtually anywhere within the kernel or user applications.

Once the appropriate interception point is pinpointed, the eBPF program can be introduced into the Linux kernel through the bpf system call, commonly facilitated by one of the eBPF libraries. 

Often, eBPF is employed not directly, but through frameworks such as Cilium, bcc, or bpftrace. These frameworks offer a layer of abstraction over eBPF, removing the need to write programs from scratch and instead allowing for the definition of intent-based directives that are then executed via eBPF.

In instances where no such abstraction layer is available, it becomes necessary to author programs directly. The Linux kernel mandates that eBPF programs be submitted as bytecode. Although it's technically feasible to code in bytecode directly, the prevalent practice is to use a compiler suite like LLVM to transmute pseudo-C code into eBPF bytecode.

Load, Verification, and Compilation

Upon loading into the Linux kernel, the eBPF program undergoes a two-stage process before being linked to the chosen interception point.

Firstly, a verification phase confirms the safety of the eBPF program. This phase checks that the program adheres to various criteria, such as:

  • The entity loading the eBPF program possesses the necessary capabilities (privileges). In the absence of enabled unprivileged eBPF, only privileged entities can load eBPF programs.
  • The program is stable and does not jeopardize the system.
  • The program is guaranteed to complete its execution (i.e., it won't enter an infinite loop, thereby stalling further processes).

Load, Verification, and Compilation

The Just-in-Time (JIT) compilation step translates the generic bytecode of the program into the machine specific instruction set to optimize execution speed of the program. This makes eBPF programs run as efficiently as natively compiled kernel code or as code loaded as a kernel module.

eBPF maps are like data structures that can be accessed both from within eBPF programs and from user-space applications using a system call. This allows eBPF programs to share information and store state.

Why Use eBPF for O11y?

  • Performance: Since eBPF runs in the kernel and doesn’t need to switch between user space and kernel space frequently, it is highly efficient and has very low overhead.
  • Flexibility: eBPF can be used for various use cases like network monitoring, security enforcement, debugging, and performance tuning.
  • Non-intrusive: eBPF can gather detailed system data without modifying application code or rebooting the system.

Example: Tracing File Opens With eBPF

Imagine you want to monitor and trace every time a process opens a file on your system. eBPF can achieve this by hooking into the openat() syscall (a system call used to open files in Linux) and providing real-time insights without affecting system performance.

Here's a simple step-by-step example using bcc (BPF Compiler Collection), a popular front-end for writing eBPF programs:

Step 1: Install bcc

First, you need to install bcc tools on your system. For example, on Ubuntu:

Shell
 
sudo apt-get install bpfcc-tools linux-headers-$(uname -r)


Step 2: Write a Simple eBPF Program

This eBPF program traces the openat() syscall, logs the process ID (PID), process name, and file path each time a file is opened.

Python
 
#!/usr/bin/python

from bcc import BPF

# eBPF program that hooks into the openat syscall
bpf_code = """
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

struct data_t {
    u32 pid;
    char comm[TASK_COMM_LEN];
    char fname[256];
};

BPF_PERF_OUTPUT(events);
int trace_openat(struct pt_regs *ctx, int dfd, const char __user *filename, int flags) {
    struct data_t data = {};

    // Capture process ID and name
    data.pid = bpf_get_current_pid_tgid() >> 32;
    bpf_get_current_comm(&data.comm, sizeof(data.comm));

    // Capture file name
    bpf_probe_read_user(&data.fname, sizeof(data.fname), filename);

    // Send the data to user-space
    events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}
"""
# Load the eBPF program
b = BPF(text=bpf_code)

# Attach eBPF program to the openat syscall
b.attach_kprobe(event="sys_openat", fn_name="trace_openat")

# Function to print the output
def print_event(cpu, data, size):
    event = b["events"].event(data)
    print(f"PID: {event.pid}, Process: {event.comm.decode('utf-8')}, File: {event.fname.decode('utf-8', 'replace')}")

# Open a perf buffer to receive events from kernel space
b["events"].open_perf_buffer(print_event)

# Continuously listen for events and print them
while True:
    b.perf_buffer_poll()


Step 3: Explanation of the Code

  1. Hooking the syscall: The eBPF program hooks into the openat() system call using kprobes (kernel probes). This allows the program to execute whenever any process calls openat() to open a file.
  2. Data collection: The eBPF program captures the process ID, process name (comm), and the name of the file being opened. It uses helper functions like bpf_get_current_comm() to get the process name and bpf_probe_read_user() to read the filename from user space.
  3. Sending data to user-space: The eBPF program uses a perf buffer (events.perf_submit()) to send this data from the kernel space back to the user-space, where we print the information.
  4. User-space script: The Python script attaches the eBPF program to the openat() syscall and prints out the data collected from kernel space, such as the process name and file being opened.

Step 4: Running the eBPF Program

Once you run the script with root privileges, you will see output like this:

Shell
 
PID: 12345, Process: bash, File: /etc/passwd
PID: 12346, Process: vim, File: /home/user/file.txt
PID: 12347, Process: python, File: /tmp/log.txt


This means that whenever any process opens a file, the script captures and displays:

  • PID: Process ID that opened the file
  • Process name: The name of the process (e.g., bash, vim)
  • File path: The path of the file being opened

This example shows how eBPF can be used for real-time monitoring of system events (in this case, file opens) with minimal performance impact.

Best Practices for Non-Invasive Observability With eBPF

1. Start Small and Gradually Increase Scope

Begin by observing high-level metrics (e.g., syscall counts or network traffic) rather than capturing detailed information such as individual function calls. Once you’re confident about performance impacts, gradually expand to more detailed probes.

2. Use Existing Tools Before Writing Custom eBPF Programs

Tools like bcc (BPF Compiler Collection) or bcc-tools, bpftrace, and libbpf offer pre-built solutions for observability and reduce the risk of writing unsafe or inefficient eBPF programs. Examples:

  • execsnoop for tracing process execution
  • opensnoop for tracing file open events
  • tcptracer for observing TCP connections

3. Minimize Overhead by Selecting Efficient Probes

Avoid attaching eBPF programs to high-frequency events (like scheduler-related events or network packet processing) unless necessary. Instead, focus on infrequent events such as specific syscalls or functions. Use kprobes and tracepoints selectively to avoid overhead. Prefer tracepoints when available, as they are designed for observability and impose less overhead.

4. Leverage BPF Maps Effectively

eBPF programs use BPF maps to store and share data between the kernel and user space. Choose appropriate map types (e.g., hash maps, arrays) and ensure they are properly sized to avoid memory overhead. Implement strategies to batch data collection and reduce the frequency of user-space reads.

5. Rate-Limiting and Filtering

Implement rate-limiting in eBPF programs to prevent excessive data collection during high-frequency events.Use filtering to target only specific events or processes, thus reducing the amount of data being collected and processed.

6. Minimize Probe Execution Time

eBPF programs should be designed to run in a short and bounded time to avoid affecting system performance. Kernel enforces time limits on eBPF programs, but even approaching these limits can degrade performance. Avoid complex logic and loops in your eBPF code to ensure that it runs quickly.

7. Monitor Overhead Carefully

Always monitor the overhead introduced by eBPF programs. The Linux kernel provides tools like bpftool to inspect and measure the performance impact of eBPF programs. Measure the following:

  • Latency of the observed application
  • CPU utilization increase due to eBPF programs
  • Memory usage of BPF maps

8. Use BPF Type Format (BTF) for Compatibility

Utilize BTF (BPF Type Format) to ensure that your eBPF programs are portable and work across different kernel versions. BTF allows eBPF programs to access kernel internals without requiring version-specific modifications.

9. Ensure Safety With Verifier Compliance

The kernel’s eBPF verifier ensures the safety of eBPF programs by analyzing them before they are loaded. Keep programs simple and avoid unsafe operations, such as accessing uninitialized data or performing out-of-bounds memory accesses, which could cause verifier failures.

10. Stay Updated With Kernel Versions

The eBPF ecosystem evolves quickly, and newer kernel versions often bring optimizations and new features that improve the performance and safety of eBPF programs. Consider running a newer kernel (e.g., 5.x or later) to benefit from these improvements.

11. Use Off-Critical-Path Data Collection

Where possible, move data collection and heavy processing off the critical path. For example, collect minimal data in the eBPF probe itself and offload more detailed processing to user-space applications.

12. Security Considerations

Make sure eBPF programs are used in a secure manner. Since eBPF operates at the kernel level, improperly written or malicious programs can expose the system to vulnerabilities. Leverage cgroup BPF or LSM BPF to control and restrict eBPF programs.

By adhering to these best practices, you can effectively leverage eBPF for observability without introducing significant overhead or risk to the system. Always focus on simplicity, efficiency, and safety when designing and deploying eBPF programs.

Data collection Linux kernel Observability Data (computing)

Opinions expressed by DZone contributors are their own.

Related

  • Observability for Browsers
  • Azure Observability
  • The Human Side of Logs: What Unstructured Data Is Trying to Tell You
  • Observability Agent Architecture

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!