Readit News
jackhalford · 2 years ago
Looks great. I remember using bpftrace at work to debug a nasty performance issue. I went down the rabbit hole only to find that a certain syscall was being called far too often. We managed to trace it back in the source code to a sleep(1 second), which was some sort of manual I/O scheduling committed by the CTO when the startup was at an early stage. Removing those few lines and installing kyber fixed the issues!
gtirloni · 2 years ago
Interesting, what was the storage type?
bschuur · 2 years ago
Just tried this on the eBPF code I am working on. Works great! This one is going straight into the toolbox.

Even though eBPF is super fast, I have found that triggering complex probes many times has performance implications, which cannot easily be traced back to the instrumenting application. This will help with that a lot.

dirtyhand · 2 years ago
I am happy to hear that you had a good first impression. At Netflix, we do some Linux scheduler instrumentation with eBPF and overhead matters. I was inspired to create the tool to enable the traditional performance work loop: get a baseline, tweak code, get another reading, rinse & repeat.
maayank · 2 years ago
What use cases do people use eBPF for these days?
darkr · 2 years ago
We had a recent use case to log outbound TCP connections _excluding_ internal and known addresses from our k8s infrastructure, with the log including the process name/pid, uid, and a bunch of other metadata.

I wrote a tool that compiles to a small, statically linked binary (using CO-RE/libbpf), deployed to every node as a DaemonSet. It just works and uses minimal CPU and memory resources.
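
To illustrate the filtering idea only: a rough user-space sketch in Python, not the CO-RE/libbpf tool described above. The excluded ranges, the function names `should_log` and `format_event`, and the log format are all hypothetical; a real deployment would presumably load its own list of internal and known addresses.

```python
import ipaddress

# Hypothetical exclusion list: RFC 1918 private ranges plus loopback.
# The real list of "internal and known" addresses is not given in the thread.
EXCLUDED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]


def should_log(dest_ip: str) -> bool:
    """Return True if the destination is outside every excluded network."""
    addr = ipaddress.ip_address(dest_ip)
    return not any(addr in net for net in EXCLUDED_NETWORKS)


def format_event(comm: str, pid: int, uid: int, dest_ip: str, dest_port: int) -> str:
    """Render one connection event as a key=value log line (made-up format)."""
    return f"comm={comm} pid={pid} uid={uid} dst={dest_ip}:{dest_port}"


if __name__ == "__main__":
    # Sample events as (comm, pid, uid, dest_ip, dest_port); values are invented.
    events = [
        ("curl", 4121, 1000, "8.8.8.8", 443),
        ("kubelet", 812, 0, "10.0.12.7", 10250),
    ]
    for event in events:
        if should_log(event[3]):
            print(format_event(*event))
```

Only the first sample event would be printed, since the second destination falls inside an excluded range.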

stefan_ · 2 years ago
Tongue in cheek: lots of people have discovered they can replace Linux kernel modules with brittle eBPF code instead, which attaches itself to various parts of the kernel that are even less stable than the things modules have to deal with.
gtirloni · 2 years ago
They are nice for quick experimentation, yes. But there are rock solid projects like Cilium using them. I think your point is that the barrier to abuse is lower?
riv991 · 2 years ago
The eBPF website has a list of projects using it, which can give you a decent flavour of what people use it for. https://ebpf.io/applications/
llotter · 2 years ago
StackState, my current employer, uses eBPF in addition to OpenTelemetry for collecting observability data. https://www.stackstate.com/platform/features/
tptacek · 2 years ago
We use it for several parts of our network forwarding path (our private networking features are built in eBPF), for a variety of monitoring purposes, and (principally with bpftrace) as a debugging tool.
sharangxy · 2 years ago
We have implemented zero-code distributed tracing with eBPF. https://github.com/deepflowio/deepflow
yla92 · 2 years ago
Using eBPF-based tools (like bcc) to debug issues: https://github.com/iovisor/bcc
klysm · 2 years ago
I find this somewhat amusing given one of the primary use cases of eBPF is measuring performance
ongy · 2 years ago
There are plenty of other applications.

Network routing can be implemented in (e)bpf. It's even the original use case.

But there's also the LSM based on eBPF, there's a user-space scheduler (Google, iirc), and seccomp and some cgroup filters can be done in BPF...

It's the Lua of the kernel at this point. Provides a lot of extension points.

tubs · 2 years ago
sched-ext is Meta's; Google has something else, but less open, I believe.

https://github.com/sched-ext/scx/blob/main/OVERVIEW.md

dirtyhand · 2 years ago
Who watches the Watchmen?
bewo001 · 2 years ago
Nice! But I got it to freeze under higher load. Removing the load does not help.
dirtyhand · 2 years ago
bpftop author here. Would you mind creating an issue to track this? https://github.com/Netflix/bpftop/issues
bewo001 · 2 years ago
Done (https://github.com/Netflix/bpftop/issues/17). Seems to be some futex issue, the kind of bug that tends to be hard to replicate.