Readit News
milianw commented on Hotspot: Linux `perf` GUI for performance analysis   github.com/KDAB/hotspot... · Posted by u/jez
the8472 · 9 months ago
useful for small profiles, but it blows up on larger ones. hotspot handles multi-gigabyte recordings fine on the same hardware.
milianw · 9 months ago
I still think we should find a way to integrate the two somehow - i.e. analyze locally and then send the pre-analyzed data to the remote Firefox Profiler for visualization. Does anyone know a good format we could use for that purpose? It needs to be compact, for example so that it does not hit the infamous 2GB/4GB JSON heap size limitation on import. We also need something that can deal with the various cost types we support in hotspot, most notably off-CPU time.
milianw commented on Hotspot: Linux `perf` GUI for performance analysis   github.com/KDAB/hotspot... · Posted by u/jez
fransje26 · 9 months ago
Great tool that has been really helpful in helping me find unexpected bottlenecks in the codebases I've been working on.

It's easy to use, and pairs beautifully with the unintrusive perf tool, which makes the combination a joy to use.

And, if combined with a codebase opened in QtCreator, you can click on a hotspot in the flamegraph, and it will bring you automagically to the correct file and line in QtCreator, without any explicit linking required between the two programs. I discovered that feature accidentally, and the fact that it just worked seamlessly really impressed me. (Tested on a Debian-based Linux).

A big thanks to KDAB for making this tool available to us!

milianw · 9 months ago
You are welcome :)

And to people using other IDE/editors - you can configure which one gets opened when you click on a source line from the hotspot settings. QtCreator is just the default (when that is installed).

milianw commented on Balcony solar is taking off   theguardian.com/environme... · Posted by u/mcp_
mattlondon · a year ago
Do you just hook up an inverter and run your electricity meter "in reverse" then? Is that how this works?
milianw · a year ago
initially yes but then the local power supplier will come and install a bidirectional meter
milianw commented on Ask HN: Programmers who aren't front/back end/web developers, what is your job?    · Posted by u/superconduct123
milianw · a year ago
C++/Qt GUI applications (embedded, desktop), mostly for a large customer in the medical sector, i.e. microscopy/imaging/cancer research.
milianw commented on Profiling with Ctrl-C   yosefk.com/blog/profiling... · Posted by u/jstanley
kragen · a year ago
the vast majority of embedded cpus cannot run yocto or indeed linux, even the arms

but they all support gdb

milianw · a year ago
True, that's another good point. But again, this reasoning is very different from the one in the linked article and website - if you have oprofile or valgrind's cachegrind available, you clearly could get perf set up instead.

I'm not debating that manual GDB sampling has its place and value. I'm debating that perf is "lying" or that it's impossible to get hold of off-CPU samples, or profiling of multithreaded code in general.

milianw commented on Profiling with Ctrl-C   yosefk.com/blog/profiling... · Posted by u/jstanley
lelanthran · a year ago
> The premise of this website and articles like https://yosefk.com/blog/how-profilers-lie-the-cases-of-gprof... just show that the authors are using the wrong tools. It is nowadays relatively easy to also look at off-CPU time when profiling with perf (e.g. https://github.com/KDAB/hotspot/?tab=readme-ov-file#off-cpu-...).

I think, firstly, that spending 15s trying the CTRL-c approach is a worthwhile tradeoff. If you don't find anything, then sure, spend another 30m - 60m setting up perf, KDAB, etc. Maybe more if you're on an embedded device.

Secondly, the author seems to say that he's used this on embedded devices with no output but a serial line for the debugger. This is also a 15s effort[1].

It's basically a very low effort task, takes seconds to determine if it worked or not, and if it doesn't work you've only lost a few seconds.

[1] I'm assuming that if you're developing on a device supporting a serial GDB connection, you've already got the debugger working.

milianw · a year ago
perf is easily available through yocto and buildroot (and probably other embedded linux image builders). hotspot can be downloaded as an appimage. It should not take 30-60min to set this up, but granted, learning the tools the first time always has some cost.

Furthermore, note how your reasoning is quite different from what the website you linked to says - it basically says "there are no good tools" (which is untrue) whereas you are saying "manual GDB sampling might be good enough and is easier to set up than a good tool" (which is certainly true).

milianw commented on Profiling with Ctrl-C   yosefk.com/blog/profiling... · Posted by u/jstanley
kreyenborgi · a year ago
milianw · a year ago
The premise of this website and articles like https://yosefk.com/blog/how-profilers-lie-the-cases-of-gprof... just shows that the authors are using the wrong tools. It is nowadays relatively easy to also look at off-CPU time when profiling with perf (e.g. https://github.com/KDAB/hotspot/?tab=readme-ov-file#off-cpu-...). The idea is to use sampling for the on-CPU periods and then combine that with the off-CPU time measured between context switches. VTune has also supported this mode for many years.
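
A minimal sketch of that idea in Python, under stated assumptions: the event tuples, the `aggregate_costs` name and the fixed sampling period are all hypothetical and are not how perf or hotspot represent data internally. On-CPU cost is estimated by counting samples per stack; off-CPU cost is the wall time between a switch-out and the next switch-in, attributed to the stack that was active at the switch-out.

```python
from collections import defaultdict

SAMPLE_PERIOD_MS = 1.0  # assumption: one on-CPU sample per millisecond


def aggregate_costs(events):
    """Return {stack: [on_cpu_ms, off_cpu_ms]} for a single thread.

    events is a time-ordered list of (timestamp_ms, kind, stack) tuples,
    where kind is "sample", "switch_out" or "switch_in" (hypothetical format).
    """
    costs = defaultdict(lambda: [0.0, 0.0])
    blocked_since = None
    blocked_stack = None
    for timestamp, kind, stack in events:
        if kind == "sample":
            # Statistical estimate: each sample stands for one period on-CPU.
            costs[stack][0] += SAMPLE_PERIOD_MS
        elif kind == "switch_out":
            # Thread leaves the CPU; remember where and when.
            blocked_since, blocked_stack = timestamp, stack
        elif kind == "switch_in" and blocked_since is not None:
            # Measured, not sampled: exact time spent off-CPU.
            costs[blocked_stack][1] += timestamp - blocked_since
            blocked_since = None
    return dict(costs)


events = [
    (0.0, "sample", "main;compute"),
    (1.0, "sample", "main;compute"),
    (1.5, "switch_out", "main;read_file"),
    (9.5, "switch_in", None),
    (10.0, "sample", "main;compute"),
]
for stack, (on_cpu, off_cpu) in aggregate_costs(events).items():
    print(f"{stack}: on-CPU {on_cpu} ms, off-CPU {off_cpu} ms")
```

A sampling-only view would attribute nothing to `main;read_file` in this toy input, even though the thread spent most of its wall time blocked there.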
milianw commented on Profiling with Ctrl-C   yosefk.com/blog/profiling... · Posted by u/jstanley
dzaima · a year ago
I believe this is essentially what linux perf's "--call-graph dwarf" does. On my system that ends up producing ~33MB/s of recording data for ~4000 samples/s.
milianw · a year ago
With `-z` (zstd compression) you can bring down the disk-space cost of dwarf unwinding by a factor of ~100 based on my personal experience.
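
As a rough back-of-the-envelope check (a sketch only: the inputs are the approximate figures quoted in this thread, and the factor of 100 is the personal-experience estimate above, not a measurement):

```python
RECORD_RATE_BYTES_PER_S = 33_000_000  # ~33MB/s, as reported in the parent comment
SAMPLES_PER_S = 4000                  # ~4000 samples/s, as reported in the parent comment
COMPRESSION_FACTOR = 100              # ~100x with -z, an estimate rather than a guarantee

# Average amount of data written for each sample.
bytes_per_sample = RECORD_RATE_BYTES_PER_S / SAMPLES_PER_S
# Recording rate if the estimated compression factor holds.
compressed_rate = RECORD_RATE_BYTES_PER_S / COMPRESSION_FACTOR

print(f"{bytes_per_sample:.0f} bytes per sample")
print(f"{compressed_rate / 1e6:.2f} MB/s after compression")
```

Roughly 8 KB per sample is what one would expect if each sample carries a copy of part of the user stack for later unwinding, which is, as far as I know, how dwarf mode works (with a default dump size of 8192 bytes).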

to GP: What you describe sounds like https://github.com/koute/not-perf to me

milianw commented on Profiling with Ctrl-C   yosefk.com/blog/profiling... · Posted by u/jstanley
ivoras · a year ago
Speaking of keyboard shortcuts, I miss BSD's Ctrl-T and SIGINFO. It often helped to see if a process was hung.
milianw · a year ago
I don't know exactly what these BSD things did, but there is a super easy way nowadays to get the stack for any process:

    eu-stack -i -p $(pidof ...)

Thanks to debuginfod this will even give you good backtraces right away (at the cost of some initial delay to load the data from the web; subsequent runs are fast). If you get a "permission denied" error, you probably need to set the sysctl kernel.yama.ptrace_scope=0.

milianw commented on Profiling with Ctrl-C   yosefk.com/blog/profiling... · Posted by u/jstanley
milianw · 2 years ago
Sad to see people are still unaware of modern perf profilers like hotspot, tracy, vtune, ...

They are easy to set up, work extremely efficiently and can nowadays also catch off-CPU time. Ctrl+C debugging finds you trivial stuff, but dismissing the real tools as unusable or inferior is simply ignorant.

> Unlike perf, gdb will give you a callstack even if the program was compiled without frame pointer support.

There is `--call-graph dwarf`, which has existed for many years, and with `-z` it is pretty efficient and just works too - unless the stack gets too long, but even then it's good enough for profiling purposes...

> Also, sampling profilers are bad at tail latency - if something is usually fast but occasionally slow, you won’t be there to Ctrl-C it when it’s slow.

Very true. Thankfully, the flight recorder modes of profilers can help with that to a certain degree, potentially with a high sampling frequency.
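
The flight-recorder idea can be sketched as a bounded ring buffer that keeps only the most recent samples and is dumped once a slow event is detected. This is a conceptual illustration in Python; the `FlightRecorder` class, its methods and the latency budget are hypothetical and do not correspond to any particular profiler's API.

```python
from collections import deque


class FlightRecorder:
    """Keep only the newest `capacity` samples; older ones are overwritten."""

    def __init__(self, capacity):
        self._samples = deque(maxlen=capacity)

    def record(self, timestamp_ms, stack):
        self._samples.append((timestamp_ms, stack))

    def snapshot(self):
        # Copy the buffer so it can be inspected after the slow event.
        return list(self._samples)


recorder = FlightRecorder(capacity=3)
for t, stack in enumerate(["idle", "idle", "parse", "parse;alloc", "parse;alloc"]):
    recorder.record(t, stack)

LATENCY_BUDGET_MS = 100  # assumed threshold for "occasionally slow"
latency_ms = 250         # assumed measurement of the request that just finished
if latency_ms > LATENCY_BUDGET_MS:
    # Only now is the recent history dumped, so the rare slow case is captured.
    for timestamp, stack in recorder.snapshot():
        print(timestamp, stack)
```

Because sampling runs continuously and the buffer is bounded, the cost stays constant while the samples leading up to the rare slow case are still available.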

u/milianw

Karma: 31
Cake day: May 23, 2018
About
C++/KDE/Qt developer at KDAB, maintainer of heaptrack and hotspot, ex-maintainer of KDevelop