hotspot – a GUI for the Linux perf profiler
First public release of hotspot v1.0.0 available

After many months of work, I’m very pleased to finally announce KDAB’s latest R&D project to the public: hotspot – a GUI for the Linux perf profiler.

I have used Linux perf a lot over the past years. It is an extremely powerful and useful tool. But its complexity makes it very hard to use. You need to know and enable quite a few options of perf report before the results become understandable. To educate people, I started giving talks on the subject at various conferences. Additionally, KDAB now also offers a training about Debugging and Profiling tools, with a large section on Linux perf. Finally, I started contributing upstream to make perf more usable for my needs as a C++ application developer.

But all this time I kept thinking: Why can’t we have a proper GUI for Linux perf? Something similar to KCachegrind would already be quite helpful. And I knew I was not the only one with that wish – over the years I heard many people complain about the lack of a decent GUI for Linux perf. Thankfully, KDAB gave me the opportunity to fix this issue. Today, I am finally confident enough to release a first version of hotspot to the wider public. Download hotspot v1.0.0 from GitHub; it’s free and open source software!

What is hotspot?

First and foremost, hotspot is a replacement for perf report. It is a GUI that takes a perf.data file, parses and evaluates its contents and then displays the result in a graphical way. The screenshots show the most important current features in action:

Screenshot: hotspot summary view. The summary view of hotspot gives a quick overview of the analyzed perf.data file and the system it was recorded on.

Compared to perf report, hotspot has the following advantages in my opinion:

The interactive GUI with tooltips makes it easier to analyze the data in different ways. And you don’t need to restart the tool to switch between top-down or bottom-up views. You can also easily search the views for symbols or change the sort order on the fly.

Overall, we aim to make hotspot as intuitive as possible. There is still a long way to go, but it is already quite usable. We always include inline frames and show (in the caller/callee view) the source file and line information. Just open a perf.data file with hotspot; there is no need to remember the magic invocation perf report -g srcline -s dso,sym,srcline --inline.

Additionally, hotspot has the ability to display multiple events in one view, as shown in the screenshots above. E.g. when you record data with perf record -e cycles,instructions, perf report will show you a menu to select either cycles or instructions. But it does not show both costs side-by-side.

The top-down, bottom-up and caller/callee aggregations are hopefully more intuitive in hotspot. I have seen many people struggle with understanding the default aggregation of perf report, which is a combination of a caller/callee view and the top-down view. Hotspot splits these views apart and uses visualizations known from other tools like KCachegrind, VTune or heaptrack, making it more familiar and easier to understand for newcomers.

The built-in flamegraph is a killer feature. It is easily the most important feature of hotspot. I have used it many times already to find optimization opportunities even in large code bases.

Full support for cross-machine analysis as required for embedded development. perf report is, even today, lacking proper cross-platform unwinding support to analyze a perf.data file recorded on a 32-bit ARM machine on an x86(-64) machine. The other combinations between aarch64 and x86(-64) work, but at least in my day-to-day work, I still have to work a lot with 32-bit ARM embedded boards, making perf report rather useless. Additionally, perf report only knows the single --symfs command line switch to set the sysroot. For many situations, this is not enough, e.g. when you deploy hand-compiled code manually from outside the sysroot. Hotspot comes with four options for that purpose: --sysroot, --debugPaths, --appPath and --extraLibPaths. Have a look at hotspot --help for more information, or head over to the official README.
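To make the difference between the top-down, bottom-up and caller/callee aggregations mentioned above a bit more concrete, here is a minimal Python sketch. It is not how hotspot or perf implement these views; the sample data, the function names and the cost values are all made up for illustration. Each sample is a call stack (outermost caller first) with a cost. The last helper produces the "folded stack" text lines that flame graph tools commonly consume.

```python
from collections import defaultdict

# Hypothetical samples, not real perf output: (call stack, cost).
# Stacks are ordered from the outermost caller to the innermost frame.
SAMPLES = [
    (("main", "parse", "read"), 3),
    (("main", "parse", "tokenize"), 2),
    (("main", "render", "read"), 1),
    (("main", "render"), 4),
]


def aggregate(samples, reverse=False):
    """Sum the cost of every call path prefix.

    reverse=False gives a top-down tree (paths start at the outermost caller).
    reverse=True gives a bottom-up tree (paths start at the innermost frame).
    """
    costs = defaultdict(int)
    for stack, cost in samples:
        frames = stack[::-1] if reverse else stack
        for depth in range(1, len(frames) + 1):
            costs[frames[:depth]] += cost
    return dict(costs)


def caller_callee(samples):
    """Inclusive cost per symbol and cost per direct caller -> callee edge."""
    inclusive = defaultdict(int)
    edges = defaultdict(int)
    for stack, cost in samples:
        # Use sets so that recursion is counted only once per sample.
        for symbol in set(stack):
            inclusive[symbol] += cost
        for caller, callee in set(zip(stack, stack[1:])):
            edges[(caller, callee)] += cost
    return dict(inclusive), dict(edges)


def folded(samples):
    """Merge identical stacks into 'frame;frame;frame cost' lines."""
    merged = defaultdict(int)
    for stack, cost in samples:
        merged[";".join(stack)] += cost
    return [f"{path} {cost}" for path, cost in sorted(merged.items())]


if __name__ == "__main__":
    for path, cost in sorted(aggregate(SAMPLES).items()):
        print("top-down ", " -> ".join(path), cost)
    for path, cost in sorted(aggregate(SAMPLES, reverse=True).items()):
        print("bottom-up", " <- ".join(path), cost)
    print("\n".join(folded(SAMPLES)))
```

In this sketch the same samples answer different questions: the top-down tree shows where time goes starting from main, the bottom-up tree shows which innermost functions are expensive and who calls them, and the caller/callee data shows the inclusive cost of a single symbol together with its direct neighbors.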

Overall, I encourage you to try it out. Hotspot depends on a few external libraries, such as elfutils, Qt 5.6+ and some KDE Frameworks. On modern distributions, it should be straightforward to compile it. If you have any issues, read the documentation, and report bugs you encounter in the issue tracker.

What is hotspot not (yet)?

I have so far concentrated on making hotspot a good replacement for the common perf report use-case. But the perf tool suite is extremely large and versatile. Many of the more advanced analysis routines are not yet supported by hotspot. On the roadmap for the next release of hotspot, we plan to add the following features:

Off-CPU profiling via the scheduler tracepoints, to find lock contention, sleeps and I/O wait time

Visualization of events in a per-thread timeline (proof of concept is available in the wip/timeline branch)

Filtering by time slices selected in the timeline

Extended aggregation and filtering capabilities, i.e. by: CPU core, process, thread, DSO file

Further improve the usability by adding more context-sensitive documentation and information

Directly run perf record from within hotspot, such that you don’t need to type perf record --call-graph dwarf all the time. My colleague Nate is working on that already.
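As a rough illustration of the idea behind the planned off-CPU profiling, the following Python sketch attributes the time a thread spends off the CPU to the call stack it had when it was switched out. This is only a sketch of the general technique under simplifying assumptions, not hotspot's implementation (the feature does not exist yet). The event tuples, the "out"/"in" kinds and the function name are hypothetical and do not mirror the real scheduler tracepoint format.

```python
from collections import defaultdict

# Hypothetical, simplified scheduler events: (timestamp_ns, thread_id, kind, stack).
# "out": the thread was switched off the CPU, with the stack it had at that time.
# "in":  the thread was scheduled onto a CPU again.
EVENTS = [
    (100, 1, "out", ("main", "lock")),
    (150, 2, "out", ("main", "read_file")),
    (400, 1, "in", None),
    (900, 2, "in", None),
    (950, 1, "out", ("main", "lock")),
    (1000, 1, "in", None),
]


def off_cpu_time(events):
    """Sum the off-CPU time per call stack, in the unit of the timestamps."""
    pending = {}  # thread id -> (switch-out timestamp, stack at switch-out)
    totals = defaultdict(int)
    for timestamp, tid, kind, stack in sorted(events, key=lambda e: e[0]):
        if kind == "out":
            pending[tid] = (timestamp, stack)
        elif kind == "in" and tid in pending:
            start, blocked_stack = pending.pop(tid)
            totals[blocked_stack] += timestamp - start
    return dict(totals)


if __name__ == "__main__":
    for stack, duration in off_cpu_time(EVENTS).items():
        print(";".join(stack), duration)
```

The resulting per-stack durations could then be fed into the same kinds of aggregations used for on-CPU samples, which is what makes lock contention, sleeps and I/O waits visible.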

In the long term, I also consider making hotspot a generic GUI for all kinds of performance data – it could e.g. become the analysis frontend for heaptrack data.

Also note that hotspot is intended to be a standalone application. I have no intention to integrate it into an IDE. If you are looking for such a thing, have a look at Qt Creator’s CPU Usage Analyzer. It shares the same backend with hotspot and has many of the same visualizations available in hotspot.

If you are interested in the project and want to help out, please contribute! You can submit patches via GitHub pull requests.

Thanks

I would like to end this announcement by saying thank you to a couple of people who have helped me bring this project to fruition:

My employer KDAB, for setting aside a budget for such R&D projects and allowing me to release the result as FOSS

My colleagues at KDAB who have contributed a lot to this project, giving me valuable feedback and reporting bugs early

Ulf Hermann from The Qt Company, the author of perfparser and the CPU Usage Analyzer in Qt Creator. Without his prior work, I would be far from where I am today. I hope my contributions upstream make up for it!

All the people who have worked on Linux perf over the years – it’s amazing to see how powerful it has become!