Though I can intuitively get most of the results, I'm having hard time fully understanding the output of the perf report command especially for what concerns the call graph, so I wrote a stupid test to solve this issue of mine once for all.

The stupid test

I compiled what follows with:

gcc -Wall -pedantic -lm perf-test.c -o perf-test

No aggressive optimizations to avoid inlining and such.

#include <math.h> #define N 10000000UL #define USELESSNESS(n) \ do { \ unsigned long i; \ double x = 42; \ for (i = 0; i < (n); i++) x = sin(x); \ } while (0) void baz() { USELESSNESS(N); } void bar() { USELESSNESS(2 * N); baz(); } void foo() { USELESSNESS(3 * N); bar(); baz(); } int main() { foo(); return 0; }

Flat profiling

perf record ./perf-test perf report

With these I get:

94,44% perf-test libm-2.19.so [.] __sin_sse2 2,09% perf-test perf-test [.] sin@plt 1,24% perf-test perf-test [.] foo 0,85% perf-test perf-test [.] baz 0,83% perf-test perf-test [.] bar

Which sounds reasonable since the heavy work is actually performed by __sin_sse2 and sin@plt is probably just a wrapper, while the overhead of my functions take into account just the loop, overall: 3*N iterations for foo , 2*N for the other two.

Hierarchical profiling

perf record -g ./perf-test perf report -G perf report

Now the overhead columns that I get are two: Children (the output is sorted by this one by default) and Self (the same overhead of the flat profile).

Here is where I start feeling I miss something: regardless of the fact that I use -G or not I'm unable to explain the hierarchy in terms of "x calls y" or "y is called by x", for example:

without -G ("y is called by x"): - 94,34% 94,06% perf-test libm-2.19.so [.] __sin_sse2 - __sin_sse2 + 43,67% foo + 41,45% main + 14,88% bar - 37,73% 0,00% perf-test perf-test [.] main main __libc_start_main - 23,41% 1,35% perf-test perf-test [.] foo foo main __libc_start_main - 6,43% 0,83% perf-test perf-test [.] bar bar foo main __libc_start_main - 0,98% 0,98% perf-test perf-test [.] baz - baz + 54,71% foo + 45,29% bar Why __sin_sse2 is called by main (indirectly?), foo and bar but not by baz ? Why functions sometimes have a percent and a hierarchy attached (e.g., the last instance of baz ) and sometimes not (e.g., the last instance of bar )?

with -G ("x calls y"): - 94,34% 94,06% perf-test libm-2.19.so [.] __sin_sse2 + __sin_sse2 + __libc_start_main + main - 37,73% 0,00% perf-test perf-test [.] main - main + 62,05% foo + 35,73% __sin_sse2 2,23% sin@plt - 23,41% 1,35% perf-test perf-test [.] foo - foo + 64,40% __sin_sse2 + 29,18% bar + 3,98% sin@plt 2,44% baz __libc_start_main main foo How should I interpret the first three entries under __sin_sse2 ? main calls foo and that's ok, but why if it calls __sin_sse2 and sin@plt (indirectly?) it does not also call bar and baz ? Why do __libc_start_main and main appear under foo ? And why foo appears twice?



Suspect is that there are two levels of this hierarchy, in which the second actually represents the "x calls y"/"y is called by x" semantics, but I'm tired to guess so I'm asking here. And the documentation doesn't seem to help.

Sorry for the long post but I hope that all this context may help or act as a reference for someone else too.