SystemTap 3.1 has been released


The SystemTap team has announced the 3.1 release of the tool, which allows extracting performance and debugging information at runtime from the kernel as well as from various user-space programs. New features include support for adding probes to Python 2 and 3 functions, Java probes that now convert all parameters to strings before passing them to probes, a new statistical operator, new sample scripts, and more.

From: Cody Santing <csanting-AT-redhat.com>
To: systemtap-AT-sourceware.org
Subject: systemtap 3.1 release
Date: Fri, 17 Feb 2017 13:32:23 -0500
Message-ID: <CAO8qvm5zY2oMfFddqRoXt_zrqj1=TyPOqUtK5m49Ug6eX8_yaQ@mail.gmail.com>
Cc: linux-kernel-AT-vger.kernel.org, lwn-AT-lwn.net

The SystemTap team announces release 3.1!

Highlights include syscall probes that default to a non-dwarf fallback,
python function probes, @variance and statistics optimizations, java
probe argument generalization, and user-space value-setting functions.

= Where to get it

  https://sourceware.org/systemtap/ - our project page
  https://sourceware.org/systemtap/ftp/releases/systemtap-3...
  https://koji.fedoraproject.org/koji/packageinfo?packageID...
  git tag release-3.1 (commit b8ea350dc13adb)

  There have been 950ish commits since the last release.
  There have been 84 "features" removed/features added.

= How to build it

  See the README and NEWS files at
  https://sourceware.org/git/?p=systemtap.git;a=tree
  Further information at https://sourceware.org/systemtap/wiki/

- Systemtap now needs C++11 to build

= SystemTap frontend (stap) changes

- Systemtap now warns if script arguments given on the command line
  are unused, that is, not referenced by the script with $n/@n.

- New -T option allows the script to be terminated after a specified
  number of seconds. This is a shortcut for adding the probe
  timer.s(N) { exit() }.

= SystemTap script language changes

- Support has been added for probing python 2 and 3 functions using a
  custom python helper module. Python function probes can target
  function entry, returns, or specific line numbers.

      probe python2.module("myscript").function("foo")
      { println($$parms) }

  To run with the custom python helper module, you'd use python's '-m'
  option like the following:

      stap myscript.stp -c "python -m HelperSDT myscript.py"

- Java method probes now convert all types of java parameters to
  strings using the java toString() method before passing them to
  systemtap probes; new argN variables copy them into string
  variables. Previously, only numeric types were passed, and only by
  casting to integers. The previous behaviour is available with
  --compatible=3.0 .

      3.1: probe java(...).class(...).method(...) { printf("%s", arg1) }
      3.0: probe java(...).class(...).method(...) { printf("%d", $arg1) }

- Context variables in .return probes should be accessed with
  @entry($var) rather than $var, to make it clear that entry-time
  snapshots are being used. The latter construct now generates a
  warning. Availability testing with either @defined(@entry($var)) or
  @defined($var) works.

- New statistics @variance() operator using Welford's online
  algorithm for per-cpu computation, and the Total Variance formula
  authored by Niranjan Kamat and Arnab Nandi from the Ohio State
  University for the cross-cpu aggregation.

- The implementation of "var <<< X" for each aggregate variable is now
  specially compiled to compute only the script-requested @op(var)
  values, not all potential ones. This speeds up the <<< operations.

- Translator now accepts new @const() operator for conveniently
  expressing constants in tapset code, or guru-mode scripts. See
  stap(1) for details.

= SystemTap runtime changes

- An older defensive measure to suppress kernel kprobes optimizations
  since the 3.x era has been disabled for recent kernels. This
  improves the performance of kernel function probes. In case of
  related problems, please report and work around with:

      # echo 0 > /proc/sys/debug/kprobes-optimization

- New installcheck-parallel testsuite feature allows running the tests
  in parallel in order to save time. See testsuite/README for details.

= SystemTap tapset changes

- Syscall and nd_syscall tapsets have been merged so that either the
  dwarf-based or the non-dwarf probe is used automatically, depending
  on debuginfo availability (e.g. probe syscall.open). To force use of
  the dwarf-based probe, a dw_syscall has been introduced (e.g. probe
  dw_syscall.open) and the non-dwarf syscall probes were left
  untouched (e.g. nd_syscall.open).

- The syscall tapset files have been reorganized so that the original
  big tapset files carrying many syscall probes were split into
  smaller 'sysc_' prefixed tapset files. This should reduce the
  syscall tapset maintenance burden.

- The powerpc variant of syscall.compat_sysctl got deprecated in favor
  of syscall.sysctl32. This aligns the syscall to its respective
  nd_syscall and to ia64/s390/x86_64 variants too.

- The syscall.compat_pselect7a (this was actually a typo, but still
  available for compatibility purposes with --compatible 1.3) has been
  deprecated.

- The 'description_auddr' convenience variable of syscall.add_key has
  been deprecated.

- Tapsets containing process probes may now be placed in the special
  $prefix/share/systemtap/tapset/PATH/ directory to have their process
  parameter prefixed with the location of the tapset. For example,

      process("foo").function("NAME")

  expands to

      process("/usr/bin/foo").function("NAME")

  when placed in $prefix/share/systemtap/tapset/PATH/usr/bin/

  This is intended to help write more reusable tapsets for userspace
  binaries.

- Netfilter tapsets now provide variables data_hex and data_str to
  display packet contents in hexadecimal and ASCII respectively.

- New tapset functions set_user_string(), set_user_string_n(),
  set_user_long(), set_user_int(), set_user_short(), set_user_char()
  and set_user_pointer() to write a value of specified type directly
  to a user space address.

- New tapset functions user_buffer_quoted(),
  user_buffer_quoted_error(), kernel_buffer_quoted(), and
  kernel_buffer_quoted_error() to print a buffer of an exact length.
  These functions can handle '\0' characters as well.

= SystemTap sample scripts

  All 163 examples can be found at
  https://sourceware.org/systemtap/examples/

- New Samples:

  socket-events.stp
      Prints the life cycle of all sockets associated with a process.
      This includes bytes and timing. The timing information that is
      tracked includes event completion relative to the start of said
      event and the end of the previous event. Currently tracks read,
      write, recv, send, connect and close.

  nfsd-trace.stp
      This script traces all nfsd server operations by client_ip
      address, operation, and complete file name (if possible).

  packet_contents.stp
      The packet_contents.stp script displays the length of each
      network packet and its contents in both hexadecimal and ASCII.
      Systemtap strings are MAXSTRINGLEN in length by default which
      may not be enough for larger packets. In order to print larger
      packets, this limit can be increased by passing in the
      "-DMAXSTRINGLEN=65536" command line option.

  tcp_retransmission.stp
      The tcp_retransmission.stp script prints out a line for each tcp
      retransmission packet.

  sched-latency.stp
      This script periodically reports a histogram of the latency
      between a task (thread) being woken up and it actually being
      dispatched to a CPU: the amount of time it's spent in the
      runnable queue.

  container_check.stp
      The container_check.stp script monitors the use of linux
      capabilities and optionally forbidden syscalls by a process and
      its children. On exit the script prints out lists showing the
      capabilities used by each executable, which syscall used
      specific capabilities for each executable, a list of forbidden
      syscalls used, and details on any syscalls that failed during
      monitoring. This script is designed to help diagnose issues
      caused by restricted capabilities and syscalls when running an
      application in a container. If the script warns about skipped
      probes, the number of active kretprobes may need to be increased
      with the "-DKRETACTIVE=100" option on the command line.

  cve-2016-5195.stp
      Historical emergency security band-aid, for reference/education
      only.

- New command within interactive mode, sample. Allows you to search
  through all included example scripts to load for further editing or
  running.

  Sample and example scripts have been moved to
  /usr/share/systemtap/examples. A symlink in the former location
  under $docdir links to it.

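As an aside on the @variance() operator listed under the script language changes: the notes describe a per-cpu pass using Welford's online algorithm followed by a cross-cpu aggregation step. The Python sketch below illustrates that general two-stage idea only; it is not SystemTap's runtime code. The names Stat and combined_variance are hypothetical, floating-point arithmetic is used for simplicity, dividing by n-1 (sample variance) is an assumption, and the merge step is the standard within-group plus between-group decomposition rather than necessarily the exact formula the release notes cite.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    """Hypothetical per-CPU accumulator: count, running mean, and M2
    (the running sum of squared deviations from the mean)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> None:
        # Welford's online update: a single pass, no samples stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)


def combined_variance(stats):
    """Merge per-CPU accumulators into one sample variance.

    Total squared deviation = sum of each group's own M2 (within-group)
    plus sum of count * (group mean - overall mean)^2 (between-group).
    """
    total = sum(s.count for s in stats)
    if total < 2:
        return 0.0  # assumption: report 0 when variance is undefined
    overall_mean = sum(s.count * s.mean for s in stats) / total
    within = sum(s.m2 for s in stats)
    between = sum(s.count * (s.mean - overall_mean) ** 2 for s in stats)
    return (within + between) / (total - 1)
```

For instance, feeding the values 1 through 4 into one Stat and 5 through 9 into another, then merging, gives 7.5 (up to floating-point rounding), which matches the sample variance of 1 through 9 taken as a single data set. Each accumulator only ever touches its own three fields, which is what makes a per-cpu, lock-free update followed by a merge at read time attractive.
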
= Examples of tested kernel versions

  2.6.18 (RHEL 5 x86 and x86_64)
  2.6.32 (RHEL 6 x86 and x86_64)
  3.10.0 (RHEL 7 x86_64)
  4.1.6 (Fedora 22 x86_64)
  4.3.4 (Fedora 22 x86_64)
  4.6.0-rc0 (Fedora rawhide x86_64)
  4.6.0-rc6 (Fedora rawhide x86_64)
  4.8.10-200 (Fedora 24 x86_64)
  4.10.0-rc0 (Fedora rawhide x86_64)
  4.10.0-rc6 (Fedora rawhide x86_64)
  4.10.0-rc8 (Fedora rawhide x86_64)

= Known issues with this release

- Some kernel crashes continue to be reported when a script probes
  broad kernel function wildcards. (PR2725)

- An upstream kernel commit #2062afb4f804a put
  "-fno-var-tracking-assignments" into KCFLAGS, reducing debuginfo
  quality which can cause debuginfo failures. A proposed workaround to
  this issue exists in:

      https://lkml.org/lkml/2014/11/21/505

  Fedora kernels are not affected by this issue.

= Contributors for this release

  Abegail Jakop, Alexander Lochmann, Benjamin Coddington*, Bingwu Yang*,
  Cody Santing*, David Smith, Felix Lu, Francis Giraldeau*,
  Frank Ch. Eigler, Hemant Kumar, Igor Zhbanov*, Joe Gorse*, Josh Stone,
  Kyle Walker*, Lukas Berk, Marcin Nowakowski*, Mark Wielaard,
  Martin Cermak, Masanari Iida, Mateusz Guzik*, Michal Toman*,
  Nikolay Borisov, Petr Matousek, Ravi Bangoria*, Ross Burton*,
  Tetsuo Handa, Torsten Polle, William Cohen

  Special thanks to new contributors, marked with '*' above.

  Special thanks to Cody Santing for drafting these notes.

= Bugs fixed for this release <https://sourceware.org/PR#####>

  6978 process.syscall extensions: abort, $$parms
  10234 clean up aggregate hard-coded logic
  10485 auto-path tapset support for process.* probes
  10655 SDT semaphores should be prepared for multiple tasks per probe
  10791 parallelize systemtap testsuite
  11308 aggregate operations for @variance, @skew, @kurtosis
  11637 set_user_* functions
  12596 blacklist is too broad (raw_.*)
  12748 need syscall-number database in tapset
  14787 consider making stap -L output prettier/more structured
  14924 warn on complex $ptr->foo expressions in .return probes
  15076 Merge MIPS patches from Cisco
  15671 systemtap (rpm version) can't find debuginfo for @var() use
  15932 %m/%M should have a variant that reads user memory (instead of kernel memory)
  17055 _stp_perf_read needs a sleepable context
  17231 sysroot is too often prepended
  17962 dtrace.exp --no-parsing fallback test fails on rhel6
  18079 autocast doesn't work with @defined
  19489 printing array from memory
  19624 Duplicate function parameter names are not detected
  19802 bad hash value distribution and horrible performance for large arrays with multiple small-integer indices
  19873 staprun -o /NO/SUCH/FILE -c CMD imperfect cleanup
  19874 stap -c CMD run-time limited to 60s due to uncleared alarm()
  19875 membarrier missing from syscall tapset
  19876 userfaultfd missing from syscall tapset
  19882 copy_file_range missing from syscall tapset
  19905 preadv2/pwritev2 missing from syscall tapset
  19906 file name lookups in vfs etc. tapsets
  19915 flight recorder's "logrorate" feature broken
  19926 we need a better way to express constants in tapset code
  19940 page_cache_release() missing from the latest rawhide kernel
  19953 netfilter tapsets should provide variables to assist printing of packet contents
  19954 "suspicious RCU usage" message on rawhide
  19990 on rawhide, the get_user_pages() function has changed
  19992 polymorphic operation
  20013 stap --dump-functions broken
  20040 the task_exe_file function getting "BUG: sleeping function called from invalid context"
  20042 on rawhide, tracepoint handlers have a changed function signature
  20056 improve parse error message involving expect_op("...")
  20064 Linking stapio failed because of misplaced libraries flags
  20065 Configure script is not in sync with configure.ac
  20122 use base os toolchain consistently in the developer toolset environment
  20131 listing_mode.exp wildcard library path failures
  20132 on rawhide, struct inode has changed
  20136 Use the @const() operator across the tapset scripts.
  20149 a function probe with a line number acts like a statement probe
  20158 on kernel 4.6, print_backtrace() gets a compile error
  20161 VM_FAULT_MINOR has been removed from rawhide kernels
  20187 on rawhide, the 'size' convience variable of socket.recvmsg doesn't work
  20189 on rawhide, PAGE_CACHE_SIZE is no longer defined (which breaks the vfs tapset)
  20192 "suspicious RCU usage." warning from kernel when running testsuite
  20211 testsuite resume feature
  20217 warn for degenerate case overloaded functions
  20236 code cleanup: simplify user/kernel memory access routines
  20281 probe process("") kills stap with SIGABRT
  20282 implicit declaration of function ‘__get_user_bad’ on recent aarch64 kernel
  20286 probe handlers using hrtimers taking too long
  20298 the unprivileged_embedded_C.exp testcase needs updating
  20307 'private' on tapset global arrays causes errors
  20333 merge syscall and nd_syscall tapsets
  20416 @entry(@perf("foo")) not translated correctly
  20423 improve error message for dwarf $var 'struct ... being accessed instead of member'
  20433 "NULL pointer dereference" crash on fedora
  20504 trouble finding some tracepoints on kernel 4.7+
  20510 stap -L colorizes non-tty stdout
  20589 kernel warning from calling kernel_buffer_quoted()
  20594 Compile error on GCC 6.1.1: misleading indentation
  20597 broken @avg() calculations
  20599 histogram breaks @variance
  20601 __get_skb_iphdr() failing on 32-bit rawhide
  20672 @defined(@cast()) regression
  20735 "soft lockup" bug on RHEL7 ppc64
  20820 another "soft lockup" BUG on RHEL7 ppc64
  20821 @defined(@entry($var)) does not nest correctly
  20850 The systemtap boot time probing feature doesn't work on rhel6
  20879 For stap -t, print out global variable contention report
  20889 metadatabase.db location
  20982 function::stack doesn't descend if _stack_raw() fails
  21020 reorganize argument passing from java probes
  21063 dtrace script causes mysterious build failures due to improper forming of gcc command line
  21065 dtrace script reports syntax error for valid .d files
  21101 errors when compiling a systemtap module with gcc 7
  21102 the ioblock.stp tapset needs to be updated
  21105 syscall testsuite failures on rawhide
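
As a footnote to the auto-path tapset support (PR 10485 above): the tapset notes say that a process probe in a tapset placed under $prefix/share/systemtap/tapset/PATH/ gets its process parameter prefixed with the tapset's location, so process("foo") in PATH/usr/bin/ becomes process("/usr/bin/foo"). The Python sketch below only models that naming rule as stated in the notes; the function expand_process_path and its parameters are hypothetical and do not reflect how the translator implements the feature.

```python
from pathlib import PurePosixPath


def expand_process_path(tapset_dir: str, tapset_root: str, process_name: str) -> str:
    """Return the process path implied by a tapset's location under PATH/.

    tapset_dir   -- directory holding the tapset file (hypothetical input)
    tapset_root  -- the $prefix/share/systemtap/tapset directory
    process_name -- the name written in process("...") inside the tapset
    """
    path_root = PurePosixPath(tapset_root) / "PATH"
    # relative_to() raises ValueError if the tapset is not under PATH/.
    location = PurePosixPath(tapset_dir).relative_to(path_root)
    # The part after PATH/ is treated as an absolute directory prefix.
    return str(PurePosixPath("/") / location / process_name)
```

With a tapset root of /usr/share/systemtap/tapset (an assumed $prefix of /usr), a tapset directory of /usr/share/systemtap/tapset/PATH/usr/bin and a process name of foo, the function returns /usr/bin/foo, mirroring the example in the notes.
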