This post is a walk-through of the simple strace implementation I wrote during my GopherCon talk, A Go Programmer’s Guide to Syscalls. You’ll find the code here.

To explore some of the features of the Linux ptrace syscall I thought it would be fun to write my own implementation of a basic strace — a tool that shows which syscalls an executable uses. This article is a quick breakdown of how the program works. If you have time, there’s more detail and colour in the talk:

Breakpoint the child process

Our program is going to capture all the syscalls made by an arbitrary command that we pass in. It uses exec.Command() to set up whatever command the child process is going to run, and we specify that we want to use ptrace on this child process by setting Ptrace to true in the command’s SysProcAttr struct before starting the command. Here’s this piece of code from the main() function:

fmt.Printf("Run %v\n", os.Args[1:])

cmd := exec.Command(os.Args[1], os.Args[2:]...)
cmd.Stderr = os.Stderr
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.SysProcAttr = &syscall.SysProcAttr{
	Ptrace: true,
}

cmd.Start()
err := cmd.Wait()
if err != nil {
	fmt.Printf("Wait returned: %v\n", err)
}

This puts the child process into a breakpoint state as soon as it has been created. If we run with just this code in main(), we can see that cmd.Wait() returns with a non-nil error:

root@vm-ubuntu:myStrace# ./myStrace echo hello
Run [echo hello]
Wait returned: stop signal: trace/breakpoint trap
root@vm-ubuntu:myStrace# hello

We can also see that the hello text gets printed, which at first glance might seem strange since we’ve just put the child (which does the printing) into a breakpoint state. If you insert a small delay after cmd.Wait(), you’ll see that hello doesn’t appear until the parent completes. What’s happening here is that the parent process holds the child in a breakpoint state, but when the parent exits there is nothing to hold the child up any more, so the child carries on with what it was about to do and displays hello.

Get the current syscall from the child process’s registers

The next step is to find the current values of the registers for the child process (whose process ID can be found in cmd.Process.Pid). This is done with the PTRACE_GETREGS subcommand of ptrace. The Go syscall package gives us several functions to make it easy to call various ptrace subcommands, including this one.

pid = cmd.Process.Pid
err = syscall.PtraceGetRegs(pid, &regs)

This fills in a struct holding the current value of every register for the child process. On an x86-64 CPU (which my MacBook Pro has) the syscall identifier is found in the Orig_rax field. (sec is the import alias I’ve given to the seccomp/libseccomp-golang package.)

name, _ := sec.ScmpSyscall(regs.Orig_rax).GetName()
fmt.Printf("%s\n", name)

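If you'd rather not pull in libseccomp just to try this out, the lookup can be approximated with a small table. The sketch below is not what the post's code does: syscallNames and syscallName are hypothetical names, and the table is a hand-picked subset of x86-64 Linux syscall numbers, so anything outside it falls back to a placeholder.

```go
package main

import "fmt"

// syscallNames is a deliberately tiny, hand-written subset of the x86-64
// Linux syscall table. It is illustrative only; the post itself uses
// libseccomp-golang for the full lookup.
var syscallNames = map[uint64]string{
	0:   "read",
	1:   "write",
	3:   "close",
	9:   "mmap",
	12:  "brk",
	59:  "execve",
	231: "exit_group",
	257: "openat",
}

// syscallName (a hypothetical helper) returns the name for a syscall number,
// or a "syscall_<n>" placeholder when the number is not in the table.
func syscallName(nr uint64) string {
	if name, ok := syscallNames[nr]; ok {
		return name
	}
	return fmt.Sprintf("syscall_%d", nr)
}

func main() {
	// In the tracer, nr would come from regs.Orig_rax.
	for _, nr := range []uint64{59, 12, 1, 999} {
		fmt.Println(syscallName(nr))
	}
}
```

A full table is long and architecture-specific, which is a good reason to lean on a library for anything beyond experimentation.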
Run to the next syscall

We now want to allow the child process to proceed until it hits the next syscall. The PTRACE_SYSCALL subcommand does exactly this, and the Go syscall package gives us a function to invoke SYS_PTRACE with that subcommand.

err = syscall.PtraceSyscall(pid, 0)

We’ll get a SIGTRAP when this happens, which we need to wait for.

_, err = syscall.Wait4(pid, nil, 0, nil)

And repeat

At this stage we want to read the registers to get the syscall identifier again, after which we’ll want the child process to run up to the next syscall, and so on. We can simply add a for loop around this whole process.

We need to stop this loop when the child process finishes. In my simple implementation I break out of the for loop when PtraceGetRegs fails. (The error we see is that it is trying to read the registers for a process that doesn’t exist, which makes sense as the child process has finished.)

However

Running this code will generate a list of syscalls, but there’s just one problem: each one is output twice. This is because PTRACE_SYSCALL actually stops the child process both before a syscall is run, and after it completes. Here’s the relevant description from the man page:

I added a boolean called exit to keep track of whether it’s an exit or entry, and simply flipped its state each time through the for loop. I only count the syscall on exit. Here’s the loop, including keeping track of the exit.

for {
	if exit {
		err = syscall.PtraceGetRegs(pid, &regs)
		if err != nil {
			break
		}

		name, _ := sec.ScmpSyscall(regs.Orig_rax).GetName()
		fmt.Printf("%s\n", name)
	}

	err = syscall.PtraceSyscall(pid, 0)
	if err != nil {
		panic(err)
	}

	_, err = syscall.Wait4(pid, nil, 0, nil)
	if err != nil {
		panic(err)
	}

	exit = !exit
}

Summing up the syscalls

I wrote some utility code to keep count of the number of times each syscall code is used, and to print out a summary.
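The utility code isn't reproduced here, so as a rough idea of what such a counter might look like, here is a minimal sketch. The syscallCounter type and its inc and summary methods are hypothetical names, not the ones from the talk, and the output format is only loosely modelled on a count-per-syscall table.

```go
package main

import (
	"fmt"
	"sort"
)

// syscallCounter (hypothetical) maps a syscall number to how often it was seen.
type syscallCounter map[uint64]int

// inc records one more occurrence of the given syscall number.
func (c syscallCounter) inc(nr uint64) { c[nr]++ }

// summary returns one line per syscall, sorted by descending count,
// with ties broken by ascending syscall number.
func (c syscallCounter) summary() []string {
	nrs := make([]uint64, 0, len(c))
	for nr := range c {
		nrs = append(nrs, nr)
	}
	sort.Slice(nrs, func(i, j int) bool {
		if c[nrs[i]] != c[nrs[j]] {
			return c[nrs[i]] > c[nrs[j]]
		}
		return nrs[i] < nrs[j]
	})

	lines := make([]string, 0, len(nrs))
	for _, nr := range nrs {
		lines = append(lines, fmt.Sprintf("%5d  syscall %d", c[nr], nr))
	}
	return lines
}

func main() {
	c := syscallCounter{}
	// In the tracer, inc would be called with regs.Orig_rax on each syscall exit.
	for _, nr := range []uint64{9, 1, 9, 12, 9} {
		c.inc(nr)
	}
	for _, line := range c.summary() {
		fmt.Println(line)
	}
}
```

Keying on the syscall number rather than the name keeps the counter independent of whichever name lookup you use; the names can be resolved when the summary is printed.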

Et voilà

If you try this out you’ll see this gives something that corresponds to what strace gives us. Here’s a very short demo showing the output from this code when we use it on echo hello, and the output from strace -c for the same thing. You’ll see they show the same counts for each syscall.

The full strace implementation also shows the parameters for each syscall. If we wanted to build out our simple version to do this, we could map them from other registers.