As part of the effort to find idle hogs, I noticed some xterms were heavier than others.

 9960 tedu 2 0 7056K 12M sleep select 0:08 0.00% xterm
15257 tedu 2 0 6808K 12M sleep select 0:01 0.00% xterm
10960 tedu 2 0 6924K 12M idle  select 0:01 0.00% xterm
25365 tedu 2 0 6796K 12M sleep select 0:01 0.00% xterm

How did inmate 9960 come to acquire 8 whole seconds of CPU time? For that matter, which xterm is it? The answer to the second will likely reveal the first.

Looking around, all my xterms are currently idle. Just as indicated by top. How do we turn a pid into a window?



brute force

The brute, or even brutal, force technique is to quit each xterm one by one until 9960 goes away. A nicer approach is to send SIGSTOP to each xterm and see which one stops responding. (Alas, if xterm is setgid, you may not be able to SIGCONT it afterwards. Less nice.) Or run find / in each xterm while watching top to see who lights up. All a bit intrusive, but wasn’t it Heisenberg who proved there can be no observation without modification? Actually no, though observer effect is a real thing. Nevertheless, we can do a better job of observing xterms without pummeling them to see which one bruises.

Let’s start with ps.

 2157 p9 Ss  0:00.02 -ksh (ksh)
17271 p9 R+  0:00.00 ps
30407 pa Ss  0:00.05 -ksh (ksh)
   79 pa S+  0:00.16 vim kern_sig.c
20564 pb Is+ 0:00.01 -ksh (ksh)
21379 pc Is+ 0:00.03 -ksh (ksh)
  236 pd Is  0:00.02 -ksh (ksh)
 3583 pd S+  0:00.38 top

The second column is the controlling terminal. So we have some hints. I now know which terminal is running ps, and which is running top, and which is finding out why SIGCONT doesn’t work. But no xterms, unless we run ps x.

 9960 ?? Is 0:07.80 xterm
24106 ?? Is 0:00.66 xterm
24358 ?? Is 0:00.25 xterm

xterms don’t have controlling terminals; instead they control the terminal. But this is still useful info to have.

> fstat -p 9960
USER CMD   PID  FD   MOUNT INUM    MODE       R/W SZ|DV
tedu xterm 9960 text /usr  702660  -rwxr-sr-x r   596224
tedu xterm 9960 wd   /home 1611008 drwxr-xr-x r   2560
tedu xterm 9960 0    /     182659  crw------- rw  ttyC0
tedu xterm 9960 1    /     183026  crw-rw-rw- w   null
tedu xterm 9960 2    /     183026  crw-rw-rw- w   null
tedu xterm 9960 3*   unix stream 0x0
tedu xterm 9960 4    /     182379  crw-rw-rw- rw  ptyp1

There it is. We’re looking at p1.
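Going from fstat output to the argument for pgrep -t is mechanical enough to script. A rough sketch, assuming the device name is always the last column as in the listing above; the function name pty_suffix is mine, not part of any tool:

```shell
# pty_suffix: read fstat output on stdin and print the short terminal
# suffix (ptyp1 -> p1) for any pty master the process holds open.
# Assumption: the device name is the last whitespace-separated field.
pty_suffix() {
    awk '$NF ~ /^pty/ { sub(/^pty/, "", $NF); print $NF }'
}
```

Something like fstat -p 9960 | pty_suffix should then print p1, ready to hand to pgrep -lf -t.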

> pgrep -lf -t p1
6988 -ksh

Another approach would be to run ps -O ppid (or pgrep -lf -P 9960) and look for the shell with a parent of 9960, and walk back up. Either way, it’s one of the dozen xterms sitting there with an idle shell, which is a hint, not an answer. Running around and pasting echo $$ in each shell would find the suspect. Or I could run write tedu ttyp1 and look for the graffiti.

We can also continue further on this path, inspecting the working directory for each shell, and then narrowing our search to those xterms, but maybe it’s time to switch techniques.



just ask

A smarter approach would be to just ask. In theory, every xterm has a _NET_WM_PID property that is equal to its pid. This can be retrieved by running xprop and clicking the window. Or using the -id argument. Then we need all the xterm window IDs, which can be obtained via xwininfo.

> xwininfo -root -children | grep XTerm | awk '{print $1}' | \
    xargs -n1 -I % sh -c "echo %; xprop -id % _NET_WM_PID"
0xe0000d
_NET_WM_PID(CARDINAL) = 24106
0xc0000d
_NET_WM_PID(CARDINAL) = 9960
0xa0000d
_NET_WM_PID(CARDINAL) = 25365
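Since the pid is already known, the paired output can be filtered down to a single window ID. A sketch, assuming the alternating two-line format printed by the pipeline above; window_for_pid is a made-up name:

```shell
# window_for_pid PID: read alternating "window id" and
# "_NET_WM_PID(CARDINAL) = N" lines on stdin and print the window id
# whose pid equals PID.
window_for_pid() {
    awk -v want="$1" '
        /^0x/ { id = $1; next }                     # remember the latest window id
        /_NET_WM_PID/ && $NF == want { print id }   # pid line that matches
    '
}
```

Fed the output shown above, window_for_pid 9960 prints 0xc0000d.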

Armed with the window ID, we can feed it back to xwininfo.

> xwininfo -id 0xc0000d
xwininfo: Window id: 0xc0000d "Thanks for flying Vim"
Corners: +-2542+15 -3831+15 -3831-12 +-2542-12
-geometry 115x67+-2542-12

Alright, so this xterm is off screen somewhere, but the geometry maybe gives us another hint as to which it is based on size. And it once upon a time ran vim, which fiddles with the title. Interesting, but we’d like something a little more obvious.
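To chase the size hint, the columns and rows can be pulled out of the -geometry line. This is only a sketch: geometry_size is a hypothetical helper, and it assumes xwininfo reports an xterm’s geometry in character cells, as the 115x67 above suggests.

```shell
# geometry_size: read xwininfo output on stdin and print "columns rows"
# taken from the -geometry line (e.g. 115x67+-2542-12 -> 115 67).
geometry_size() {
    sed -n 's/.*-geometry \([0-9]*\)x\([0-9]*\).*/\1 \2/p'
}
```

The result could then be compared against what stty size reports in each candidate shell, keeping in mind that stty prints rows first and columns second.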

> xwd -id 0xc0000d | xwud
X Error of failed request: BadMatch (invalid parameter attributes)
Major opcode of failed request: 73 (X_GetImage)
Serial number of failed request: 95
Current serial number in output stream: 95
xwud: Error => Unable to read dump file header.
xwud: Resource temporarily unavailable

Damn. I was hoping for “Woah! A new exact duplicate of 9960 has appeared. So that’s which one it is.” But no dice. xwd depends on the suspect window being on screen. But if we can get all the windows on screen (the dwm “0” screen), either this or the above approach can work.

For funsies, there’s a Stack Overflow answer dedicated to finding the pid for an X11 window, which is the reverse process.



inferno

We’re moving well past the point of no return now. Instead of using X to spy on our xterm, we can do so ourselves. This can be done using gdb, for instance. Unfortunately, other people would do it that way. How hard can it be to write a one off single purpose debugger?

Step one of our journey is gazing into the xterm source code. Eventually one will discover that there is a LineData structure with a pointer to what appears to be character data. There’s an array of these, one for each line. But there is not an obvious pointer to this array. Instead it’s accessed using a variety of casts, offsets, and pointer arithmetic, but the base pointer is visbuf in something called TScreen, a giant structure that takes over 500 lines of code to declare. That is embedded in an XTermWidget, and (thank the heavens!) there is a global pointer to one of these called term, bringing our trek to an end.

All we need to do now is write a debugger that iteratively reads each:

((LineData *)(term->screen.visbuf + offset))->chardata

OpenBSD includes a useful sysctl for examining the address space of another process. Through arcane magic not explained here (procmap), I know the xterm I’m looking at has a text segment of 540672 bytes. We can find it programmatically thusly:

local ffi = require("ffi")
local C = ffi.C
-- sysctl(), struct vmentry and the CTL_*, KERN_* and PROT_* constants
-- are assumed to have been declared earlier

local function findexecbase(pid, execsize)
	local mib = ffi.new("int[3]")
	mib[0] = CTL_KERN
	mib[1] = KERN_PROC_VMMAP
	mib[2] = pid
	local numents = 200
	local ents = ffi.new("struct vmentry[?]", numents)
	local entsize = ffi.sizeof("struct vmentry")
	local oldsize = ffi.new("size_t[1]")
	oldsize[0] = entsize * numents
	local rv = C.sysctl(mib, 3, ents, oldsize, nil, 0)
	if rv == -1 then
		return nil
	end
	-- oldsize now holds the bytes returned, so divide by the entry size
	-- (not the entry count) to get the number of entries to scan
	for i = 0, tonumber(oldsize[0]) / entsize - 1 do
		local ent = ents[i]
		if (tonumber(ent.kve_end) - tonumber(ent.kve_start)) == execsize
		    and ent.kve_prot == PROT_RW then
			return ent.kve_start
		end
	end
end

local addr = findexecbase(pid, 540672)

Using further magic (I’m cheating a bit, but basically nm xterm | grep term$), we know the offset from there to term, and then we can start chasing pointers with ptrace. The remaining offsets were calculated by compiling an xterm with a printf of interesting values.

local function pread(addr)
	local v = C.ptrace(PT_READ_D, pid, addr, 0)
	v = tonumber(v)
	if v < 0 then
		v = v + 4294967296
	end
	return v
end
local function preadptr(addr)
	local p1 = pread(addr)
	local p2 = pread(addr + 4)
	return p1 + p2 * 4294967296
end

rv = C.ptrace(PT_ATTACH, pid, 0, 0)
rv = C.waitpid(pid, nil, 0)
addr = addr + 4912632 -- offset of term
addr = preadptr(addr) -- read term
addr = addr + 392 -- offset of term->screen
addr = addr + 15496 -- offset of screen.visbuf
addr = preadptr(addr)
print("SCREEN DUMP")
for row = 0, 10 do
	local datadr = preadptr(addr + row * 48 + 24)
	local s = {}
	for i = 0, 80 do
		local v = pread(datadr + i * 4)
		table.insert(s, string.char(v))
	end
	print(table.concat(s))
end
rv = C.ptrace(PT_DETACH, pid, 0, 0)
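The arithmetic in pread and preadptr is easy to get wrong, so here it is in isolation. This is a sketch in plain shell, assuming a shell with 64-bit arithmetic; the function names are invented, and the 48 and 24 are the values from the loop above, which only hold for this particular xterm build.

```shell
# to_u32 N: map a signed 32-bit value to its unsigned equivalent,
# mirroring the "v < 0" fixup in pread.
to_u32() {
    if [ "$1" -lt 0 ]; then
        echo $(( $1 + 4294967296 ))
    else
        echo "$1"
    fi
}

# join_ptr LOW HIGH: combine two 32-bit words, low word first, into one
# 64-bit address, as preadptr does. Uses the globals lo and hi as scratch.
join_ptr() {
    lo=$(to_u32 "$1")
    hi=$(to_u32 "$2")
    echo $(( lo + hi * 4294967296 ))
}

# chardata_slot VISBUF ROW: address holding the chardata pointer for a
# row, given a 48-byte LineData stride and a 24-byte member offset.
chardata_slot() {
    echo $(( $1 + $2 * 48 + 24 ))
}
```

The low word needs the sign fixup because ptrace hands back a signed int; without it, any address with the top bit of its low half set would come out 4GB too low.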

Let it rip and...

SCREEN DUMP
    if (row >= 0 && row <= max_row) {
        result = (LineData *) scrnHeadAddr(screen, buffer, (unsigned) row);
        if (result != 0) {
#if 1 /* FIXME - these should be done in setupLineData,
            result->lineSize = (Dimension) MaxCols(screen);
#if OPT_WIDE_CHARS
            if (screen->wide_chars) {
                result->combSize = (Char) screen->max_combining;
            } else {
                result->combSize = 0;
            }

Hey! Now that does look familiar. It’s the source code to the line getting function in xterm. Now I know exactly which window it is.



epilogue

This was a pretty big waste of time. As soon as I saw that one xterm was busier than the rest, I knew exactly which one it was: the one I read mail in, which has to redraw the screen for every email. This was trivially confirmed using any of the brute force techniques which work well enough with some educated guesswork guiding them. Learning to script gdb may have been faster, but a lot less fun.