I fixed a bug in Perl 5 this morning. I'm not asking for praise; lots of people fix bugs in Perl 5. In particular, Perl 5 pumpkings deserve tremendous praise for managing the bug queue -- and often fixing nasty bugs no one else wants to explore.

The bug was small and the problem seemed obvious. The fix was simple. Thus the process of fixing the bug may be valuable to document. In particular, corehackers exists in part to recruit new developers. Its articles about Perl 5 internals needs more information.

Here's a start; here's how I fixed a small, obvious bug in Perl 5.

The Bug

Dave Taylor reported a bug with the syswrite builtin. Using syswrite on an empty string with an offset writes garbage to a filehandle:

$ /usr/local/refperl/5.10.0/bin/perl -e 'my $foo = ""; syswrite STDOUT, $foo, 100, 1' | less <DC>8 /null^@^@^@^Y^@^@^@^A^@^@^@ ^@^@^@^P^@^@^@X^V9 ^@^@^@^@<89>^@^@^@<80><A6>^U^H^@^@^@^@^@<A6>^U^H^@^@^@^@<80><A7>^U^H^@^@^@^@^@ <A7>^U^H^@^@^@^@<80><A8>^U^H^@^@^@^@^@<A9>^U^H^@^@^@^@^@<A5>^U^H^@^@^@^@<80><A4>^U^H^@ (END)

His test case is simple and his diagnosis looks reasonable.

At this point, the bug is obvious to a C programmer. Even though Perl 5 knows that $foo is an empty string, the syswrite code attempts to read 100 characters from the string starting from the second character of the string (at offset 1). The garbage shown in the bug report is whatever's in memory one byte after the memory address of the null string in $foo .

Whatever reads this data to write with syswrite checks the length of the string in $foo incorrectly.

Finding the Culprit

That seemed like a reasonable hypothesis. Where was this code?

I already knew that it was likely in a C function called pp_syswrite in one of the pp_*.c files in the Perl 5 core. If I hadn't known this, I could have used B::Concise to figure out where to look:

$ perl -MO=Concise -e 'my $foo = ""; syswrite STDOUT, $foo, 100, 1' d <@> leave[1 ref] vKP/REFC ->(end) 1 <0> enter ->2 2 <;> nextstate(main 1 -e:1) v:{ ->3 5 <2> sassign vKS/2 ->6 3 <$> const[PV ""] s ->4 4 <0> padsv[$foo:1,2] sRM*/LVINTRO ->5 6 <;> nextstate(main 2 -e:1) v:{ ->7 c <@> syswrite[t3] vK/4 ->d 7 <0> pushmark s ->8 8 <#> gv[*STDOUT] s ->9 9 <0> padsv[$foo:1,2] s ->a a <$> const[IV 100] s ->b b <$> const[IV 1] s ->c -e syntax OK

B::Concise compiles a snippet of code into an optree, then walks that optree and serializes it to textual output. That output represents the operations Perl 5 performs when it runs the program. I've emboldened the important line. That line shows a LISTOP (a type of node in the optree which has multiple child nodes) which performs an operation called syswrite .

The node type isn't important; what's important is the name of the operation. Each operation implies the existence of a function in the Perl 5 core called pp_operation . PP is, as I understand it, short for push/pop, which means that these functions operate on the Perl 5 stack (more or less equivalent to @_ in Perl 5 code).

I used App::Ack to search for pp_syswrite in the appropriate *.c files. It led me to mathoms.c, which is sort of a limbo for functions that used to be in the core and exist now for compatibility reasons. pp_syswrite is now:

PP(pp_syswrite) { return pp_send(); }

Thus the implementation of syswrite is in the function pp_send() , which is in pp_sys.c. That function is too long to reproduce here, but there's an interesting branch around 80 lines in:

if (op_type == OP_SYSWRITE) { Size_t length = 0; /* This length is in characters. */ STRLEN blen_chars; /* set blen_chars to the length of the string in C chars */ /* ... */ if (MARK < SP) { offset = SvIVx(*++MARK); if (offset < 0) { if (-offset > (IV)blen_chars) { Safefree(tmpbuf); DIE(aTHX_ "Offset outside string"); } offset += blen_chars; } else if (offset >= (IV)blen_chars && blen_chars > 0) { Safefree(tmpbuf); DIE(aTHX_ "Offset outside string"); } } else offset = 0; /* ... */

This code goes through some gyrations, counting the length of the string in C chars (this is more complex than you think when you have UTF-8 in the string). Eventually blen_chars contains the length of the string. offset is the offset value in this branch.

I've emboldened the offending code. The intent of that line's conditional is to check that the offset isn't outside of the string. Unfortunately, when the string is zero chars long, the second part of that line's conditional is false and the entire conditional fails.

Fixing the Bug

If blen_chars is 0, then any offset of one or more chars will be outside the range of the string, so the exception is necessary. (Deleting that conditional also means that using an offset of 0 with an empty string will also throw that exception. I could argue that behavior both ways.)

Vincent Pit checked in the patch for RT #67912, as well as a test case derived from Dave's code:

eval { my $buf = ''; syswrite(O, $buf, 1, 0) }; like($@, qr/^Offset outside string /);

I used Dave's command-line invocation as a quick test while creating the patch. After running the entire core test suite to verify that there were no obvious regressions, I mailed the simple patch to p5p:

--- a/pp_sys.c +++ b/pp_sys.c @@ -1919,7 +1919,7 @@ PP(pp_send) DIE(aTHX_ "Offset outside string"); } offset += blen_chars; - } else if (offset >= (IV)blen_chars && blen_chars > 0) { + } else if (offset >= (IV)blen_chars) { Safefree(tmpbuf); DIE(aTHX_ "Offset outside string"); }

This is an interesting case where deleting code can make the project's behavior more correct.

Five hours elapsed from the time of reporting to the time of the commit of the fix. In addition, Vincent also refactored the sysio.t test for clarity and ease of maintenance. This is how community-developed software should work!

It's a simple bugfix. Anyone reading this could have figured it out without too much work. Not all bugs are this simple, but the process is reasonably easy and there are plenty of people willing to help you get started improving Perl. (It's also very satisfying to think of all of the people who won't run into this problem in the future because you fixed it for them.)