Who put an extra CR in my CRLF? Fun with PerlIO layers

A. Sinan Unur
November 14, 2014

So, playing with my shiny new toy while waiting for remote SAS jobs to finish, I ended up trying to install PerlIO::via::gzip which pulls in PerlIO::Util. PerlIO::Util failed some of its tests, specifically, one relating to PerlIO::tee.

Some of the failures involved an extra 0d before a 0d0a.

These failures were perplexing to me, but, at least CPAN testers showed me that I was not alone.

On the other hand, looking at that list, there are clearly Perl installations on Windows where this does not happen. I have been unable to figure out the underlying cause, but I suspect it is at least somewhat related to the series of puzzled posts I made about UTF-8 output from perl in a cmd.exe window.

I went ahead and force installed PerlIO::Util just 'cause I am lazy, and I wanted to use PerlIO::via::gzip. Here is a short script to start with. We are printing to scalars:

    #!/usr/bin/env perl

    use 5.020;
    use strict;
    use warnings;

    use PerlIO::Util;

    open my $f, '>:tee', \(my ($x, $y))
        or die "tee open: $!";

    binmode $f, ':crlf'
        or die "binmode: $!";

    print $f "\n";

    close $f;

    say hexdump($_) for $x, $y;

    # Thanks for the tip in the comments
    sub hexdump { sprintf('%*v02x', ' ', $_[0]) }

And, the output is:

    t.pl
    0d 0a
    0d 0a

or

    t.pl | xxd
    0000000: 3064 2030 610d 0a30 6420 3061 0d0a       0d 0a..0d 0a..

But, if I do this:

    #!/usr/bin/env perl

    use 5.020;
    use strict;
    use warnings;

    use PerlIO::Util;

    open my $f, '>:tee', 'x', 'y'
        or die "tee open: $!";

    binmode $f, ':crlf'
        or die "binmode: $!";

    print $f "abc\n";

    close $f;

I get:

    xxd x
    0000000: 6162 630d 0d0a                           abc...

    xxd y
    0000000: 6162 630d 0d0a                           abc...

That is, it looks like the "\n" above gets translated to CRLF, and then another layer translates the last LF to CRLF again.
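To make that suspicion concrete, here is a minimal sketch that mimics the effect in plain Perl, without PerlIO::Util. The to_crlf helper is hypothetical: it only imitates what a :crlf layer does on write, and running it twice stands in for two stacked translations. It is not how PerlIO implements the layer.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper: imitate a :crlf layer on write by turning
# every LF into a CR,LF pair.
sub to_crlf {
    my ($bytes) = @_;
    (my $out = $bytes) =~ s/\n/\r\n/g;
    return $out;
}

# Same formatting trick as the hexdump sub in the first script.
sub hexdump { sprintf('%*v02x', ' ', $_[0]) }

my $once  = to_crlf("abc\n");   # one translation
my $twice = to_crlf($once);     # a second translation on top

print hexdump($once),  "\n";    # 61 62 63 0d 0a
print hexdump($twice), "\n";    # 61 62 63 0d 0d 0a
```

The second line has the same byte pattern as the contents of x and y above, which is at least consistent with the LF being translated twice.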

It seems to my untrained eye that whatever is happening is probably happening within this function in PerlIO-Util.xs:

    if(tab && tab->Open){
        f = tab->Open(aTHX_ tab, layers, i, mode,
                      fd, imode, perm, f, narg, args);

        /* apply 'upper' layers
           e.g. [ :unix :perlio :utf8 :creat ] ~~~~~ */
        if(f && ++i < n){
            if(PerlIO_apply_layera(aTHX_ f, mode, layers, i, n) != 0){
                PerlIO_close(f);
                f = NULL;
            }
        }
    }

A quick inspection seems to verify this. Here is the list of layers before the application of the :crlf layer:

    ---
    - unix
    - ~
    - - CANWRITE
      - OPEN
      - TRUNCATE
    ---
    - crlf
    - ~
    - - CANWRITE
      - FASTGETS
      - CRLF
      - TRUNCATE
    ---
    - tee
    - y
    - - CANWRITE
      - FASTGETS
      - CRLF
      - TRUNCATE

and here is the list of layers after the binmode $f, ':crlf':

    ---
    - unix
    - ~
    - - CANWRITE
      - OPEN
      - TRUNCATE
    ---
    - crlf
    - ~
    - - CANWRITE
      - FASTGETS
      - CRLF
      - TRUNCATE
    ---
    - tee
    - y
    - - CANWRITE
      - FASTGETS
      - CRLF
      - TRUNCATE
    ---
    - crlf
    - ~
    - - CANWRITE
      - FASTGETS
      - CRLF
      - TRUNCATE
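For anyone who wants to look at a layer stack without PerlIO::Util, here is a rough sketch using the core PerlIO::get_layers function. It is not necessarily how the listings above were produced, and it only prints layer names, not the flags. The layer_names helper and the temporary file are just for illustration, and the names you see will depend on your platform and perl build.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use File::Temp qw(tempfile);

# Hypothetical helper: names of the layers on the output side of a handle.
sub layer_names {
    my ($fh) = @_;
    return PerlIO::get_layers($fh, output => 1);
}

# A scratch file, so the sketch does not touch anything real.
my ($fh, $path) = tempfile(UNLINK => 1);

my @before = layer_names($fh);

binmode $fh, ':crlf'
    or die "binmode: $!";

my @after = layer_names($fh);

print "before: @before\n";
print "after:  @after\n";

close $fh;
```

On a plain file handle like this one, I would expect crlf to show up exactly once in the second list, whether or not it was already there in the first.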

I have demonstrated in the past that I don't necessarily understand PerlIO layers very well. But, perldoc PerlIO says:

    :crlf
        A layer that implements DOS/Windows like CRLF line endings. On read converts pairs of CR,LF to a single "\n" newline character. On write converts each "\n" to a CR,LF pair. Note that this layer will silently refuse to be pushed on top of itself.

(emphasis mine)
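The documented behavior can be checked in isolation with core Perl only. The sketch below asks for :crlf once and then twice on an ordinary file handle and compares the raw bytes written. write_with_crlf is a hypothetical helper for this post, not part of any module. My reading is that the refusal only applies when :crlf sits directly on top of another :crlf, which would explain why a :tee in between gets around it, but treat that as an assumption rather than a diagnosis.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use File::Temp qw(tempfile);

# Hypothetical helper: write $text through a handle on which :crlf has
# been requested $times times, then return the raw bytes in the file.
sub write_with_crlf {
    my ($text, $times) = @_;

    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;    # start without CRLF translation on every platform

    for (1 .. $times) {
        binmode $fh, ':crlf'
            or die "binmode: $!";
    }

    print {$fh} $text;
    close $fh
        or die "close: $!";

    # Read the file back untranslated.
    open my $in, '<:raw', $path
        or die "open $path: $!";
    local $/;
    my $bytes = <$in>;
    close $in;

    return $bytes;
}

# Same formatting trick as the hexdump sub in the first script.
sub hexdump { sprintf('%*v02x', ' ', $_[0]) }

print hexdump(write_with_crlf("abc\n", 1)), "\n";
print hexdump(write_with_crlf("abc\n", 2)), "\n";
```

If the second push really is refused, both lines should show a single 0d before the 0a, unlike the files written through :tee.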

Any ideas?