Perl6 UTF-8 output in cmd.exe on Windows 8.1

A. Sinan Unur, November 21, 2014

Back in May, starting with "UTF-8 output from Perl and C programs in cmd.exe on Windows 8", I wrote a few posts summarizing my bewilderment at the extra bytes appearing in UTF-8 output from perl when the cmd.exe code page was set to UTF-8 via chcp 65001.

The same problem still exists with the perl 5.20.1 I built recently.

So, I decided to see what perl6, using MoarVM, can give me.

In the same cmd.exe window, I typed:

C:\Temp> perl6 -e "Buf.new(0xce, 0xb1, 0xce, 0xb2, 0xce, 0xb3, 0x31).decode('UTF-8').say"
αβγ1

or, in a script:

use v6;
'αβγ1'.say;

which gave me the output:

αβγ1

It may not seem like much, but remember that the equivalent Perl 5 script:

use utf8;
use strict;
use warnings;
use warnings qw(FATAL utf8);
binmode STDOUT, ':utf8';
print 'αβγ1', "\n";

outputs

αβγ1 1
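
For reference, the seven bytes passed to Buf.new in the perl6 one-liner are the UTF-8 encoding of the four characters αβγ1. The following is a minimal Perl 5 sketch, assuming the core Encode module, that decodes those bytes and lists the resulting code points. The helper name utf8_codepoints is hypothetical and only for illustration. It prints ASCII only, so it should sidestep the console output problem entirely.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a UTF-8 byte string and return its code points.
sub utf8_codepoints {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);
    return map { ord } split //, $text;
}

# The same seven bytes passed to Buf.new in the perl6 one-liner.
my $bytes = pack 'C*', 0xce, 0xb1, 0xce, 0xb2, 0xce, 0xb3, 0x31;

# Print only ASCII so the result does not depend on the console code page.
printf "%d bytes -> %s\n", length($bytes),
    join ' ', map { sprintf 'U+%04X', $_ } utf8_codepoints($bytes);
```

Each of the three Greek letters takes two bytes in UTF-8 and the digit takes one, which accounts for seven bytes and four code points.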

Let's see how perl6 handles some Turkish text:

.say for ("Hava karlı", "Bu iş kârlı", "İstanbul", "Yağ yağ yağmur");

Hava karlı
Bu iş kârlı
İstanbul
Yağ yağ yağmur

And, then,

say "karlı" eq "kârlı";

False

which means perl6 understands the difference between karlı (snowy) and kârlı (profitable).
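
The two words differ in a single character: a plain a (U+0061) versus â (U+00E2), assuming the precomposed form of the accented letter. As a rough sketch in Perl 5, the hypothetical helper first_difference below (not from any library) locates that position. The strings are written with \x{...} escapes so the source file stays ASCII-only and its encoding does not matter.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first differing character, or -1 if equal.
sub first_difference {
    my ($x, $y) = @_;
    my $n = length($x) < length($y) ? length($x) : length($y);
    for my $i (0 .. $n - 1) {
        return $i if substr($x, $i, 1) ne substr($y, $i, 1);
    }
    return length($x) == length($y) ? -1 : $n;
}

# U+0131 is dotless i, U+00E2 is a with circumflex.
my $snowy      = "karl\x{131}";
my $profitable = "k\x{e2}rl\x{131}";

my $i = first_difference($snowy, $profitable);
printf "differ at index %d: U+%04X vs U+%04X\n", $i,
    ord(substr $snowy, $i, 1), ord(substr $profitable, $i, 1);
```

If the accented letter were stored in decomposed form (a followed by a combining circumflex), the strings would also differ in length, so this comparison is only a sketch for the precomposed case.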

Of course, so does perl5 ... but, when it comes to printing, it has issues:

use 5.020;
use utf8;
binmode STDOUT, ":utf8";
print for "karlı", "kârlı";

gives the output: