mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to generate a span of Unicode characters. I am clearly missing some understanding about working with Unicode. The following produced no output except for a newline:

perl -Mutf8 -e 'binmode(STDOUT, ":utf8"); $a=join("", "\x{DF}" .. "\x{0101}"); print "$a\n";'

That style worked fine for the ASCII range, as in join("", "A" .. "Z"), but maybe I was doing it wrong there too. So what is a correct way to generate a string consisting of a span of Unicode characters?

Re: Generating a range of Unicode characters
by davido (Cardinal) on Nov 16, 2017 at 05:46 UTC

    Check out perlop Auto-increment and Auto-decrement for an explanation.

    The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:

    The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^[a-zA-Z]*[0-9]*\z/ , the increment is done as a string, preserving each character within its range, with carry:

    print ++($foo = "99");      # prints "100"
    print ++($foo = "a0");      # prints "a1"
    print ++($foo = "Az");      # prints "Ba"
    print ++($foo = "zz");      # prints "aaa"

    The endpoints of the range you are trying to construct, "\x{DF}" and "\x{0101}", do not match that pattern, so they do not meet the criteria for Perl's built-in magical auto-increment behavior.

    However, if you're using Perl 5.26 or newer and enable the unicode_strings feature, you can use the following, as documented in perlop under Range Operators:

    use charnames "greek";
    my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}"));

    Or forgo the \N{charname} lookups and just use the actual ordinal values:

    my @chars = map {chr} $ord_first .. $ord_last;

    Dave

Re: Generating a range of Unicode characters
by Your Mother (Archbishop) on Nov 16, 2017 at 06:13 UTC

    Is this what you're after?

    perl -CSD -le 'print chr for 0xDF .. 0x0101'

    Update: I hadn't read all the way down davido's post. He already makes the same suggestion at the end.

      You have been upvoted: I took way too long to get to the point. ;)


      Dave

        Thanks, both of you. The following produces the output I am expecting.

        perl -e '$a=join("",map ({chr} 0xdf .. 0x0101)); print "$a\n";'

        I guess there is no way to do the Unicode equivalent of 'A' .. 'Z' instead.

        Is there some way to skip the map() or chr() functions?

        :P My short attention span cannot excuse me. You covered all the bases quite nicely, as usual. I was reading on my phone though, if that is a mitigating factor.

Re: Generating a range of Unicode characters
by Anonymous Monk on Nov 16, 2017 at 09:50 UTC