in reply to Making $ Unicode-aware

A better question is: what would we gain by doing this? How is \R different from \n? Are there other line breaks meaningfully different from ASCII LF, or is the Unicode committee just wasting code points again?

Re^2: Making $ Unicode-aware
by jo37 (Curate) on Jul 27, 2020 at 06:02 UTC

    From perlrebackslash:

    \R is equivalent to (?>\x0D\x0A|\v)
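
    As a rough illustration of that equivalence, the sketch below counts how many breaks \R finds in a few strings and shows what the atomic group (?>...) implies. The helper name count_breaks is made up for this example and is not from the documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: count the line breaks \R finds in a string.
sub count_breaks {
    my ($text) = @_;
    my $count = () = $text =~ /\R/g;   # list assignment in scalar context
    return $count;
}

print count_breaks("a\x0D\x0Ab"), "\n";   # 1: CR LF is taken as one unit
print count_breaks("a\x0Ab"), "\n";       # 1: lone LF, via \v
print count_breaks("a\x0Db"), "\n";       # 1: lone CR, via \v
print count_breaks("a\x0A\x0Db"), "\n";   # 2: LF then CR is not a pair

# The group is atomic: once \R has taken CR LF it does not give the LF back.
my $crlf       = "\x0D\x0A";
my $atomic     = $crlf =~ /\A\R\x0A\z/              ? "match" : "no match";
my $non_atomic = $crlf =~ /\A(?:\x0D\x0A|\v)\x0A\z/ ? "match" : "no match";
print "atomic: $atomic\n";           # atomic: no match
print "non-atomic: $non_atomic\n";   # non-atomic: match
```

    The last two lines are the practical difference between \R and a plain alternation: \R never splits a CR LF pair.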

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

      Is that really intended to match only CRLF, or should it be (?>\x0D?\x0A|\v) so that it also matches traditional *nix line endings? (Even then, (?>\x0D?\x0A|\v) would not match the traditional CR-only Macintosh line ending.) Why is vertical tab included?

        In a Perl regex, \v is not the vertical tab. It matches the character class "vertical whitespace". perlrecharclass lists the characters belonging to this class as:

        U+000A LINE FEED
        U+000B LINE TABULATION
        U+000C FORM FEED
        U+000D CARRIAGE RETURN
        U+0085 NEXT LINE
        U+2028 LINE SEPARATOR
        U+2029 PARAGRAPH SEPARATOR

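        So \R matches every single vertical whitespace character, plus the two-character sequence CR LF. A lone LF and a lone CR are already covered by \v, so this includes all common line endings: LF (*nix), CR LF (Windows), and CR (classic Mac OS).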
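
        A quick way to check this is to test each candidate against /\A\R\z/. This is only a sketch: the helper is_single_break and the sample list are my own, and unicode_strings is enabled so that U+0085 is handled under Unicode rules whatever the string's internal encoding.

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# Hypothetical helper: true if the whole string is exactly one \R match.
sub is_single_break {
    my ($s) = @_;
    return $s =~ /\A\R\z/ ? 1 : 0;
}

# Name/string pairs, kept in an array so the output order is fixed.
my @samples = (
    [ 'LF'                  => "\x0A" ],
    [ 'LINE TABULATION'     => "\x0B" ],
    [ 'FORM FEED'           => "\x0C" ],
    [ 'CR'                  => "\x0D" ],
    [ 'NEXT LINE'           => "\x{85}" ],
    [ 'LINE SEPARATOR'      => "\x{2028}" ],
    [ 'PARAGRAPH SEPARATOR' => "\x{2029}" ],
    [ 'CR LF'               => "\x0D\x0A" ],
    [ 'TAB'                 => "\x09" ],       # horizontal, so no
    [ 'LF CR'               => "\x0A\x0D" ],   # two breaks, so no
);

for my $sample (@samples) {
    my ($name, $string) = @$sample;
    printf "%-20s %s\n", $name, is_single_break($string) ? 'yes' : 'no';
}
```

        Everything in the list above TAB should print "yes"; TAB and the reversed pair LF CR should print "no".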

        Greetings,
        -jo

        $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$