in reply to How does the built-in function length work?

length doesn't know anything about encodings. It counts the characters in the string, whether those characters happen to be bytes, Unicode code points or something entirely different.

If you pass encoded text to it (bytes), it will count the bytes.

If you pass decoded text to it (Unicode code points), it will count the Unicode code points.
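To make the difference concrete, here is a minimal sketch using the core Encode module. The sample text and the choice of UTF-8 are my own assumptions for illustration; any encoding would show the same effect.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Decoded text: six Unicode code points
# (E-acute, r, i, c, right single quotation mark, s).
my $chars = "\x{C9}ric\x{2019}s";

# Encoded text: the same name as UTF-8 bytes.
# E-acute takes 2 bytes and the quotation mark takes 3.
my $bytes = encode('UTF-8', $chars);

printf "decoded: %d\n", length($chars);                     # 6
printf "encoded: %d\n", length($bytes);                     # 9
printf "round trip: %d\n", length(decode('UTF-8', $bytes)); # 6
```

length does the same thing in all three calls: it counts the elements of the string it was handed. Only the input differs.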

Bytes
    "\xC9\x72\x69\x63"
    String: C9 72 69 63
    Length: 4

Unicode code points
    "\N{LATIN CAPITAL LETTER E WITH ACUTE}ric"
    String: C9 72 69 63
    Length: 4

Unicode code points
    "\N{LATIN CAPITAL LETTER E WITH ACUTE}ric\N{RIGHT SINGLE QUOTATION MARK}s"
    String: C9 72 69 63 2019 73
    Length: 6

There are many ways of creating each of the above strings. I just listed one as an example. It doesn't matter how the string is created.
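As a rough illustration, the sketch below builds the four-character string in several different ways and shows that length reports the same thing for each. The particular constructions (and the use of Encode's decode with ISO-8859-1) are just examples I picked; they are not the only options.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Four ways of producing the same four-character string.
my @same = (
    "\xC9\x72\x69\x63",               # hex escapes for every character
    "\x{C9}ric",                      # one escape plus literal ASCII
    chr(0xC9) . "ric",                # chr() and concatenation
    decode('ISO-8859-1', "\xC9ric"),  # decoding Latin-1 bytes
);

for my $s (@same) {
    # Dump each character's ordinal in hex, then the length.
    my $hex = join ' ', map { sprintf '%02X', ord($_) } split //, $s;
    print "String: $hex  Length: ", length($s), "\n";
}
```

Every iteration prints the characters C9 72 69 63 and a length of 4, even though the strings may be stored differently internally.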

Re^2: How does the built-in function length work?
by PerlOnTheWay (Monk) on Feb 10, 2012 at 01:37 UTC

    I tried:

    print length("\N{LATIN CAPITAL LETTER E WITH ACUTE}ric");

    and it reports a syntax error.

      You need something like this before you can use named Unicode characters:

      use charnames ':full';


      By syntax error, do you mean the following?

      Constant(\N{LATIN CAPITAL LETTER E WITH ACUTE}ric) unknown: (possibly a missing "use charnames ...") at - line 1, within string
      Execution of - aborted due to compilation errors.

      If so, like the message says, it's because you need to add use charnames ':full';. If not, could you be more specific? Maybe your version of Perl predates \N{}?
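For reference, a minimal complete version of your snippet with the pragma added might look like the following. The variable name $name is only for illustration.

```perl
use strict;
use warnings;
use charnames ':full';   # enables \N{CHARACTER NAME} in string literals

my $name = "\N{LATIN CAPITAL LETTER E WITH ACUTE}ric";
print length($name), "\n";   # prints 4
```

Nothing is printed except the number, so no "Wide character" warning or output layer comes into play here.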

      PS — charnames will be loaded automatically when needed in 5.16.