in reply to Counting bytes in a Unicode document

I'm confused: do you want to count the bytes or the characters in $data?

Update
I've never used read, but it looks like you are using -s to get the file size in bytes, while read will interpret that length as a number of Unicode characters because of the utf8 layer. Is this really what you want?
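To make the mismatch concrete, here is a minimal sketch. The temp file and its contents are made up for illustration, and it assumes the handle is opened with an :encoding(UTF-8) layer, which may not match the original code exactly.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample file: "café\n" written as raw UTF-8 octets.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "caf\xC3\xA9\n";          # 6 bytes on disk, 5 characters once decoded
close $tmp or die "close: $!";

my $size = -s $path;                   # file size in bytes

# With a decoding layer on the handle, read() counts characters, not bytes.
open my $fh, '<:encoding(UTF-8)', $path or die "open: $!";
my $got = read $fh, my $data, $size;   # asks for up to $size characters
close $fh;

printf "-s says %d, read returned %d, length is %d\n", $size, $got, length $data;
```

Here read hits end of file before it has collected $size characters, so it happens to return the whole file anyway, but the number it returns is a character count and will not equal -s as soon as the file contains non-ASCII text.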

That's kind of a "creative" way to slurp a whole file...

Anyway, to answer the title's question:

Remove temporarily the utf8 flag and use length then.
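One way to sketch this without touching the flag directly is to encode a copy of the string to UTF-8 octets and take the length of that. This is a swapped-in equivalent, not a literal flag toggle, and the sample string is an assumption for illustration.

```perl
use strict;
use warnings;

my $data = "smile \x{263A}";    # 7 characters; U+263A takes 3 bytes in UTF-8

my $chars = length $data;       # character count

my $octets = $data;             # work on a copy so $data stays a character string
utf8::encode($octets);          # in place: characters -> UTF-8 octets
my $bytes = length $octets;     # now counts octets

printf "%d characters, %d bytes as UTF-8\n", $chars, $bytes;
```

Working on a copy avoids having to restore anything afterwards, and the result does not depend on how perl happens to store $data internally.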

While I doubt that's what you want, others might stumble over this thread asking exactly this.

Cheers Rolf
(addicted to the Perl Programming Language :)
see Wikisyntax for the Monastery


Re^2: Counting bytes in a Unicode document
by ysth (Canon) on Oct 08, 2024 at 01:11 UTC
    Remove temporarily the utf8 flag and use length then.
    You can just use bytes::length (after use bytes ();).
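A minimal sketch of that suggestion, with a made-up sample string. Note that bytes::length reports the size of perl's internal storage, which is UTF-8 for a string containing characters above 255 but may be one byte per character otherwise, and the bytes documentation discourages its use outside of debugging.

```perl
use strict;
use warnings;
use bytes ();                          # load the module without enabling byte semantics

my $data = "price: \x{20AC}5";         # 9 characters; U+20AC takes 3 bytes in UTF-8

my $chars = length $data;              # character count
my $bytes = bytes::length($data);      # size of the internal (here UTF-8) representation

printf "%d characters, %d bytes internally\n", $chars, $bytes;
```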

    --
    A math joke: r = | |csc(θ)|+|sec(θ)| |-| |csc(θ)|-|sec(θ)| |

        Both of those hacks will fail on Windows or if a decoding error occurs.