
Re^3: Determining content-length for an HTTP Post

by WizardOfUz (Friar)
on Nov 25, 2009 at 17:53 UTC


in reply to Re^2: Determining content-length for an HTTP Post
in thread Determining content-length for an HTTP Post

use bytes; my $length_in_bytes = length( $xmldata );

Replies are listed 'Best First'.
Re^4: Determining content-length for an HTTP Post
by ikegami (Patriarch) on Nov 25, 2009 at 18:38 UTC
    If the problem is that he forgot to encode his XML, the solution is NOT to get the length of the internal representation of the XML, it's to encode the XML.
    use Encode qw( encode );

    # Or whatever encoding you specified in <?xml?>
    $xmldata = encode('UTF-8', $xmldata);
    my $length = length( $xmldata );
    or
    utf8::encode( $xmldata ); my $length = length( $xmldata );
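A minimal sketch of the encode-then-measure approach suggested above, assuming the payload is a character string and the wire encoding is UTF-8. The sample `$xmldata` and the `$body` and `$content_length` names are illustrative only; they are not from the original poster's code.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Illustrative payload: a character string with one non-ASCII character (38 characters).
my $xmldata = "<?xml version=\"1.0\"?><root>\x{C9}ric</root>";

# Encode first, so the scalar holds exactly the bytes that will be sent.
my $body = encode('UTF-8', $xmldata);

# On an encoded string, length() counts bytes, which is what Content-Length needs.
my $content_length = length($body);

print "Content-Length: $content_length\n";
```

The same `$body` should then be sent as the request content, so the header and the bytes on the wire cannot disagree.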

      Huh? The bytes pragma simply forces $xmldata to be treated as a series of bytes. This should give us the correct value for the Content-Length header whether $xmldata is a character string or a UTF-8 encoded byte string. Or am I missing something?

        This should give us the correct value for the Content-Length header

        No.

        If the XML is valid, length gives the right answer without use bytes:

        $ perl -le'
            $_ = "<?xml version=\"1.0\"?><root>\x{C9}ric</root>";
            utf8::encode($_);
            utf8::downgrade($_);
            print length;
            print do { use bytes; length };
        '
        39
        39

        With use bytes;, you can get the wrong answer:

        $ perl -le'
            $_ = "<?xml version=\"1.0\"?><root>\x{C9}ric</root>";
            utf8::encode($_);
            utf8::upgrade($_);
            print length;
            print do { use bytes; length };
        '
        39
        41   XXX Should be 39

        If the XML hasn't been encoded, use bytes can give you the right result if the desired encoding is UTF-8, but it's unreliable:

        $ perl -le'
            $_ = "<?xml version=\"1.0\"?><root>\x{C9}ric</root>";
            print do { use bytes; length };
        '
        38   XXX Should be 39

        In no case is use bytes; the appropriate answer.

        Perl has two different formats for storing strings. use bytes; causes opcodes to look directly at the internal buffer of the string no matter which format was used. Since Perl is free to change how it internally stores the string at will, it's quite useless to use use bytes; without checking which format Perl used for that string.
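To make the two storage formats concrete, here is a small sketch that holds the same encoded bytes in both formats and compares plain length with bytes::length. The shortened sample string and the variable names are illustrative; `use bytes ();` is used only to load the bytes:: functions without turning on byte semantics.

```perl
use strict;
use warnings;
use bytes ();   # load bytes::length without enabling byte semantics

# Illustrative string: 17 characters, 18 bytes once encoded as UTF-8.
my $encoded = "<root>\x{C9}ric</root>";
utf8::encode($encoded);

# Same string value, stored in each of Perl's two internal formats.
my $down = $encoded; utf8::downgrade($down);
my $up   = $encoded; utf8::upgrade($up);

for my $pair ( [ downgraded => $down ], [ upgraded => $up ] ) {
    my ($label, $s) = @$pair;
    printf "%-10s is_utf8=%d length=%d bytes::length=%d\n",
        $label, ( utf8::is_utf8($s) ? 1 : 0 ), length($s), bytes::length($s);
}

# The two scalars compare equal; only the internal buffers differ.
print "equal\n" if $down eq $up;
```

Plain length reports 18 for both copies, while bytes::length reports 18 for the downgraded copy and 20 for the upgraded one, which is the discrepancy described above.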

        Update: Rephrased for clarity.
