Re^6: Cannot access HTTP::Response content properly
by Anonymous Monk on Nov 03, 2009 at 16:21 UTC

    OK, so then I need to encode my output as ASCII text somehow. You'll have to excuse my ignorance here, as this is totally uncharted territory for me. Would something like the following work?

    use Encode qw(encode);
    . . .
    my $decodedContent = encode("ascii", $res->content);

    Where $decodedContent is the text that I am looking for?

    Is there a difference if I use HTTP::Message->encode($encoding) instead? I don't fully understand the man pages on this.

      There are two kinds of encoding at play here: Transfer Encoding and Character Encoding.

      Transfer encoding allows the content to be compressed during transit among other things. You want the actual content, not the temporary version used for transit, which is why you want to use ->decoded_content. ->content returns the representation of the content during transit, something that's useless on its own.
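
      A minimal sketch of the difference, assuming libwww-perl (HTTP::Response) is installed. The response is built by hand with a gzip Content-Encoding purely for illustration, so no real server is involved:

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical response, built by hand so no network access is needed.
my $plain = "Hello World\n";
gzip(\$plain => \my $gzipped) or die "gzip failed: $GzipError";

my $res = HTTP::Response->new(200, 'OK', [
    'Content-Type'     => 'text/plain; charset=US-ASCII',
    'Content-Encoding' => 'gzip',
], $gzipped);

my $raw     = $res->content;          # compressed bytes, as sent in transit
my $decoded = $res->decoded_content;  # the actual text

printf "content:         %d bytes of gzip data\n", length $raw;
print  "decoded_content: $decoded";
```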

      Character encoding is what allows characters to be represented as bytes. For example, the character encoding US-ASCII associates byte 0x41 with character LATIN CAPITAL LETTER A. The same character is associated with bytes 00 41 using encoding UTF-16be.
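
      To make that concrete, here is a small sketch using the core Encode module. It encodes the same character with both encodings and shows the resulting bytes in hex (the variable names are only illustrative):

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "A";  # LATIN CAPITAL LETTER A

# encode() turns characters into bytes; unpack 'H*' shows those bytes in hex.
my $ascii_hex = unpack('H*', encode('US-ASCII', $char));  # 41
my $utf16_hex = unpack('H*', encode('UTF-16BE', $char));  # 0041

print "US-ASCII: $ascii_hex\n";
print "UTF-16BE: $utf16_hex\n";
```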

      In files, characters can only appear in their encoded form. Internally, the same is true for memory, but Perl allows you to work with characters directly instead of the underlying bytes that form them.

      That means that you can decode 00 41 to A, but you need to encode A back into bytes if you want to save it to disk. (print expects bytes, not characters, unless you have told it what to do with characters by using binmode with an :encoding layer.)
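
      A short sketch of that round trip, again with the core Encode module. The choice of UTF-8 for the output layer is an assumption made for the example; use whatever encoding the consumer of your output expects:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\x00\x41";                  # UTF-16BE bytes
my $chars = decode('UTF-16BE', $bytes);  # one character: "A"

# Tell print how to turn characters back into bytes on the way out.
binmode(STDOUT, ':encoding(UTF-8)');
print $chars, "\n";
```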

      ->decoded_content will also decode the character encoding for you if the web server specifies that the content is some kind of text (including HTML and XML) and specifies its character encoding. That can actually be bad, so you can disable that feature by specifying charset => 'none'.
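
      A rough illustration of both behaviours, assuming libwww-perl is installed. The response is constructed by hand with a UTF-16BE charset as a stand-in for what a server might send:

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical response whose body is "A" encoded as UTF-16BE.
my $res = HTTP::Response->new(200, 'OK',
    [ 'Content-Type' => 'text/plain; charset=UTF-16BE' ],
    "\x00\x41");

my $chars = $res->decoded_content;                     # characters: "A"
my $bytes = $res->decoded_content(charset => 'none');  # bytes left as sent

printf "default:           %d character(s)\n", length $chars;
printf "charset => 'none': %d byte(s)\n",      length $bytes;
```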


      You seem to have assumed that the content being encoded using UTF-16be is a problem. It's not necessarily, and trying to "fix" it could actually break it. For example, if the file is an XML document, it's not safe to change its encoding, since the document's encoding is specified in the document itself. The same is often the case for HTML documents as well.

      Some background info on what you are trying to do would be useful.

        Here is the background info you requested. I am basically trying to call a Perl script from within a C program. However, I don't want to just make a system() call; I want to be able to capture the output (stdout) of the Perl script. If I run the Perl script on its own, the output is displayed on stdout. If I call it from the C program, it is not. At first I thought this was a problem with my C program, but I am pretty sure it is not. Why? Because if I assign an arbitrary string (e.g. "Hello World") to a variable in that same Perl script and print it, then my C program is able to capture the output from stdout. But if I use "print $res->decoded_content", it does not. Just for clarification, here is a portion of the C program:

        char command[] = "/home/user/scripts/PerlScript.pl";
        . .
        fp = popen(command, "r");
        buffer = (char *)malloc(sizeof(char) * bufSize);
        while (fgets(buffer, bufSize, fp) != NULL)
            fputs(buffer, stdout);
        pclose(fp);
        free(buffer);

        The above program calls the Perl script I have written that is in question. Again, if I do the following it prints when called from the C program:

        my $randomVar = "Hello World\n";
        print $randomVar;

        But if I do this, it will NOT print from the C program (unless I call the Perl script by itself from the command line):

        my $res = $ua->get('http://SOMEURLHERE');
        print $res->decoded_content;