epoptai has asked for the wisdom of the Perl Monks concerning the following question:

Last night I solved yet another problem with Perl. It was a simple problem/solution, which I submit to you for a sanity check.

I'm writing a webmail script and saving sent mail to a pipe delimited flat file like so:

print FILE "$when|$to|$from|$subject|$body\n";
The problem was line breaks from the textarea messing up the $body part. I tried many combinations of textarea properties and such but only $body=~s/\n/ /g; got it all on one line, but that solution ruined the formatting of the original message which needs to be preserved.

My solution is to uuencode the contents of $body for saving, and uudecode $body for displaying. It seems to work very well, but I wonder if it's the best way or if there may be unforeseen problems down the road.

my $encoded_body = pack ("u", $body);
$encoded_body =~ s/\n//g;
print FILE "$when|$to|$from|$subject|$encoded_body\n";
Sample record:
Fri Dec 15 15:34:10 2000|to@net.net|from@net.net|Hello|E5&AI<R!I<R!T:&4@<V%M<&QE(&UE<W-A9V4A#0H-"D)Y92X-"@``
Decoding:
$decoded_body = unpack ("u", $encoded_body);
Yields:
This is the sample message!

Bye.
thanks - epoptai
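For anyone who wants to try this, here is a minimal round-trip sketch of the approach above. It assumes an in-memory filehandle in place of FILE and that none of the other fields contains a `|`. The helper names `save_record` and `load_record` are made up for illustration, and the sample body is under 45 bytes, so longer bodies are not exercised here.

```perl
use strict;
use warnings;

# Hypothetical helpers; the record layout follows the post above.
sub save_record {
    my ($fh, $when, $to, $from, $subject, $body) = @_;
    my $encoded_body = pack("u", $body);
    $encoded_body =~ s/\n//g;    # same newline stripping as in the post
    print {$fh} "$when|$to|$from|$subject|$encoded_body\n";
}

sub load_record {
    my ($line) = @_;
    chomp $line;
    # A limit of 5 keeps everything after the fourth '|' in the last field.
    my ($when, $to, $from, $subject, $encoded_body) = split /\|/, $line, 5;
    return ($when, $to, $from, $subject, unpack("u", $encoded_body));
}

# Write one record to an in-memory "file", then read it back.
my $file = '';
open my $out, '>', \$file or die "open: $!";
save_record($out, 'Fri Dec 15 15:34:10 2000', 'to@net.net', 'from@net.net',
            'Hello', "This is the sample message!\r\n\r\nBye.\r\n");
close $out;

my @fields = load_record($file);
print "subject: $fields[3]\n";
print "body:\n$fields[4]";
```

The body comes back with its original line endings, which is the property the substitution with a space could not give.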

Replies are listed 'Best First'.
Re: uuencoding to deal with line breaks
by dws (Chancellor) on Dec 16, 2000 at 02:58 UTC
    Try URL encoding the body instead of uuencoding it.

    You can use CGI::encode(), or roll your own via

      ($encoded = $body) =~ s/([^a-zA-Z0-9_.-])/uc sprintf("%%%02x",ord($1))/eg;
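A sketch of the roll-your-own route with the matching decoder, assuming $body holds plain bytes (no characters above 255). The sub names `url_encode` and `url_decode` are only illustrative.

```perl
use strict;
use warnings;

# Percent-encode every byte outside [A-Za-z0-9_.-], as in the one-liner above.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^a-zA-Z0-9_.-])/sprintf("%%%02X", ord($1))/eg;
    return $text;
}

# Reverse the encoding: each %XX becomes the byte with that hex value.
sub url_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

my $body    = "Line one\r\nLine two | with a pipe\r\n";
my $encoded = url_encode($body);
my $decoded = url_decode($encoded);

print "$encoded\n";
```

Because both "\n" and "|" fall outside the safe set, the encoded text can sit in the last field of a pipe-delimited, one-record-per-line file.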
      Thanks dws. I like that uu can't be casually read and is a bit smaller than url encoded output. I didn't know about CGI::Encode.

      I tried ichimunki's suggestion but ended up with line breaks in the file. That's what I was trying before resorting to uu. I'm being used by windows if that makes a difference.

      I0 has me stumped. I tried a few 45s short of 45k or 45megs and noticed nothing strange. But I'm hoping it never throws a | in there :-)
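On the worry about a stray `|`: as far as I know, Perl's pack "u" only emits the backtick, the characters "!" through "_", and newlines, so "|" should never show up in the encoded body. A quick check, as a sketch rather than a proof:

```perl
use strict;
use warnings;

# Encode every possible byte value once and inspect the output alphabet.
my $input   = join '', map { chr } 0 .. 255;
my $encoded = pack "u", $input;

my $has_pipe      = $encoded =~ /\|/                 ? 1 : 0;
my $only_expected = $encoded =~ /\A[\x21-\x60\n]+\z/ ? 1 : 0;

print "contains a pipe: $has_pipe\n";
print "only ! through ` plus newline: $only_expected\n";
```

The other fields (subject, addresses) get no such protection, so a `|` typed into the subject would still split the record in the wrong place.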

      45?

        With $body = "\n" x 46, $encoded_body = pack "u", $body will contain "\n". If you were worried about "\n" in $body, what about "\n" in $encoded_body?
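The point is easy to see by counting newlines in the encoded text. This is a small sketch; the variable `$newlines` is just for illustration.

```perl
use strict;
use warnings;

my $body         = "\n" x 46;
my $encoded_body = pack "u", $body;

# pack "u" starts a new output line after every 45 input bytes,
# so 46 bytes of input produce two newline-terminated lines.
my $newlines = ($encoded_body =~ tr/\n//);
print "newlines in encoded body: $newlines\n";

# With the newlines left in place, decoding restores the input.
my $decoded_body = unpack "u", $encoded_body;
print "round trip ok\n" if $decoded_body eq $body;
```

The s/\n//g step in the original post removes these newlines before saving. Whether unpack "u" copes with the stripped form for every input is worth testing against your own Perl rather than assuming.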
Re: uuencoding to deal with line breaks
by I0 (Priest) on Dec 16, 2000 at 02:19 UTC
    What happens when length $body > 45?
Re: uuencoding to deal with line breaks
by ichimunki (Priest) on Dec 16, 2000 at 05:02 UTC
    Based on your sample message it would appear that uuencoding the body is just going to make your flat file larger than it really wants to be. You might want to consider simply converting \n to some other control character (preferably one that won't be appearing on its own in an email) so that it doesn't interfere with using \n as a delimiter in the file.

    $body =~ s/\n/\r/sg;

    and then

    $body =~ s/\r/\n/sg;

    you might try \a, since forgetting to change this back to \n will ring the bell, which should keep people from using cat to look at the files more than once... *grin*
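A sketch of the substitution idea with "\a" as the placeholder. It also folds "\r\n" and lone "\r" into "\n" first, on my assumption that a Windows textarea submits "\r\n", so the original CRLF endings are not restored on the way back. The names `flatten_body` and `restore_body` are made up.

```perl
use strict;
use warnings;

# Collapse the body onto one line, using BEL as a stand-in for newline.
sub flatten_body {
    my ($body) = @_;
    $body =~ s/\r\n?/\n/g;    # CRLF or lone CR -> LF
    $body =~ s/\n/\a/g;       # LF -> BEL placeholder
    return $body;
}

# Turn the placeholders back into newlines for display.
sub restore_body {
    my ($flat) = @_;
    $flat =~ s/\a/\n/g;
    return $flat;
}

my $flat     = flatten_body("First line\r\n\r\nSecond paragraph\r\n");
my $restored = restore_body($flat);
print $restored;
```

This only stays reversible as long as the message itself never contains "\a".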

      Covered this one, but your post got me thinking about efficiency. 30k of text uuencodes to 40k and urlencodes to 47k. I wonder if there's a way to use pack or some other technique within perl to compress text, or at least improve on uu.

      ?

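        Uuencoded output will always be larger than the input because you're mapping 8-bit bytes onto a smaller set of printable characters that carry 6 bits each: every 3 input bytes become 4 output characters, plus a length character and a newline for each 45-byte line.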

        If you want to compress your data, use Compress::Zlib (or maybe some form of `gzip -cf`) and uuencode the compressed data like before.
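A sketch of the compress-then-uuencode idea, assuming Compress::Zlib is installed and using its compress() and uncompress() functions. The sample body is deliberately repetitive, so the saving here says little about real mail.

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports compress() and uncompress()

# Illustrative, highly repetitive sample text.
my $body = "The quick brown fox jumps over the lazy dog.\r\n" x 200;

# Saving: compress first, then uuencode the binary result.
my $encoded_body = pack "u", compress($body);

# Displaying: uudecode first, then uncompress.
my $compressed   = unpack "u", $encoded_body;
my $decoded_body = uncompress($compressed);

printf "original: %d bytes, compressed and uuencoded: %d bytes\n",
    length($body), length($encoded_body);
```

The uuencoded text still has a newline per output line, so it needs the same handling as before if the record has to stay on one line.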

        If that's still too big for you (or if you like complications), you can save your data as

          $when|$to|$from|$subject|$body_length|$binary_compressed_body

        and use read or sysread to read $binary_compressed_body.
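A sketch of how the length-prefixed layout could be read back with read. The helper names and the second sample record are made up, in-memory filehandles stand in for the real file, and arbitrary bytes stand in for the compressed body. On Windows a real file would need binmode on both handles.

```perl
use strict;
use warnings;

# Hypothetical helpers for records of the form
#   when|to|from|subject|body_length|<raw body bytes>\n
sub write_record {
    my ($fh, $when, $to, $from, $subject, $raw_body) = @_;
    my $body_length = length $raw_body;
    print {$fh} "$when|$to|$from|$subject|$body_length|$raw_body\n";
}

sub read_record {
    my ($fh) = @_;
    local $/ = "|";               # read one '|'-terminated field at a time
    my @header;
    for (1 .. 5) {
        my $field = <$fh>;
        return unless defined $field;
        chomp $field;             # chomp removes the trailing '|'
        push @header, $field;
    }
    my $body_length = $header[4];
    my $got = read($fh, my $raw_body, $body_length);
    die "short read" unless defined $got && $got == $body_length;
    read($fh, my $terminator, 1); # discard the "\n" that ends the record
    return (@header[0 .. 3], $raw_body);
}

my $file = '';
open my $out, '>', \$file or die "open: $!";
write_record($out, 'Fri Dec 15 15:34:10 2000', 'to@net.net', 'from@net.net',
             'Hello', "raw\0bytes|with\npipes and newlines");
write_record($out, 'Sat Dec 16 09:00:00 2000', 'a@net.net', 'b@net.net',
             'Second', "x\ny\n");
close $out;

open my $in, '<', \$file or die "open: $!";
my @first  = read_record($in);
my @second = read_record($in);
close $in;
```

Since the body is read by byte count, it may contain "|" and "\n" freely; only the header fields have to avoid "|".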

        Come to think of it, if the formatting of the body is important to you, why not just save the number of lines in the body as the 5th data field rather than the body itself and have the exact body text follow it? So it would look like

        $when|$to|$from|$subject|$lines_in_body
        $body   # could be multi-lined
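A sketch of the line-count layout, again with made-up helper names, a made-up second message, and an in-memory filehandle. It assumes the header fields contain no "|" and appends a final "\n" to the body when one is missing, so that every body line is terminated.

```perl
use strict;
use warnings;

sub write_message {
    my ($fh, $when, $to, $from, $subject, $body) = @_;
    $body .= "\n" unless $body =~ /\n\z/;    # terminate the last line
    my $lines_in_body = ($body =~ tr/\n//);
    print {$fh} "$when|$to|$from|$subject|$lines_in_body\n", $body;
}

sub read_message {
    my ($fh) = @_;
    my $header = <$fh>;
    return unless defined $header;
    chomp $header;
    my ($when, $to, $from, $subject, $lines_in_body) = split /\|/, $header;
    my $body = '';
    for (1 .. $lines_in_body) {
        my $line = <$fh>;
        last unless defined $line;           # truncated file
        $body .= $line;
    }
    return ($when, $to, $from, $subject, $body);
}

my $file = '';
open my $out, '>', \$file or die "open: $!";
write_message($out, 'Fri Dec 15 15:34:10 2000', 'to@net.net', 'from@net.net',
              'Hello', "This is the sample message!\n\nBye.\n");
write_message($out, 'Sat Dec 16 09:00:00 2000', 'a@net.net', 'b@net.net',
              'Second', "a | in the body is fine");
close $out;

open my $in, '<', \$file or die "open: $!";
my @first  = read_message($in);
my @second = read_message($in);
close $in;
```

The file stays readable with cat, at the cost of no longer being one record per line.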