hmbscully has asked for the wisdom of the Perl Monks concerning the following question:

I'm having problems with newlines in the code below (which is modified code, not something I wrote from scratch). Basically, there is an administrative tool (which I did not write and whose workings I don't know) where a user enters a message into some kind of interface, to be sent to a list of email addresses. Each email will include that message, a URL, and an individual username and password. Users enter returns in the message, and they need to for formatting. However, when the database generates the flat file that this script runs on, the $message field has the newlines in it, so the while loop doesn't work right: each read stops at the first newline inside the message, before the whole message has been read, let alone the other fields. How do I get the entire message field into $message with the newlines intact for sending?
#!/usr/bin/perl

$mailprog = '/usr/sbin/sendmail -t';
$admin_email="admin\@email.org";
$subject="Testing the new mass mailer";
$file = "/actapps/suitespot/cgi-bin/sender/returntest3.txt";
$number_sent = 0;

open(DATA, "$file") || die "Can't open $file"; # open the file for reading
while($line = <DATA>) {
    chop $line;
    ($junk,$email,$message,$survey_url,$login,$password) = split(/\|/, $line);
    open (MAIL, "|$mailprog -t") || die "Can't open $mailprog! \n";
    print MAIL "Content-type:text/plain\n";
    print MAIL "From: $admin_email\n";
    print MAIL "To: $email\n";
    print MAIL "Subject: $subject\n";
    print MAIL "$message\n\n";
    print MAIL "URL: $survey_url\n";
    print MAIL "Login: $login\n";
    print MAIL "Password: $password\n";
    close(MAIL);
    $number_sent++;
}

I'm mostly looking for solutions under the assumption that I can't change the data before I get it, but I'll take any suggestions. I'm still new at this fun called Perl; hopefully this is something simple and obvious, which would explain why I couldn't find any help in the scores of books strewn about. TIA. -W

Replies are listed 'Best First'.
Re: newlines and sendmail
by HyperZonk (Friar) on Aug 17, 2001 at 03:19 UTC
    Boy, does this look problematic! If there are newlines embedded in the records, then what is the record delimiter? Presumably, the answer is that when the specified pipe delimited fields have gone by, the next newline delimits the records; or more exactly, newlines within the message field are not record delimiters, but a newline outside of that field is a record delimiter. Because you don't seem to have control over the data that you are being fed, it looks like you are stuck with an extremely poorly formatted data set. This problem is actually only reducible if you promise that there won't be any pipes in the data. Promise? ;)

    The first problem, then, is that the angle bracket operator (that is, <DATA> in your example) will break the file input on each newline by default, so you will be breaking the $line input on those newlines embedded in the message section of the text. To properly parse this, your best bet is probably to slurp the entire file in one fell swoop by temporarily doing undef $/ (the record delimiter variable set to undef reads to the end of the file). Then you will have to "manually" parse the file by looking for the first newline following five pipe characters to build the records. This is not a trivial task, mind you!

    This is just intended as an overview of the method I would recommend. Given some time, I may work out a code implementation, or maybe some other monk will oblige before I get the chance.
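
    For what it's worth, here is a minimal sketch of the slurp-and-parse approach described above. It is untested against the real flat file and rests on several assumptions: every record has exactly six pipe-delimited fields in the order used by the original script, only the third field (the message) contains newlines, no field contains a literal pipe, and line endings are plain "\n". The names slurp_file and parse_records and the sample addresses are made up for illustration.

```perl
use strict;
use warnings;

# Read a whole file into one string by undefining $/ for the read.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    local $/;                 # undef means "read to end of file"
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Split slurped text into records. A record ends at the first newline
# (or end of input) after its fifth pipe; newlines inside the third
# field are kept as part of the message.
sub parse_records {
    my ($text) = @_;
    my $field = qr/[^|\n]*/;  # ordinary field: no pipes, no newlines
    my $msg   = qr/[^|]*/;    # message field: newlines allowed, pipes not
    my @records;
    while ($text =~ /\G($field)\|($field)\|($msg)\|($field)\|($field)\|($field)(?:\n|\z)/g) {
        push @records, {
            email      => $2,
            message    => $3,
            survey_url => $4,
            login      => $5,
            password   => $6,
        };
    }
    return @records;
}

# Small in-memory demo; with real data, pass slurp_file($file) instead.
my $sample = join "\n",
    'x|first@example.org|Hello,',
    'this message spans two lines.|http://example.org/s|joe|secret1',
    'x|second@example.org|Short note|http://example.org/s|ann|secret2',
    '';
for my $rec (parse_records($sample)) {
    print "To: $rec->{email}\n$rec->{message}\n\n";
}
```

    Each hash could then be fed to the existing mail-sending code in place of the split() results. If a pipe ever does appear inside a message, this pattern will misparse or stop early, which is why the "no pipes" promise matters.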

    -HZ
Re: newlines and sendmail
by E-Bitch (Pilgrim) on Aug 17, 2001 at 02:25 UTC
    As a rule of thumb (correct me if I'm wrong here), try to use 'chomp' instead of chop, as chop merely kills the last character in the line, while chomp kills the last whitespace character (or none if the last character is alphanumeric).


    Could you post a sample file to go with this?
    Hope this helps!
    _________________________________________
    E-Bitch
    Tempora Mutantur Nos et Mutamur in Illis
    "The Times are Changed Even as We are Changed in Them"

      Your rule of thumb is very true, but that's not quite how chomp works: it checks whether the string ends with $/ (the input record separator) and, if so, removes it. By default $/ is "\n" on every platform; on Win32, text-mode filehandles translate "\r\n" (or perhaps more accurately "\015\012") to "\n" as the data is read, so chomp still removes the whole line ending.
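
      To make the difference concrete, here is a small sketch comparing the two. The helper names chomped and chopped are invented for this example; they just wrap the built-ins so the results are easy to print.

```perl
use strict;
use warnings;

# Return a copy of $s with a trailing record separator removed.
# $sep is optional and defaults to "\n" (Perl's default $/).
sub chomped {
    my ($s, $sep) = @_;
    local $/ = defined $sep ? $sep : "\n";
    chomp $s;
    return $s;
}

# Return a copy of $s with its last character removed, whatever it is.
sub chopped {
    my ($s) = @_;
    chop $s;
    return $s;
}

printf "[%s]\n", chomped("hello\n");            # newline removed
printf "[%s]\n", chomped("hello  ");            # trailing spaces are untouched
printf "[%s]\n", chopped("hello");              # loses the final "o"
printf "[%s]\n", chomped("hello\r\n", "\r\n");  # removes the whole "\r\n"
```

      Note that chomp only ever removes a trailing copy of $/, so other trailing whitespace survives, while chop removes one character unconditionally.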

      You can, of course, change $/ to whatever string you want, with interesting and sometimes comical results (and possibly useful obfuscatory tricks) that are left to the reader to discover. My favorite use of it is the implicit chomp in the -l command-line flag:

      perl -lpe 'BEGIN{$/ = "\r\n"}' -i.bak filename.txt    (in unix)
      perl -lp015 -e";" -i.bak filename.txt                 (in Win32, if I remember correctly)

      which changes your line endings to match your platform.



      If God had meant us to fly, he would *never* have given us the railroads.
          --Michael Flanders