jonnyfolk has asked for the wisdom of the Perl Monks concerning the following question:

I'm requiring user input to be comma delimited text. However, I know that someone is going to delimit with carriage returns. The sort of input I'm looking for from my text area is:

a1, b2, c3, d4

What I want to guard against is:
a1(carriage return)
b2(carriage return)
c3(carriage return)
d4(carriage return)

I thought to replace carriage returns with commas but this doesn't seem to be carried out - could someone explain what I am doing wrong, and how to do it properly?

Thanks.

$words = param('words');
$words=~s/\s+//g;
$words=~s/\r/,/g;
$words=~s/\n/,/g;
#comma delimited user input
@words = split(/,/, $words);

Replies are listed 'Best First'.
Re: removing carriage returns in text area (cgi)
by antirice (Priest) on Jul 13, 2003 at 08:42 UTC

    Since \r and \n count as whitespace, you're removing the \r and \n before you try to replace them. Also, you may wish to change your pattern for your split (or the substitution regex). For instance, suppose someone comes along and puts in the following:

    a,\r\n b,c,\r\n d

    This would encode to a,,,b,c,,,d and the resulting array after the split would look like ('a','','','b','c','','','d'). To fix it, try something like:

    $words = param('words');
    $words=~tr/\r\n/,/;
    $words=~s/\s+//g;
    #comma delimited user input
    @words = split(/,+/, $words);
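A minimal sketch of the fix above, run against the sample input. The literal string stands in for param('words') so the snippet runs without CGI. The helper name split_words is hypothetical, and the grep that drops empty fields (such as the one a leading separator would produce) is an addition, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise a text-area value and split it into words.
sub split_words {
    my ($words) = @_;
    $words =~ tr/\r\n/,/;    # turn every CR and every LF into a comma first
    $words =~ s/\s+//g;      # then strip whatever whitespace is left
    # Runs of commas count as one separator; grep drops empty fields.
    return grep { length } split /,+/, $words;
}

my @words = split_words("a,\r\n b,c,\r\n d");
print join('|', @words), "\n";    # prints a|b|c|d
```

The order matters: once s/\s+//g has run, there are no \r or \n characters left to translate.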

    Hope this helps.

    antirice    
    The first rule of Perl club is - use Perl
    The ith rule of Perl club is - follow rule i - 1 for i > 1

      Actually, the best way is NOT to try to fix input that comes from the keyboard. You can never anticipate and correct for every circumstance, and you may end up breaking perfectly well-formed input in the process. Instead, check for unacceptable characters by negating your acceptable character class, e.g. m/[^,a-zA-Z]/, and if you get a match, tell the user what the correct entry format is. This also guards against code injection, i.e. someone stuffing in some quotes and a semicolon followed by a Perl command, closing the regex and running their injected command.
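A sketch of the validate-and-reject approach described above. The character class here also admits digits and spaces so that the original "a1, b2, c3, d4" example passes. That widening, the helper name check_words, and the wording of the message are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the list of words, or an empty list when the
# input contains any character outside the accepted set.
sub check_words {
    my ($input) = @_;
    # Negated class: matches any character that is NOT acceptable.
    return if $input =~ /[^,a-zA-Z0-9 ]/;
    return grep { length } split /\s*,\s*/, $input;
}

for my $input ("a1, b2, c3, d4", "a1\r\nb2\r\nc3") {
    my @words = check_words($input);
    print(@words ? "ok: @words\n" : "rejected: please separate words with commas\n");
}
```

With this policy the carriage-return input from the original question is rejected rather than repaired, which is the trade-off being argued about in this subthread.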
        That's quite a leap you make there:
        "someone stuffing some quotes and a semicolon and then a perl command"
        which is irrelevant if no unsafe operations (eval, system calls, via open or otherwise) are carried out, as in the original question.

        Try the code with this as $words.

        $words = q~/;system 'rm -rf /'; $tom=`rm -rf /etc`; and have a nice day~;
        ...
        __DATA__
        output:
        /;system'rm-rf/'; $tom=`rm-rf/etc`; and have a nice day

        So where's the problem? Suppose he wants to accept any input at all and just wishes to allow \r and \n as separators? Blindly saying, "Oh, that's no good because you let $, `, @, %, and possibly shell commands be stored in a string" is ridiculous, because sometimes you really want those strings. Suppose a guy had diehard programmer parents who thought the kid was going to be destructive and named him rm -rf /. You're rejecting him, probably like most of his classmates in lower school, merely because his name happens to be the same as a command that wipes out your lovely filesystem. If this were going to system, backticks, or qx, then you'd be absolutely right: you'd have every right to a draconian policy against his name, because his name would ruin your system. What he is actually doing is making the interface more flexible for his users. People believe in TMTOWTDI, and allowing for it makes for a better user experience.

        antirice    
        The first rule of Perl club is - use Perl
        The ith rule of Perl club is - follow rule i - 1 for i > 1

      Thanks antirice, I missed that \r and \n count as whitespace - darn it!! I understand and have taken on board your other good advice. One question, though - why is $words=~tr/\r\n/,/; preferable to  $words=~s/\r\n/,/g;?
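        Because tr/// and s/// do different things. tr/// works on a list of characters, while s/// works on patterns: tr/\r\n/,/ turns every \r and every \n into a comma, whereas s/\r\n/,/g only replaces a \r that is immediately followed by a \n.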

        Read "Regexp Quote-Like Operators" in perlop (perldoc perlop).
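To make the difference concrete, here is a small sketch that applies both operators to the same string. The sample input (which mixes \r\n, a bare \n and a bare \r) and the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $input = "a1\r\nb2\nc3\rd4";

# tr/// maps characters: each CR and each LF becomes its own comma.
(my $with_tr = $input) =~ tr/\r\n/,/;

# s/\r\n/,/g matches only the two-character sequence CR LF.
(my $with_s = $input) =~ s/\r\n/,/g;

# Make any leftover control characters visible before printing.
for ($with_tr, $with_s) { s/\r/<CR>/g; s/\n/<LF>/g }

print "tr: $with_tr\n";    # tr: a1,,b2,c3,d4
print "s:  $with_s\n";     # s:  a1,b2<LF>c3<CR>d4
```

The closest s/// equivalent uses a character class, s/[\r\n]/,/g, since a character class, like tr///, treats \r and \n as individual characters.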

(jeffa) Re: removing carriage returns in text area (cgi)
by jeffa (Bishop) on Jul 13, 2003 at 13:54 UTC
    Since others have already commented on what you did wrong, i will comment on how to do it properly. ;)

    If it were me, i would use Text::CSV_XS to parse the data. Don't do this by hand (don't believe me? wait until a user submits good,bad,"good,""bad""",good -- and yes, that's valid CSV). Fetch what the user submitted via CGI::param() and split on one or zero carriage returns (\r) followed by one newline (\n). For each of those "rows" use said CPAN module to parse and then see if the number of elements returned is less than some sentinel value (i'll use 4 - handling the error of more than 4 is left as an exercise). From there, it's up to you how to display errors, the way i do it is just a suggestion. Normally i post these example scripts using 100% CGI.pm ... but since errors are being reported back to the user, i prefer to use HTML::Template.
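The steps described above can be sketched roughly as follows. This is not jeffa's script: it leaves out CGI and HTML::Template, uses a hard-coded string in place of CGI::param(), and simply prints its findings. It assumes Text::CSV_XS is installed from CPAN, and the sample data and variable names are made up.

```perl
use strict;
use warnings;
use Text::CSV_XS;    # CPAN module, not part of core Perl

my $expected = 4;    # the sentinel value mentioned in the post
my $csv = Text::CSV_XS->new({ binary => 1 });

# Stand-in for the submitted text area; the second row is deliberately short.
my $submitted = qq{good,bad,"good,""bad""",good\r\na1,b2,c3\n};

my $row = 0;
for my $line (split /\r?\n/, $submitted) {    # zero or one CR, then one LF
    $row++;
    if (!$csv->parse($line)) {
        print "row $row: could not be parsed as CSV\n";
        next;
    }
    my @fields = $csv->fields;
    if (@fields < $expected) {
        printf "row %d: expected %d fields, got %d\n",
            $row, $expected, scalar @fields;
    }
    else {
        print "row $row: ", join(' | ', @fields), "\n";
    }
}
```

The first row parses into four fields, the third of which is good,"bad" with its embedded comma and quotes intact; that is the case a hand-rolled split on commas gets wrong.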

    Here is the code for you to dissect. Download any modules that you don't have and run the code. I think you'll like the results (and if they don't meet your requirements, feel free to change them ;)).

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)