in reply to removing carriage returns in text area (cgi)

Since \r and \n count as whitespace, you're removing the \r and \n before you try to replace them. Also, you may wish to change your pattern for your split (or the substitution regex). For instance, suppose someone comes along and puts in the following:

a,\r\n b,c,\r\n d

This would encode to a,,,b,c,,,d and the resulting array after the split would look like ('a','','','b','c','','','d'). To fix it, try something like:

    $words = param('words');
    $words =~ tr/\r\n/,/;
    $words =~ s/\s+//g;
    # comma delimited user input
    @words = split(/,+/, $words);
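To see the effect on the sample input above, here is a minimal sketch. The helper name split_words is made up for illustration, and a literal string stands in for param('words') so it runs without CGI.

```perl
use strict;
use warnings;

# split_words is a hypothetical helper name, not something from the thread.
sub split_words {
    my ($words) = @_;
    $words =~ tr/\r\n/,/;         # each CR and each LF becomes a comma
    $words =~ s/\s+//g;           # drop any remaining whitespace
    return split /,+/, $words;    # a run of commas counts as one separator
}

# Literal stand-in for param('words').
my @words = split_words("a,\r\n b,c,\r\n d");
print join('|', @words), "\n";    # prints a|b|c|d
```

One caveat: split keeps a leading empty field, so input that starts with a line break or a comma would still give an empty first element. Filtering the result with grep { length } is one way to drop it.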

Hope this helps.

antirice    
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Re: removing carriage returns in text area (cgi)
by aquarium (Curate) on Jul 13, 2003 at 11:40 UTC
    Actually, the best way is NOT to try to fix input when it comes from the keyboard. You can never account and correct for all circumstances, and you may end up breaking perfectly well-formed input in the process. Instead, check for acceptable characters by inverting your acceptable character class, m/[^,a-zA-Z]/, and if you get a match, tell the user the correct entry format. This also guards against code injection, i.e. someone stuffing some quotes and a semicolon and then a perl command, closing the regex and running their injected command.
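A rough sketch of that "reject rather than repair" check follows. The helper name is_acceptable is hypothetical, and adding \r and \n to the allowed set is an assumption made here so that line breaks from a textarea are not rejected; the class in the post above allows only letters and commas.

```perl
use strict;
use warnings;

# is_acceptable is a hypothetical helper. The allowed set (letters, commas,
# CR and LF) is an assumption; adjust it to whatever the form should accept.
sub is_acceptable {
    my ($input) = @_;
    # True only when no character outside the allowed set is present.
    return $input !~ /[^,a-zA-Z\r\n]/;
}

for my $input ("a,b\r\nc", "a;b") {
    print is_acceptable($input) ? "ok\n" : "rejected\n";
}
```

With these two sample strings the first passes and the second is rejected because of the semicolon. Whether rejecting is the right policy depends on what the script later does with the data, which is the point argued in the replies.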
      That's quite a bit of a leap you make there.
      someone stuffing some quotes and a semicolon and then a perl command
      which is irrelevant if no unsafe operations (eval, system calls, via open or otherwise) are carried out, like in the original question.
        Yes, it's a leap alright...we only have a little snippet of code from the script. Does the code perform SQL or system calls? Who knows. This sub-thread of the main idea is still relevant: it guards against code injection. And the main idea (reject bad input altogether) is better than a "fix the input" approach. People that input quotes and other funny characters into input boxes (cgi) are generally up to no good anyway. When was the last time you entered quotes in a cgi form? With the "fix the input" approach, the shortcoming of the program will be found sooner rather than later...and if it's not code injection that happens, then at least it will possibly break the code. Therefore "reject all but GOOD input" should be an idiom that coders (coders != software designers) might like to learn. I'm not getting on a pedestal either...but I would like to impart the few good rules I have learnt. It's not all just about the code or to see how far you can get with a regex.

      Try the code with this as $words.

      $words = q~/;system 'rm -rf /'; $tom=`rm -rf /etc`; and have a nice day~;
      ...
      __DATA__
      output:
      /;system'rm-rf/'; $tom=`rm-rf/etc`; and have a nice day

      So where's the problem? Suppose he wants to be able to have any input he wants and just wishes to allow \r and \n to be separators? Blindly saying, "oh...that's no good because you let $,`,@,%,and possibly shell commands be stored in a string" is ridiculous because sometimes you really want those strings. Suppose a guy had diehard programming parents. They thought the kid was going to be destructive and named him rm -rf /. The point is that you're rejecting him, probably like most of his classmates in lower school, merely because his name happens to be the same command that wipes out your lovely filesystem. If this were going to a system, backtick, or qx, then you're absolutely right. You have a right to have a draconian policy against his name because his name will ruin your system. What he is actually doing is making the interface more flexible for his users. People believe in the idea that TMTOWTDI and allowing for it makes for a better user experience.

      antirice    
      The first rule of Perl club is - use Perl
      The ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Re: removing carriage returns in text area (cgi)
by jonnyfolk (Vicar) on Jul 13, 2003 at 09:44 UTC
    Thanks antirice, I missed that \r and \n count as whitespace - darn it!! I understand and have taken on board your other good advice. One question, though - why is $words=~tr/\r\n/,/; preferable to  $words=~s/\r\n/,/g;?
      Because tr/// and s/// do different things. tr/// works on a list of characters, while s/// works on patterns.

      Read "Regexp Quote-Like Operators" in perlop (perldoc perlop).
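A small sketch of the difference, using a made-up sample string that mixes a CRLF pair with a bare LF (the variable names are illustrative only):

```perl
use strict;
use warnings;

# Sample text: one CRLF line ending followed by one bare LF.
my $text = "a\r\nb\nc";

(my $with_tr = $text) =~ tr/\r\n/,/;    # every CR and every LF becomes a comma
(my $with_s  = $text) =~ s/\r\n/,/g;    # only the two-character CRLF pair is replaced

# Make the leftover newline visible before printing.
(my $shown = $with_s) =~ s/\n/\\n/g;

print "$with_tr\n";    # prints a,,b,c
print "$shown\n";      # prints a,b\nc
```

So tr/// turns a CRLF into two commas (hence the split on /,+/), while s/\r\n/,/g leaves a bare \n untouched. If a single substitution is preferred, a pattern such as s/\r?\n/,/g would treat either line ending as one separator.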