pacman has asked for the wisdom of the Perl Monks concerning the following question:

morning all!

Description: Suppose that the variable $line holds a line from some file. The '|'-delimited line looks like this:

$line = 'man|woman|child|family';

If I wanted to separate the fields into an array, I'd do
something like this:

my @split_line = split(m"\|",$line);
So far so good, right?

Problem: What if the line in $line, for some reason, didn't look
like it was supposed to(!)? So I just checked for the
presence of the delimiter '|' in the line.

if(!$line =~ m"|"g){die " Not Compatible\n";}
Just wonder if this is going to be robust enough?
Is there a nicer way of doing this?

Edited by Chady -- fixed code tags.

20040323 Edit by BazB: Changed title from 'simple one'

Re: Input validation
by castaway (Parson) on Mar 23, 2004 at 08:09 UTC
    That depends on how you are expecting the line to look. Are there supposed to be a fixed number of fields, for example? In that case, you could check the size of the array returned by split; it will have only one entry if there was no '|' in $line. You could further check the individual entries if you are expecting strings or numbers in particular fields.

    The 'does not match' operator should be '!~', by the way, as '!$line =~' does something different: '!' binds more tightly than '=~', so it matches the pattern against the negated $line rather than negating the result of the =~.
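
    To make the precedence point concrete, here is a small sketch (the sample string is made up, and the delimiter is escaped as suggested elsewhere in this thread). It compares the original form against '!~' and against an explicitly parenthesised negation:

```perl
use strict;
use warnings;

my $line = 'no delimiter here';   # made-up sample with no '|' in it

# Parsed as (!$line) =~ m"\|". For any true $line, !$line is the empty
# string, which never contains '|', so this test is false regardless
# of what $line holds.
my $buggy = (!$line =~ m"\|") ? 1 : 0;

# Two equivalent ways of saying "the line has no '|' in it".
my $negated = ($line !~ m"\|")    ? 1 : 0;
my $wrapped = (!($line =~ m"\|")) ? 1 : 0;

print "buggy=$buggy negated=$negated wrapped=$wrapped\n";
# prints "buggy=0 negated=1 wrapped=1"
```

    So with the escaped pattern the die in the original test would never fire, even on a line that has no delimiter at all.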

    C.

Re: Input validation
by matija (Priest) on Mar 23, 2004 at 08:06 UTC
    Just wonder if this is going to be robust enough ?
    Depends on what you need it for. (Oh, and you forgot to escape the | in your code: it should be m"\|").

    If you just need to check for the presence of A separator, your code (as amended, and with '!~' in place of '!$line =~') would be sufficient.

    However, you might want to check other things, such as the number of fields you get after the split, whether the fields contain the data you expected (you do have a plan for handling empty fields, don't you?), etc.
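
    One thing worth knowing about empty fields: by default split silently drops trailing empty fields, so a field count can come out lower than the number of delimiters suggests. A short sketch, using a made-up line whose last two fields are empty:

```perl
use strict;
use warnings;

my $line = 'man|woman||';   # made-up sample: last two fields are empty

# Default behaviour: trailing empty fields are dropped.
my @default = split /\|/, $line;

# A negative limit keeps them, so later checks can see every field.
my @keep_all = split /\|/, $line, -1;

printf "default: %d fields, with -1 limit: %d fields\n",
    scalar @default, scalar @keep_all;
# prints "default: 2 fields, with -1 limit: 4 fields"
```

    Whether an empty field counts as valid data is a decision for your application; the -1 limit only makes sure you get to see it.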

      Hi, thanks matija, castaway & CountZero.

      The line should comprise the stated number of fields (in
      the example case, 4). So what I am also checking is the
      number of fields returned by the split, and yes, I am
      taking care of the possibility of receiving an empty
      $line and/or empty fields. I think the two checks,
      1. fields returned and 2. delimiter presence, should be a good
      enough method to determine if the $line is fine, right?

        Sounds like you are concerned that you may have bad data, and asking us how bad it could be, which of course we have no way of telling.

        If you want to check that split returns the number of fields you expect, just check it:

        my @split_line = split(m"\|", $line);
        if (@split_line != 4) {
            warn "something gang aft";
        } else {
            print "valid data!";
        }
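
        If it helps, the field-count check and the empty-field check can be wrapped into one routine. This is only a sketch: parse_line is a hypothetical helper name, and rejecting empty fields is an assumption about what counts as bad data here.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fields on success, nothing otherwise.
sub parse_line {
    my ($line, $expected) = @_;
    return unless defined $line;
    chomp $line;
    my @fields = split /\|/, $line, -1;   # -1 keeps trailing empty fields
    return unless @fields == $expected;   # wrong number of fields
    return if grep { $_ eq '' } @fields;  # assumption: empty fields are invalid
    return @fields;
}

for my $line ('man|woman|child|family', 'man|woman|child', 'man||child|family') {
    my @fields = parse_line($line, 4);
    print @fields ? "ok:  $line\n" : "bad: $line\n";
}
```

        Note that a line with no delimiter at all comes back as a single field, so the count check already covers the separate delimiter test.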
Re: Input validation
by CountZero (Bishop) on Mar 23, 2004 at 08:15 UTC
    Indeed, it is much better to check the result of the split operation than to use an extra regex to check for "|". If you do need a separate check, a test based on index is probably less resource-intensive and faster. I didn't test it, but no doubt someone will put together a benchmark shortly.
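
    For anyone who wants to try it, a rough sketch of an index-based check next to the regex one, with a comparison via the core Benchmark module. The sub names are made up for illustration, and no timings are claimed here; results will depend on your perl and your data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = 'man|woman|child|family';

# index returns -1 when the substring is absent, so >= 0 means "found".
sub has_delim_index { return index($_[0], '|') >= 0 }

# The same test written as a regex match, for comparison.
sub has_delim_regex { return $_[0] =~ m"\|" ? 1 : 0 }

# Run each check for about one CPU second and print a comparison table.
cmpthese(-1, {
    'index' => sub { has_delim_index($line) },
    'regex' => sub { has_delim_regex($line) },
});
```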

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law