Re: handling erronous input

What I need is for my code to accept only input with 1 digit before the decimal point and 5 after, then a space, a comma, and an 8-character time using : as a separator, ignoring any data on the same line after the seconds.

You seem to know what you want, so it's just a case of following your spec...

my @data = split /\n/, <<EOS;
1.57163 ,17:29:57 Simple Dealin
1.57163 ,17:29:57
1.57163 ,17:29:57
1.57163 ,17:29:57
1.57163 ,17:29:57
1.57163 ,17:29:57
1.57163 ,17:29:57
1.571 ,17:
1.57172 ,17:30:08
1.57176 ,17:30:10
EOS

for ( @data ) {
    if ( my ( $quote, $time, $comment ) =
         m{
              ^                # start of string
              (\d\.\d{5})      # a digit, dot and 5 more digits
              \s,              # a space and a comma
              (\d\d:\d\d:\d\d) # an 8 character time
              \s*              # some spaces
              (.*)             # everything else is a comment
              $                # end of string
      }x ) {
        print "$quote / $time",
          $comment ne '' ? " / $comment" : '',
            "\n";
    }
}

Output:

1.57163 / 17:29:57 / Simple Dealin
1.57163 / 17:29:57
1.57163 / 17:29:57
1.57163 / 17:29:57
1.57163 / 17:29:57
1.57163 / 17:29:57
1.57163 / 17:29:57
1.57172 / 17:30:08
1.57176 / 17:30:10

Update: See perlre and perlretut for the details.


Re^2: handling erronous input by Conal (Beadle) on Apr 06, 2008 at 22:23 UTC
Hi, and thanks, FunkyMonk, for the reply. I do get a little lost, though, as regards where I am opening my file and feeding it into @data. Can you expound on that for me, please? (Sorry for being the noob.)

FWIW, based on your well-explained pattern matching sequences above, I have also created a possible revised unless statement:

    open(DATAFILE, "$input") || die("Can't open $input:!\n");
    while (<DATAFILE>) {
        unless (m{^(\d\.d{5})\s,(\d\d:\d\d:\d\d)\s*}) { next; }
        chomp $_;
        ($quote,$time) = split(",", $_);
        chop($quote); #remove a white space
        ($hour,$minute,$second) = split(":",$time);)
        # more processing

How does that look? I do like the way you have done things, but because of an internet outage here for the last 4 hours I was unable to test any new code and had to bring up my old buggy code live at 5pm EST. I'd really like to be able to just drop a new unless statement into the existing code, if that's at all possible. The script eventually updates a MySQL database and creates a web page, so it's not straightforward to test the code outside a live situation, and I want to keep revisions to a minimum.

P.S. I realise that I may be dismissing some of the conventions of working with floating point numbers, which may be a little unsettling to some, but for this project I am sure that the 'shortcuts' I am taking are safe. My code has been working fine in a live environment for 2 weeks now. The only issue I have is this bug when dealing with unexpected input data formats in my input files.

P.P.S. Sorry for being so verbose here.

conal.
Re^3: handling erronous input by FunkyMonk (Bishop) on Apr 06, 2008 at 22:57 UTC
You've missed out part of the regexp (the bit that captures comments) and missed out a backslash (from \d{5}). It looks like you don't know that, in a regexp, parentheses capture their matches into $1, $2, $3, etc. Again, see perlretut and perlre for the details.

Your code is similar to mine. You use

    while ( ... ) {
        unless ( some-condition ) { next }
        some-code
    }

while I prefer the equivalent

    while ( ... ) {
        if ( some-condition ) {
            some-code
        }
    }

It's just that (IMHO) yours is harder to read (and longer, too).

That said, you can use my code with a filehandle like so (I've rearranged it a bit to use unless and made the regexp a little more lenient towards spaces)...

    while ( <DATAFILE> ) {
        chomp;
        unless ( m{^ (\d\.\d{5}) \s*,\s* (\d\d:\d\d:\d\d) \s* (.*) $ }x ) { next }
        my ( $quote, $time, $comment ) = ( $1, $2, $3 );    # captures
        my ( $hours, $minutes, $seconds ) = split /:/, $time;
        # do something with $quote, $hours, $minutes, $seconds & $comment
    }
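For anyone who wants to try the filehandle loop without touching a live data file, here is a minimal, self-contained sketch. It assumes a lexical filehandle opened on an in-memory string in place of DATAFILE, and it allows optional whitespace on both sides of the comma; `parse_quotes` and the sample data are illustrative names and values, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the real data file.
my $sample = <<'EOS';
1.57163 ,17:29:57 Simple Dealin
1.571 ,17:
1.57172 ,17:30:08
EOS

# parse_quotes() is an illustrative helper, not part of the original script.
# It reads lines from a filehandle and returns one array ref per valid line.
sub parse_quotes {
    my ($fh) = @_;
    my @rows;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ m{
            ^ (\d\.\d{5})       # first capture: the quote
            \s* , \s*           # a comma with optional surrounding whitespace
            (\d\d:\d\d:\d\d)    # second capture: the 8 character time
            \s* (.*) $          # third capture: trailing comment, possibly empty
        }x;
        my ( $quote, $time, $comment ) = ( $1, $2, $3 );
        my ( $hours, $minutes, $seconds ) = split /:/, $time;
        push @rows, [ $quote, $hours, $minutes, $seconds, $comment ];
    }
    return @rows;
}

# An in-memory filehandle keeps the sketch self-contained; a real script
# would open the data file by name instead.
open( my $fh, '<', \$sample ) or die "Can't open in-memory file: $!";
my @rows = parse_quotes($fh);
close $fh;

printf "%s at %s:%s:%s [%s]\n", @$_ for @rows;
```

Keeping the parsing in a sub that takes a filehandle means the same code can be exercised against test strings before it is pointed at the real input.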
Re^4: handling erronous input by GrandFather (Saint) on Apr 06, 2008 at 23:35 UTC
I prefer:

    while ( ... ) {
        next if some-condition;
        some-code
    }

which not only saves a couple of (trivial, I admit) lines of code, it also reduces clutter and avoids an extra level of nesting for `some-code`.

Perl is environmentally friendly - it saves trees.
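A minimal runnable sketch of that guard-clause style, applied to the quote lines from this thread. The precompiled pattern `$LINE_RE` and the helper `valid_lines` are illustrative names assumed for the example.

```perl
use strict;
use warnings;

# Illustrative pattern: quote, space, comma, then an 8 character time.
my $LINE_RE = qr/^(\d\.\d{5})\s,(\d\d:\d\d:\d\d)/;

# valid_lines() is a hypothetical helper that keeps only well-formed lines.
sub valid_lines {
    my @kept;
    for my $line (@_) {
        next unless $line =~ $LINE_RE;    # guard clause: skip bad input early
        push @kept, "$1 / $2";            # main body stays at one nesting level
    }
    return @kept;
}

print "$_\n" for valid_lines( '1.57163 ,17:29:57', '1.571 ,17:', '1.57172 ,17:30:08' );
```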
Re^4: handling erronous input by Conal (Beadle) on Apr 06, 2008 at 23:50 UTC
Gotcha.. I understand completely now. Thanks again, and it works fine; I just wanted to be certain before I messed anything up. FWIW, here is a link to my project --> http://fxr.freehostia.com/pam/pam_alpha_5.php and sorry for my wanton abuse of floating point numbers. ;p

conal.
Re^3: handling erronous input by ww (Archbishop) on Apr 07, 2008 at 02:55 UTC
In addition to the problems with your first regex,

    ($hour,$minute,$second) = split(":",$time);)

should be

    ($hour,$minute,$second) = split /:/,$time;

The pattern in `split` is a regex and is best written with slashes (or other unambiguous matched punctuation) rather than quotes. Note also that the last closing paren in your split is "one too many" (and thus "wrong"), and all the parens on the RHS are unnecessary.

Subject to your taste, note that your extraction to `$quote` and `$time` could be written

    next unless ( $data =~ /^(\d\.\d{5})\s,(\d\d:\d\d:\d\d)./ );
    $quote = $1;
    $time = $2;

Update: s/not/note/ in the last narrative paragraph.
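As a quick check of the slash-delimited `split` form, here is a minimal sketch; `split_time` is an illustrative helper name, not something from the original script.

```perl
use strict;
use warnings;

# split_time() is a hypothetical helper: it breaks an HH:MM:SS string
# into its three fields using a regex pattern as the separator.
sub split_time {
    my ($time) = @_;
    my ( $hour, $minute, $second ) = split /:/, $time;
    return ( $hour, $minute, $second );
}

my ( $hour, $minute, $second ) = split_time('17:29:57');
print "hour=$hour minute=$minute second=$second\n";
```

Note that the fields come back as strings, so a leading zero such as the one in `08` is preserved until the value is used as a number.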
Re^4: handling erronous input by apl (Monsignor) on Apr 07, 2008 at 12:06 UTC
Or

    next unless ( $data =~ /^(\d\.\d{5})\s,(\d{2}:\d{2}:\d{2}).*/ );
    $quote = $1;
    $time = $2;
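One difference between the two patterns above may be worth checking before dropping either into live code: a trailing `.` requires at least one more character after the seconds, whereas `.*` also accepts nothing. The sketch below compares them on a line with a comment and a line that ends at the seconds; `matches` is an illustrative helper, and the input lines are assumed to be already chomped.

```perl
use strict;
use warnings;

# The two patterns from the replies above, precompiled for comparison.
my $with_dot  = qr/^(\d\.\d{5})\s,(\d\d:\d\d:\d\d)./;
my $with_star = qr/^(\d\.\d{5})\s,(\d{2}:\d{2}:\d{2}).*/;

# matches() is a hypothetical helper returning 1 or 0.
sub matches {
    my ( $re, $line ) = @_;
    return $line =~ $re ? 1 : 0;
}

for my $line ( '1.57163 ,17:29:57 Simple Dealin', '1.57163 ,17:29:57' ) {
    printf "%-34s dot=%d star=%d\n", "'$line'",
        matches( $with_dot, $line ), matches( $with_star, $line );
}
```

If lines that end right after the seconds should be accepted, as in the sample data at the top of the thread, the `.*` form (or no trailing pattern at all) looks like the safer choice.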