vineet2004 has asked for the wisdom of the Perl Monks concerning the following question:

Hi to all monks... and also to lesser mortals.

I am using the 'if' statement below to process some text:

while (<>) {
    if (/ (\w\w) (\w\w) (\w\w) (\w\w) /) {
        # do something
    }
}

Sample text from the input:

00 12 0d 90

The text above is part of a larger log file that I haven't shown here. If I am not wrong, the 'while (<>)' loop processes the input line by line, so I intend to capture each pair of digits in a variable like $1, $2, etc., which I can manipulate later in my code. Another such line might contain 3 pairs instead of 4, and yet another might contain 2 pairs. But I am only able to match lines that have 4 pairs, like the one shown above, and none of the lines that have fewer than 4 pairs.

What can I do to also match the lines with fewer than 4 pairs?

Awaiting your replies.
thanks
vineet

Replies are listed 'Best First'.
Re: pattern matching
by liverpole (Monsignor) on Dec 25, 2006 at 13:54 UTC
    Hi vineet2004,

    One quick way is to use the "?" quantifier like so:

    while (<>) {
        # Only require a non-null first capture
        if (/(\w\w)(\w\w)?(\w\w)?(\w\w)?/) {
            # Do something
        }
    }

    In each of the captures, the "?" says that the item is optional.  Therefore, the match will succeed even if only the first capture succeeds.

    Note that if you use warnings, you will still need to test each capture with defined before you use it, to avoid "uninitialized value" warnings, because an optional group that did not match leaves its variable undefined.
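
A minimal sketch of that check, assuming the pattern above is used unchanged. The helper name matched_pairs is hypothetical and only serves the illustration; it filters out the captures that did not take part in the match so that later code never touches an undefined value.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the captures that actually matched.
sub matched_pairs {
    my ($line) = @_;
    # Same pattern as above: first pair required, the other three optional.
    if ($line =~ /(\w\w)(\w\w)?(\w\w)?(\w\w)?/) {
        # An optional group that did not match leaves its variable undef,
        # so drop those before anything tries to use them.
        return grep { defined } ($1, $2, $3, $4);
    }
    return;    # no pair at all on this line
}

my @pairs = matched_pairs("0012");
print scalar(@pairs), " pairs: @pairs\n";    # prints "2 pairs: 00 12"
```

Note that this pattern has no room for separators, so on a space-separated line such as "00 12 0d 90" only the first pair is returned.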

    Update:  bart's comment below about the whitespace is a good one.  I think my solution will still work if you modify it slightly:

    if (/(\s*\w\w)(\s+\w\w)?(\s+\w\w)?(\s+\w\w)?/) {
        # Do something
    }

    Come to think of it, though, that just makes the whitespace part of each capture, so ++bart, as his solution looks like a better one to do what you need.


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
      I think liverpole is close, but he's ignoring one important factor: the data contains a space between the pairs the OP wants to capture, and that space isn't part of them. So just making the patterns optional won't cut it.

      Here's what I would do:

      if (/(\w\w)(?: (\w\w)(?: (\w\w)(?: (\w\w))?)?)?/) {
          # Do something
      }

      I don't like the idea of making the parallel items optional. Instead, I nest them, so $3 cannot match if $2 didn't match, as both are part of the same optional pattern. Ditto for $4, which can only match if both $2 and $3 matched.
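
A minimal sketch of how the nested pattern behaves on lines with 4, 3, and 2 pairs, assuming the pairs are separated by exactly one space as in the sample data. The helper name nested_pairs and the two shorter sample lines are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the nested pattern shown above.
sub nested_pairs {
    my ($line) = @_;
    return unless $line =~ /(\w\w)(?: (\w\w)(?: (\w\w)(?: (\w\w))?)?)?/;
    # Because the groups are nested, the defined captures are always a
    # prefix of ($1, $2, $3, $4): there is never a gap in the middle.
    return grep { defined } ($1, $2, $3, $4);
}

for my $line ("00 12 0d 90", "00 12 0d", "00 12") {
    my @pairs = nested_pairs($line);
    print scalar(@pairs), " pairs from '$line': @pairs\n";
}
```

The three lines yield 4, 3, and 2 pairs respectively.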

        Thanks guys...
        Thanks bart for your reply, your code is fine. But my actual pattern-matching problem involves 31 (or fewer) pairs instead of the 4 (or fewer) pairs that I mentioned in my post; I made the change so that I could give an example easily. If I follow your code pattern, it might turn out to be a bit too long. Is there any shorter version? Thanks again for your help so far. vineet
Re: pattern matching
by throop (Chaplain) on Dec 26, 2006 at 04:14 UTC
    Use the g regex modifier to capture all the pairs. Count them and see how many you have. Then decide. Use non-capturing parentheses (?: ) for the space. This still leaves a lot of questions, e.g. what to do with character triples?

    while (<>) {
        my @charpairs = /(?: \s+ (\w\w) ) /gx;
        if (@charpairs != 4) {
            print "Funny looking at line $. :\n\t'$_'\n";
        }
        else {
            # do something
        }
    }
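
A minimal sketch of the same idea scaled to the "31 or fewer" case from the follow-up question. Two things here are assumptions rather than part of the reply above: the pattern uses \b word boundaries instead of a leading \s+, so that a triple such as "abc" is skipped rather than split, and any line with between 1 and 31 pairs is accepted. The names all_pairs and $MAX_PAIRS are hypothetical.

```perl
use strict;
use warnings;

my $MAX_PAIRS = 31;    # figure taken from the follow-up question

# Hypothetical helper: return every two-character group on the line, or
# an empty list if there are none or more than $MAX_PAIRS of them.
sub all_pairs {
    my ($line) = @_;
    # In list context, m//g returns all captures at once, so one pattern
    # covers any number of pairs. The \b anchors on both sides keep a
    # longer run such as "abc" from contributing "ab".
    my @pairs = $line =~ /\b(\w\w)\b/g;
    return (@pairs >= 1 && @pairs <= $MAX_PAIRS) ? @pairs : ();
}

my @pairs = all_pairs(" 00 12 0d ");
print scalar(@pairs), " pairs: @pairs\n";    # prints "3 pairs: 00 12 0d"
```

The pairs end up in an array rather than in $1, $2, and so on, which tends to be easier to work with once there are dozens of them: $pairs[0] plays the role of $1.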
    throop