PerlMonks  

regex on a line

by Sara (Acolyte)
on Aug 01, 2002 at 15:44 UTC ( [id://186807]=perlquestion )

Sara has asked for the wisdom of the Perl Monks concerning the following question:

Hello guys, I have this data here:
Update ID: pa
Update Status: open (view: pa_lden-cr_q00457655_EFLT)
Branch Type: lden
Comment: <* Software Update Template *>
<*Version: dflkasd
I want to grep for the update status and see if it is open. If it is, I want to print the view path next to it; in this case my output should be:
$view = pa_lden-cr_q00457655_EFLT
I am doing the following and am able to check whether the status is open, but there is a problem in my code when it comes to getting the view:
while(<check>) {
    if (m/^(?:Update Status:)\s*(\w+)/) { $target = $1 }
    if ( $target eq "open") {
        if (m/^(?:Update Status:)\s*(\w+)\s*(\w+)$/) { $view = $1 }
    }
    print "$target\n";
    print "$veiw\n";
    close(check);
Can someone show me the problem? Thanks, guys.

Replies are listed 'Best First'.
Re: regex on a line
by arturo (Vicar) on Aug 01, 2002 at 16:28 UTC

    First: literal parens need to be escaped, or they'll be seen as grouping operators. Your input has literal parentheses in it, so you need to take note of that.
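A minimal sketch of the first point, using the status line from the question. The variable names `$grouped` and `$literal` are illustrative only and not part of the original script.

```perl
use strict;
use warnings;

my $line = 'Update Status: open (view: pa_lden-cr_q00457655_EFLT)';

# Unescaped: the parentheses only group and capture, so nothing in the
# pattern asks for a literal "(" and the trailing ")" is swept up by \S+.
my ($grouped) = $line =~ /(view: \S+)/;

# Escaped: \( and \) match the literal characters, and the inner
# parentheses capture only what sits between them.
my ($literal) = $line =~ /\(view: ([^)]+)\)/;

print "grouped: $grouped\n";
print "literal: $literal\n";
```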

    Second: Since you can capture more than one match in a regular expression, you need to pay attention to which match you want to grab. When, in your second pattern match, you set $view to $1, you're nabbing what's in the first parentheses, which in this case is what $target is already set to. You either want to get rid of the parentheses around that first \w+ so it doesn't capture, or use $2 to capture what you want. Of course you need to make sure you're taking account of the first point I made above.

    Third: note that the minus sign is not a word character, so given your example and what you say you want to capture, you should be using a slightly more complicated regex than (\w+). You need to match word characters and the minus sign, so the natural choice is a character class: [\w\-] (escaping the - is the safe habit, because an unescaped - between two characters inside a character class denotes a range).
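To illustrate the third point, here is a small sketch comparing the two patterns on the view name from the question. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $name = 'pa_lden-cr_q00457655_EFLT';

# \w+ stops at the first hyphen because "-" is not a word character.
my ($word_only) = $name =~ /^(\w+)/;

# [\w\-]+ accepts word characters and hyphens, so it spans the whole name.
my ($with_hyphen) = $name =~ /^([\w\-]+)/;

print "$word_only\n";      # pa_lden
print "$with_hyphen\n";    # pa_lden-cr_q00457655_EFLT
```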

    Fourth: you're printing both target and view outside of your conditional. That's not likely to lead to sensible output, since those things are only set if your regular expressions match.

    Fifth, and I'll leave this as an exercise for you, there's a more efficient way to do this than match things twice. Use the power of list assignment with your regex matches and capturing parentheses, here's a sample of how that works:

    $_ = "I have a lovely bunch of bananas";
    if ( my ($verb, $adjective, $fruit) = /^I (\w+) a (\w+) \w+ of (\w+)/ ) {
        print "It would seem your $fruit, which you describe as '$adjective', is something you $verb.\n";
    }

    The inner block (the "if") here is only executed if the pattern matches, so if you put something like that inside your loop, it will be quite efficient, since you only try to match once instead of twice.
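One possible way to apply the same list-assignment idea to the data in the question is sketched below. It is only a sketch: the `@lines` array stands in for the real file, and the pattern assumes the view always appears as "(view: NAME)" right after the status word.

```perl
use strict;
use warnings;

# Sample record; in the real script these lines would come from the file.
my @lines = (
    'Update ID: pa',
    'Update Status: open (view: pa_lden-cr_q00457655_EFLT)',
    'Branch Type: lden',
);

my @open_views;
for my $line (@lines) {
    # One match yields both fields. The list is empty when the match
    # fails, so the block is skipped for lines that are not status lines.
    if ( my ($target, $view) =
            $line =~ /^Update Status:\s*(\w+)\s*\(view:\s*([\w\-]+)\)/ ) {
        push @open_views, $view if $target eq 'open';
    }
}

print "$_\n" for @open_views;
```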

    HTH

Re: regex on a line
by Tomte (Priest) on Aug 01, 2002 at 16:07 UTC
    Second regexp:
    if(m/\((?:.*): (.*)\)/){$view=$1}

    should be sufficient: you already know that this is an "Update Status" line with $target eq "open" when you attempt the second match.

    regards,
    tomte
Re: regex on a line
by Basilides (Friar) on Aug 01, 2002 at 16:40 UTC
    There are quite a few bugs in this, and I think you'd really benefit from putting use strict at the top of your programs.

    Anyway, you're missing a curly bracket, and you've spelt "view" wrongly in your print statement, so you'd never get a proper result.

    Also, the string you're trying to match, "(view: pa_lden etc etc)", contains parentheses, a colon, a space and hyphens, so you'll never succeed in your match if you use only the word character, \w. You could substitute "." for it, which matches anything, although I have heard monks say that ".*" is bad practice (I'm still quite a novice tho', so I don't know why they say this!).
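One commonly cited reason for the caution about ".*" is that it is greedy: it grabs as much as it can and only then backs off. The sketch below shows the effect on a hypothetical line that carries a second parenthesised note; that extra "(locked)" text is invented for the example and does not appear in the original data.

```perl
use strict;
use warnings;

# Hypothetical line with a second parenthesised note after the view.
my $line = 'Update Status: open (view: pa_lden-cr_q00457655_EFLT) (locked)';

# Greedy .* runs on to the last ")" on the line.
my ($greedy) = $line =~ /\(view: (.*)\)/;

# A negated character class stops at the first ")".
my ($bounded) = $line =~ /\(view: ([^)]*)\)/;

print "greedy:  $greedy\n";
print "bounded: $bounded\n";
```

On the single-parenthesis line from the question both patterns capture the same text, so ".*" is not wrong there, only less robust.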

    Finally, when you assign to $view, you mean $2, not $1. You could do with taking out some of those brackets, which are confusing you with backreferences.

    Here's a version which works:

    while(<check>) {
        if (m/^(?:Update Status:)\s*(\w+)/) { $target = $1; }
        if ($target eq "open") {
            # \( and \) match the literal parentheses around "view: ..."
            if (m/^(?:Update Status:)\s*(\w+)\s*\(view:\s*(.*)\)$/) { $view = $2; }
            print "$target\n";
            print "$view\n";
            close(check);
        }
    }
    However, here's a slightly more concise version which only uses one regex:
    while(<check>) {
        if (m/^(?:Update Status:)\s*(\w+)\s*\(view:\s*(.*)\)$/) {
            if ($1 eq "open") {
                $view = $2;
                print "$view\n";
                close(check);
            }
        }
    }
    Note that in each of these, as soon as an "Update Status: open" line is found, the file is closed. I don't know if that's what you want, but if not, you'll have to move that close(check); line.
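If every open record is wanted rather than just the first, one option is to close the handle only after the loop. The following is a sketch under stated assumptions: the extra "pb" and "pc" records are invented sample data, and an in-memory filehandle stands in for the real file so the example runs on its own.

```perl
use strict;
use warnings;

# In-memory stand-in for the real file (hypothetical contents).
my $data = <<'END';
Update ID: pa
Update Status: open (view: pa_lden-cr_q00457655_EFLT)
Update ID: pb
Update Status: closed (view: pb_lden-cr_q00000001_EFLT)
Update ID: pc
Update Status: open (view: pc_lden-cr_q00000002_EFLT)
END

open my $check, '<', \$data or die "Cannot open data: $!";

my @views;
while (my $line = <$check>) {
    next unless $line =~ /^Update Status:\s*(\w+)\s*\(view:\s*([^)]+)\)/;
    push @views, $2 if $1 eq 'open';
    # Use "last" here instead to stop at the first open record.
}
close $check;    # closed once, after the loop has finished

print "$_\n" for @views;
```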

    HTH
    Dennis

Re: regex on a line
by Nightblade (Beadle) on Aug 01, 2002 at 15:58 UTC
    print "$1\n$2\n" if($line =~ m/^Update Status: (\w+) \(view: (.+)\)$/);
Re: regex on a line
by fglock (Vicar) on Aug 01, 2002 at 16:02 UTC

    You have an "(" before "view", that is not matched by \w

Re: regex on a line
by CukiMnstr (Deacon) on Aug 01, 2002 at 16:18 UTC
    hm. let's see: i don't think you need the non-backreferencing parentheses around 'Update Status'. also, when you are looking for $view, you get $1 again, and that is still 'open'.

    you might get what you want with a simple split(), but maybe you have those '\s*' because sometimes you don't have whitespace between the fields...
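a rough sketch of the split() idea, for what it's worth. it assumes the fields are always separated by at least one space, colon or parenthesis, which may not hold for the real data; the field positions 2 and 4 are specific to this one line layout.

```perl
use strict;
use warnings;

my $line = 'Update Status: open (view: pa_lden-cr_q00457655_EFLT)';

# Split on runs of whitespace, colons and parentheses.
# The trailing empty field after the final ")" is dropped by split.
my @fields = split /[\s:()]+/, $line;

# For this line the fields are:
# Update, Status, open, view, pa_lden-cr_q00457655_EFLT
my ($target, $view) = @fields[2, 4];

print "$target\n$view\n";
```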

    you could use only one regex, and with it capture both fields:

    m/^Update Status:\s*?(\w+)\s*?\(\w+:?\s*?([\w-]+)\)/

    and then:

    ($target, $view) = ($1, $2);

    hope this helps,

Node Type: perlquestion [id://186807]
Approved by TStanley