cspctec has asked for the wisdom of the Perl Monks concerning the following question:

I have data that looks like this:
<ID="user_one" event="open(2)">
<ID="user_one" event="system booted">
<ID="user_one" event="init(1m)">

I need a regex that returns whatever is inside the quotes of the event attribute (for example event="open(2)"), but the match should stop at an opening parenthesis and should not stop at a space. Example output of the regex would be:

open
system booted
init
I don't need the stuff in the parentheses. I have come up with this:
$match =~ /event="(.+?)"/
I don't know how to get it to stop at the parenthesis but not stop if it hits a space.

Re: Help with this simple regex
by toolic (Bishop) on May 28, 2013 at 19:41 UTC
    This gives you the output you're looking for:
    use warnings;
    use strict;

    while (my $match = <DATA>) {
        $match =~ /event="([^(]+).*"/;
        print "$1\n";
    }

    __DATA__
    <ID="user_one" event="open(2)">
    <ID="user_one" event="system booted">
    <ID="user_one" event="init(1m)">

      I modified your regular expression pattern slightly to avoid matching beyond the closing quote.

      use warnings;
      use strict;

      while (my $line = <DATA>) {
          while ($line =~ m/event="([^("]+)[^"]*"/gi) {
              print "$1\n";
          }
      }

      __DATA__
      <ID="user_one" event="open(2)">
      <ID="user_one" event="system booted">
      <ID="user_one" event="system booted"><ID="user_one" EVENT="init(1m)">
      <ID="user_one" event="init(1m)">
      <ID="" event="">
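
To see what "matching beyond the closing quote" means in practice, here is a minimal sketch that runs both patterns from this thread against one line holding two tags. The input line and the variable names `$loose` and `$tight` are made up for illustration; only the two patterns come from the replies above.

```perl
use strict;
use warnings;

# Hypothetical input: two tags on one line, only the second contains a paren.
my $line = '<ID="a" event="system booted"><ID="b" event="init(1m)">';

# The class [^(] excludes only "(", so the greedy capture can run
# past the first closing quote and into the next tag.
my ($loose) = $line =~ /event="([^(]+).*"/;

# The class [^("] excludes both "(" and the double quote, so the
# capture has to end inside the first pair of quotes.
my ($tight) = $line =~ /event="([^("]+)[^"]*"/;

print "loose: $loose\n";
print "tight: $tight\n";
```

With one tag per line, as in the original question, both patterns capture the same text. The difference only shows up when a quoted value without a paren is followed by more text containing one.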
Re: Help with this simple regex
by davido (Cardinal) on May 28, 2013 at 19:46 UTC

    my $re = qr/
        \b event \s* = \s* "   # Match word boundary, "event",
                               # optional space, "=", optional
                               # space, and a double-quote.
        ( [^"(]+ )             # Capture any quantity of anything
                               # that is not a double-quote or an
                               # open paren.
        ["(]                   # Anchor to another double-quote or
                               # an opening paren.
    /x;                        # Extended pattern (literal whitespace
                               # and comments aren't significant).

    if( $match =~ $re ) {
        print "Captured: [[$1]]]]";
    }

    Dave
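
As a rough check of the pattern above, the sketch below applies it to the sample tags from the question plus one empty event. The helper name `event_name` is hypothetical and not part of the original reply; the pattern is the same one with its comments dropped for brevity.

```perl
use strict;
use warnings;

# Same pattern as in the reply above, written compactly.
my $re = qr/ \b event \s* = \s* " ( [^"(]+ ) ["(] /x;

# event_name is a hypothetical helper: it returns the captured
# event text, or undef when the tag does not match.
sub event_name {
    my ($tag) = @_;
    return $tag =~ $re ? $1 : undef;
}

my @tags = (
    '<ID="user_one" event="open(2)">',
    '<ID="user_one" event="system booted">',
    '<ID="user_one" event="init(1m)">',
    '<ID="" event="">',
);

for my $tag (@tags) {
    my $name = event_name($tag);
    print defined $name ? "$name\n" : "(no match)\n";
}
```

Because the capture group uses `+`, an empty event="" does not match at all, so the caller has to handle the undefined case.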

Re: Help with this simple regex
by Anonymous Monk on May 28, 2013 at 21:24 UTC
      Ummm...I thought illegible and unmaintainable code was the whole point of coding in RegEx.

      Obfuscation, the aura of inscrutable genius, and job security :)

      Dyslexics Untie !!!