avik has asked for the wisdom of the Perl Monks concerning the following question:

I'm very confused about the usage of lookaround assertions. What I have is this:
my $str = '<SELECT STATE> <OPTION VALUE="AZ">Arizona</OPTION> <OPTION VALUE="GA" SELECTED>Georgia</OPTION> <OPTION VALUE="FL">Florida</OPTION> </SELECT>';
How do I capture only "GA" in this case? Many thanks for any help.

Re: lookaround assertions
by ccn (Vicar) on Apr 01, 2004 at 07:28 UTC
    my ($ga) = $str =~ /VALUE="([^"]+)" SELECTED/;

      That regex should be made case-insensitive (add the /i modifier), since HTML tag and attribute names are case-insensitive. Note also that the regex still assumes that the VALUE attribute and the SELECTED attribute appear in that order, that the whitespace is exactly as expected, and that double quotes are used.

      These kinds of assumptions are only valid if you are absolutely sure of the format of the HTML; otherwise you will need to parse it properly.
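
      Since the question was about lookaround, here is a rough sketch of how a lookahead could relax the ordering and quoting assumptions. The sub name selected_value_regex is made up for illustration, and the pattern is still a heuristic: it assumes attribute values never contain ">" or the word "selected".

```perl
use strict;
use warnings;

# Hypothetical helper: return the VALUE of the first OPTION tag that
# also carries SELECTED, regardless of attribute order or quote style.
sub selected_value_regex {
    my ($html) = @_;
    my ($value) = $html =~ m{
        <option\b                 # start of an OPTION tag (any case, thanks to the i flag)
        (?=[^>]*\bselected\b)     # lookahead: SELECTED occurs before the tag closes
        [^>]*?\bvalue\s*=\s*      # skip ahead to VALUE= inside the same tag
        ["']([^"']*)["']          # capture the quoted value (single or double quotes)
    }xi;
    return $value;                # undef if nothing matched
}

my $str = '<SELECT STATE> <OPTION VALUE="AZ">Arizona</OPTION> <OPTION VALUE="GA" SELECTED>Georgia</OPTION> <OPTION VALUE="FL">Florida</OPTION> </SELECT>';
print selected_value_regex($str), "\n";
```

      The lookahead only checks that SELECTED is somewhere in the tag without consuming anything, so the rest of the pattern can then find VALUE whether it comes before or after SELECTED.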

        All these complications... I was just thinking of doing
        my ($ga) = ($str =~ /(\bGA\b)/);
        Of course, while this solves the question in the post, it may not do what the author wants ;).
Re: lookaround assertions
by nmcfarl (Pilgrim) on Apr 01, 2004 at 18:13 UTC
    If this isn't a one-off, I'd really look into HTML::Parser. It makes retrieving tag attributes quite simple, and it's designed to do this job. Of course it could be overkill, but I tend to think using a module makes for more readable code than regex abuse.
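
    A minimal sketch of what that might look like, assuming HTML::Parser is installed from CPAN and using its version 3 event API. The sub name selected_values is invented for this example. By default the parser lowercases tag and attribute names, and a bare attribute such as SELECTED still shows up as a key in the attribute hash.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: collect the VALUE of every OPTION tag marked SELECTED.
sub selected_values {
    my ($html) = @_;
    my @selected;
    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tagname, $attr) = @_;
                # Tag and attribute names arrive lowercased by default.
                push @selected, $attr->{value}
                    if $tagname eq 'option'
                    && exists $attr->{selected}
                    && defined $attr->{value};
            },
            'tagname, attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;    # signal end of input so pending text is flushed
    return @selected;
}

my $str = '<SELECT STATE> <OPTION VALUE="AZ">Arizona</OPTION> <OPTION VALUE="GA" SELECTED>Georgia</OPTION> <OPTION VALUE="FL">Florida</OPTION> </SELECT>';
print join(', ', selected_values($str)), "\n";
```

    This does not depend on attribute order, quoting, or whitespace, which is the main readability argument for the module over a regex.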
      No, actually this particular project required quite an elegant regex. I ended up not mixing the regexes for input and select tags together; instead I wrote separate routines for the select tags. So far so good. Thank you for all the suggestions!