in reply to Matching optionally quoted string

A simple m{<REF HREF=(.*?)\s*/>} might work here. At least it will work in the case of the "typical line" that you showed.

Of course the really clean way to do it is to use a proper HTML parser, like HTML::Parser, HTML::TokeParser::Simple or HTML::TreeBuilder. HTML::TreeBuilder offers the extract_links method, which will probably be just what you are looking for. It will deal with case problems and absent (or alternate) quotes, and will generally be a lot more robust than what you seem to be doing here. Have a look at Sean M. Burke's book Perl & LWP for more info on processing HTML with Perl.
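
As a rough illustration of the parser route, here is a minimal sketch using HTML::Parser's version 3 API. The helper name ref_hrefs and the sample file names are made up for this example, and it assumes the tags of interest are the REF tags from the question. It relies on the parser's default behaviour of lowercasing tag and attribute names and of stripping whatever quotes surround an attribute value.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: collect the HREF value of every <REF ...> tag,
# whether the value is double-quoted, single-quoted or unquoted.
sub ref_hrefs {
    my ($html) = @_;
    my @hrefs;
    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tagname, $attr) = @_;
                # Tag and attribute names arrive lowercased by default,
                # so <REF HREF=...> and <ref href=...> look the same here.
                push @hrefs, $attr->{href}
                    if $tagname eq 'ref' && defined $attr->{href};
            },
            'tagname, attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;
    return @hrefs;
}

# Sample input (invented for the example).
my @found = ref_hrefs(q{<REF HREF="a.html" /> <ref href='b.html' /> <REF HREF=c.html />});
print "$_\n" for @found;
```

The handler only looks at start tags, so the same sub works if the REF tags are mixed in with other markup.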


Re: (2) Matching optionally quoted string
by CountZero (Bishop) on Dec 14, 2003 at 18:06 UTC
    No, m{<REF HREF=(.*?)\s*/>} doesn't work: it still captures the beginning and ending quotes.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
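
One possible way around the quote problem raised in the reply above, offered as a sketch rather than a definitive fix: capture an optional opening quote separately and require the same character (or nothing) before the closing "/>". The helper name href_of and the sample lines are hypothetical, and the pattern assumes the tag is written exactly as REF HREF= in upper case.

```perl
use strict;
use warnings;

# Hypothetical helper: return the HREF value without its surrounding
# quotes, or undef if the line does not contain a matching tag.
sub href_of {
    my ($line) = @_;
    # (["']?) captures an optional opening quote; \1 then requires the
    # same character (or nothing at all) to close the value.
    return $line =~ m{<REF HREF=(["']?)(.*?)\1\s*/>} ? $2 : undef;
}

# Sample lines (invented for the example): double-quoted,
# single-quoted and unquoted values.
for my $line ('<REF HREF="foo.html" />', "<REF HREF='foo.html'/>", '<REF HREF=foo.html />') {
    my $href = href_of($line);
    print defined $href ? $href : '(no match)', "\n";
}
```

The value ends up in $2 because $1 now holds the quote character, which is why the helper returns $2 rather than $1.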