in reply to Re: Help for regex
in thread Help for regex

Can you please explain "([^<]+)"?

Re^3: Help for regex
by davido (Cardinal) on Apr 01, 2012 at 05:50 UTC

    Certainly. [^...] is a negated character class. If [...] allows you to enumerate what characters WILL match at a given position, [^...] allows you to say 'match any character except for these characters, at this position'.

    Negated character classes are discussed in perlretut under the heading Using character classes.
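
    As a small illustration of the difference between [...] and [^...], here is a minimal sketch. The sample string and variable names are made up for this example and are not from the original question.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample text, chosen only to show the two classes side by side.
    my $text = 'abc<def';

    # [<] matches only a literal '<' at the current position.
    my ($only_lt) = $text =~ /([<])/;

    # [^<] matches any single character that is NOT '<'.
    my ($not_lt) = $text =~ /([^<])/;

    print "Plain class matched: $only_lt\n";      # the '<' itself
    print "Negated class matched: $not_lt\n";     # the first non-'<' character, 'a'
    ```

    The plain class has to skip ahead to the '<', while the negated class is satisfied by the very first character.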

    + is a quantifier. Quantifiers are discussed in perlretut. This one says to match one or more characters that meet the criteria of the preceding character class. The (...) are capturing parentheses, which are also discussed in perlretut. They say to capture whatever happens to match the pattern within. Since this is the first capture, it will be placed in $1.

    Putting it all together: Match one or more characters that are not '<', as many as possible, and capture them into $1. $1 and other capture variables are discussed in perlretut.
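
    To see the capture in action, here is a minimal sketch using the <ID>...</ID> pattern from this thread. The input strings and the variable names are assumptions made for illustration only.

    ```perl
    use strict;
    use warnings;

    # Hypothetical input lines; only the <ID> tags come from the thread's pattern.
    my $line  = '<ID>12345</ID>';
    my $empty = '<ID></ID>';
    my $id;

    if ( $line =~ m{<ID>([^<]+)</ID>} ) {
        # [^<]+ consumed every character up to, but not including, the next '<',
        # and the parentheses stored that text in $1.
        $id = $1;
        print "Captured: $id\n";
    }

    # + requires at least one character, so an empty element does not match.
    my $empty_matched = ( $empty =~ m{<ID>([^<]+)</ID>} ) ? 1 : 0;
    print "Empty element matched: $empty_matched\n";
    ```

    Copying $1 into a named variable right after a successful match is a common habit, since a later successful match will overwrite $1.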

    Now would be a good time to follow my suggestion to read perlretut. ...you are looking to learn about regexes, right? It should take about an hour or two to get the basics.


    Dave

Re^3: Help for regex
by Anonymous Monk on Apr 01, 2012 at 05:41 UTC
    The delimiters matter, so
    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new( qr{<ID>([^<]+)</ID>} )->explain;
    __END__
    The regular expression:

    (?-imsx:<ID>([^<]+)</ID>)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      <ID>                     '<ID>'
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        [^<]+                    any character except: '<' (1 or more
                                 times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      </ID>                    '</ID>'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------