in reply to (**corrected**) What Does This Line Do?

<H1>(<[A-Za-z0-9]+>)
matches the ABCD in <H1>ABCD</H1>?

I think not. Remove the < and > inside the parentheses, and add a closing H1 tag to clarify, i.e.:

<H1>([A-Za-z0-9]+)</H1>

Or, if you don't mind it picking up underscores as well as letters and numbers, you can use the much more succinct:

<H1>(\w+)</H1>
\w matches a word character (a letter, digit, or underscore). I suggest you do a little search on parsing HTML in the Super Search, or look at the HTML::Parser module.
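
To see the difference between the three patterns, here is a minimal sketch. The sample strings and variable names are mine, made up for illustration; only the patterns come from the post.

```perl
use strict;
use warnings;

my $html = '<H1>ABCD</H1>';

# Original pattern: expects another tag (e.g. <B>) right after <H1>,
# so it does not match a plain heading.
my ($orig) = $html =~ m!<H1>(<[A-Za-z0-9]+>)!;

# Corrected pattern: letters and digits between opening and closing tags.
my ($fixed) = $html =~ m!<H1>([A-Za-z0-9]+)</H1>!;

# Shorter form: \w also accepts the underscore.
my ($word) = '<H1>AB_CD</H1>' =~ m!<H1>(\w+)</H1>!;

print defined $orig ? "original: $orig\n" : "original: no match\n";
print "fixed: $fixed\n";
print "word: $word\n";
```

Note that none of these match a heading containing spaces or punctuation, which is one more reason to reach for a parser.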

You may also want to add modifiers to your regex, i.e.:

if ( m!$catreg!is ) {
The i makes it case insensitive (picks up h1 and H1); the s makes . match newlines as well, so the regex treats the whole string/page as one line, matching H1s created by that lovely editor, Dreamweaver, e.g.:
<h1>I am a heading created by Dreamweaver</h1>
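
A rough sketch of what the two modifiers change. The value of $catreg and the line break inside the heading are assumptions of mine for the demo. Keep in mind that s only affects the dot, so a pattern built purely from \w+ or a character class behaves the same with or without it.

```perl
use strict;
use warnings;

# Hypothetical pattern for illustration; the . is what /s affects.
my $catreg = '<h1>(.+?)</h1>';

# Assumed input: an upper-case tag with a line break inside the heading.
my $page = "<H1>I am a heading\ncreated by Dreamweaver</H1>";

my ($plain) = $page =~ m!$catreg!;    # fails: tag case differs
my ($ci)    = $page =~ m!$catreg!i;   # fails: . will not cross the newline
my ($both)  = $page =~ m!$catreg!is;  # matches across the newline

print "no modifiers: ", (defined $plain ? 'match' : 'no match'), "\n";
print "i only: ",       (defined $ci    ? 'match' : 'no match'), "\n";
print "i and s: ",      (defined $both  ? 'match' : 'no match'), "\n";
```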

<rant>
(not that Dreamweaver users ever seem to use <Hn> when <p><b><font size=6> will do instead :)
</rant>

cLive ;-)

Update: I missed that you were matching a possible tag before the match you use (see below). I strongly suggest you look at a parsing module if you don't know whether tags will contain tags or not!

Re: Re: (**corrected**) What Does This Line Do?
by runrig (Abbot) on Jun 07, 2001 at 00:32 UTC
    The regex also appears to be attempting to skip any tags immediately following the opening H1 tag, and not requiring a closing H1 tag (is a closing tag required for H1?).
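
    One way that reading could be written out, as a sketch only: the pattern and the heading_text helper below are my guess at the intent, not the original poster's code. It skips any simple opening tags straight after the H1 and does not ask for a closing tag.

```perl
use strict;
use warnings;

# Hypothetical pattern: skip a run of attribute-less tags after <H1>,
# then capture text up to the next tag. No closing </H1> is required.
my $re = qr!<H1>(?:<[A-Za-z0-9]+>)*([^<]+)!i;

# Hypothetical helper: returns the captured heading text, or undef.
sub heading_text {
    my ($html) = @_;
    return $html =~ $re ? $1 : undef;
}

for my $html ('<H1><B>ABCD</B></H1>', '<h1>ABCD', '<H1><FONT size=6>ABCD</FONT></H1>') {
    my $text = heading_text($html);
    print "$html => ", (defined $text ? $text : 'no match'), "\n";
}
```

    The last sample fails because the tag carries an attribute, which is the kind of case a parsing module handles and a quick regex does not.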