in reply to Regular Expression

See this node: Regular Expressions for almost the exact same question.
Why can't you use modules? The most robust way will be something like HTML::Parser -- look specifically at the examples section for extracting the <title> tag.
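As a rough sketch of the module route (not the code from the HTML::Parser examples section), the following collects the attributes of the body start tag instead of the title text. The sample document and the `%body_attr` variable are made up for illustration, and it assumes HTML::Parser is installed from CPAN.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical sample input; the onLoad value contains a > on purpose.
my $html = q{<html><head><title>t</title></head>}
         . q{<body bgcolor="white" onLoad="if (a > b) go()"><p>hi</p></body></html>};

my %body_attr;    # filled in by the start-tag handler below

my $parser = HTML::Parser->new(
    api_version => 3,
    start_h     => [
        sub {
            my ($tagname, $attr) = @_;
            # Tag names arrive lowercased; keep a copy of the body attributes.
            %body_attr = %$attr if $tagname eq 'body';
        },
        'tagname, attr',
    ],
);

$parser->parse($html);
$parser->eof;

print "$_ = $body_attr{$_}\n" for sort keys %body_attr;
```

With the default settings the attribute names are lowercased too, so the handler sees `onload` rather than `onLoad`.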

For a one-time quick & dirty solution, use a regex (this assumes, of course, that there isn't a > in the onLoad JavaScript):
if( $html =~ /<body (.*?)>/si ){ my $body_attributes = $1; }
Maybe something like this will help guard against javascript screwing up the match, but assumes proper quoting of the attributes:
/<body((?:\s+(?:\w+=".*?"))*)>/si
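To see how the two patterns differ, here is a small sketch that runs both against one made-up body tag in which every attribute is double-quoted and the onLoad value contains a >. The sample string and the variable names `$naive` and `$guarded` are only for illustration; this shows the behaviour on this one input, not in general.

```perl
use strict;
use warnings;

# Hypothetical sample: double-quoted attributes, with a > inside onLoad.
my $html = q{<body bgcolor="white" onLoad="if (a > b) go()">};

# First pattern: the non-greedy .*? stops at the first > it meets,
# which here is the one inside the JavaScript.
my ($naive) = $html =~ /<body (.*?)>/si;

# Second pattern: only whitespace-separated name="value" pairs
# may appear between <body and the closing >.
my ($guarded) = $html =~ /<body((?:\s+(?:\w+=".*?"))*)>/si;

print "naive:   [$naive]\n";
print "guarded: [$guarded]\n";
```

On this input the first capture is cut off in the middle of the onLoad value, while the second keeps both attributes (including the leading whitespace).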
Update: added strike and bold after reading/noting ikegami's response

Re^2: Regular Expression
by ikegami (Patriarch) on Jun 28, 2005 at 21:38 UTC
    I was directed to the second regexp in this post as a solution that fixes problems in another post, but it's no better.
    "but assumes proper quoting of the attributes:"

    The HTML spec allows for single quotes, and even allows for the quotes to be omitted in some circumstances, so no, it doesn't assume proper quoting.

    Also, it doesn't handle > inside of quotes (where it doesn't need to be escaped).

    Finally, it could locate <body> inside of a comment or inside of another attribute.
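Some of the objections above can be checked with a short sketch that runs the second regexp against a few made-up inputs: single-quoted attributes, unquoted attributes, and a body tag sitting inside a comment. The sample strings and the `%captured` hash are assumptions for illustration only.

```perl
use strict;
use warnings;

# The second regexp from the parent post, compiled once.
my $re = qr/<body((?:\s+(?:\w+=".*?"))*)>/si;

# Hypothetical inputs, one per objection.
my @samples = (
    [ single_quoted => q{<body bgcolor='white'>} ],
    [ unquoted      => q{<body bgcolor=white>} ],
    [ in_comment    => q{<!-- <body class="old"> --><body class="new">} ],
);

my %captured;    # sample name => captured attribute text, or undef on no match
for my $sample (@samples) {
    my ($name, $html) = @$sample;
    $captured{$name} = $html =~ $re ? $1 : undef;
    printf "%-14s %s\n", $name,
        defined $captured{$name} ? "matched [$captured{$name}]" : 'no match';
}
```

The single-quoted and unquoted tags are not matched at all, and for the last sample the pattern picks up the commented-out tag rather than the real one.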