PerlMonks  

Re: Parsing HTML tags with regex

by gopalr (Priest)
on Nov 11, 2005 at 08:50 UTC ( [id://507666]=note )


in reply to Parsing HTML tags with regex

Hi jithoosin,

Here is a regex to match a tag together with its attribute values.

m#<([^">]+(?:"[^"]+")*[^>]+)>#
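A minimal sketch of how the regex above might be used to pull out the text between < and >. The sample string and the variable names ($tag_re, $html, $inside) are assumptions for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# The regex from the post, stored so it can be reused.
my $tag_re = qr#<([^">]+(?:"[^"]+")*[^>]+)>#;

# Hypothetical sample input, not from the thread.
my $html = '<a href="index.html" class="nav">';

my $inside;
if ($html =~ $tag_re) {
    $inside = $1;    # everything between < and >
    print "tag contents: $inside\n";
}
```

For this input the capture holds the tag name and both attributes.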

Thanks,
Gopal.R

Replies are listed 'Best First'.
Re^2: Parsing HTML tags with regex
by Perl Mouse (Chaplain) on Nov 14, 2005 at 11:22 UTC
    But that would match on:
    a < b implies b > a
    which does not contain an HTML tag. Oh, and it won't match all HTML tags correctly either. Consider for instance:
    <tag attr1="one" attr2="two">
    <tag attr='"'> <tag attr1='"'>
    The first one fails to match because your regex requires that if there are double quoted values inside a tag, they must follow each other. And the second fails because your regex doesn't consider single quoted values.
    Perl --((8:>*
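The cases listed in this reply can be checked by running the regex over them. This is only a sketch; the variable names are assumptions. Run as written, the first tag does still produce a match, because the trailing [^>]+ also accepts quote characters. The more visible failure is that the two single-quoted tags come back as one match, since the double quote inside the first single-quoted value is treated as the start of a double-quoted string.

```perl
use strict;
use warnings;

my $tag_re = qr#<([^">]+(?:"[^"]+")*[^>]+)>#;

# Not HTML at all, yet the regex matches from '<' to '>'.
my ($false_hit) = 'a < b implies b > a' =~ $tag_re;

# The three tags from the reply, on one line.
my $tags = q{<tag attr1="one" attr2="two"> <tag attr='"'> <tag attr1='"'>};
my @found = $tags =~ /$tag_re/g;    # list of $1 values, one per match

print "false positive: [$false_hit]\n" if defined $false_hit;
print scalar(@found), " match(es) for three tags:\n";
print "  [$_]\n" for @found;
```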
Re^2: Parsing HTML tags with regex
by Anonymous Monk on Jan 19, 2012 at 09:11 UTC
    Thanks gopal, the above regex was useful.
Re^2: Parsing HTML tags with regex
by jithoosin (Scribe) on Nov 11, 2005 at 09:23 UTC
    Hi gopal,
    THANK YOU VERY MUCH. I won the bet. But now I am in a bit of trouble: I do not know how to explain how it works to my friends. So could you PLEASE explain how the regular expression works? Once again, THANK YOU VERY MUCH, GOPAL.
      m{
          <              # start with <
          (              # group start
          [^">]+         # text, but not " or >
          (?:"[^"]+")*   # if " is found, match up to the closing quote; optional
          [^>]+          # text, but not >
          )              # group end
          >              # end with >
      }x
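One consequence of this breakdown is worth seeing in action: both [^">]+ and [^>]+ need at least one character each, so the regex only matches when there are at least two characters between < and >. The sketch below uses hypothetical inputs to show this.

```perl
use strict;
use warnings;

my $tag_re = qr#<([^">]+(?:"[^"]+")*[^>]+)>#;

# Hypothetical inputs chosen to show the two-character minimum.
my %matched;
for my $tag ('<p>', '<br>', '<a href="x">') {
    $matched{$tag} = $tag =~ $tag_re ? 1 : 0;
    printf "%-14s %s\n", $tag,
        $matched{$tag} ? "matches, \$1 = [$1]" : 'no match';
}
```

Here `<p>` is rejected, while `<br>` and `<a href="x">` are accepted.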
        Hi gopal,
        There is a problem if the input string is
        my $line = "<select name=\"url><23\" style=\"width><A125px\"  >";
        But I think a slight modification to your previous answer will do the job:
        #$line =~ m#<([^">]+(?:"[^"]+")*)*[^>]+>#;
        Is there any problem with it? Also, please give me an answer to the question about the <!--->---> thing.
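To see what each variant does with this input, the sketch below runs the original regex, the modified one, and a third pattern. The third pattern ($quote_aware) is an assumption of mine, not something proposed in the thread: it requires a letter (or / and a letter) right after <, and treats single-quoted and double-quoted values as units. It does not attempt to handle comments such as <!--->--->.

```perl
use strict;
use warnings;

my $line = "<select name=\"url><23\" style=\"width><A125px\"  >";

# Original regex: the match ends at the '>' inside the second quoted value.
my $orig_match = $line =~ m#<([^">]+(?:"[^"]+")*[^>]+)># ? $& : '';

# Modified regex from the post: the whole tag matches, but $1 keeps only
# the last repetition of the repeated group, not the full tag body.
my ($mod_match, $mod_capture) = ('', '');
if ($line =~ m#<([^">]+(?:"[^"]+")*)*[^>]+>#) {
    ($mod_match, $mod_capture) = ($&, $1);
}

# Assumed alternative (hypothetical): tag name first, then any mix of
# plain characters, "double-quoted" values, and 'single-quoted' values.
my $quote_aware = qr{<(/?[A-Za-z](?:[^>"']|"[^"]*"|'[^']*')*)>};
my ($body) = $line =~ $quote_aware;

print "original match : [$orig_match]\n";
print "modified match : [$mod_match]\n";
print "modified \$1    : [$mod_capture]\n";
print "quote-aware \$1 : [$body]\n";
```

The repeated group in the modified regex gets the overall match right for this input, but anything that relies on $1 would need the capture moved around the whole repetition.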
