in reply to How to extract a pattern in Perl regex?
G'day SergioQ,
If the data you're dealing with is HTML, then '<alt img="....">' is invalid. I suspect this is meant to be the img element, which may look like any of these:
<img alt="..." src="...">
<img src="..." alt="...">
<img src="...">
or any number of other variations including a variety of other attributes (id="...", class="...", and so on) which could appear anywhere between '<img' and '>'; it may have '/>', instead of '>', at the end.
Even if it's not HTML (perhaps it's XML), you'll likely hit the same problem: a regex that expects the attributes in a fixed order stops matching as soon as the order changes. This is why you've been advised against using regular expressions for this type of work.
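To make the ordering problem concrete, here's a minimal sketch using core Perl only. The sample tags, their attribute values and the $naive pattern are all made up for illustration; they're not taken from your data.

```perl
use strict;
use warnings;

# Hypothetical sample tags; the attribute values are invented for this demo.
my @tags = (
    '<img alt="logo" src="logo.png">',
    '<img src="logo.png" alt="logo">',
    '<img id="x" class="y" src="logo.png" alt="logo" />',
    '<img src="logo.png">',
);

# A pattern that assumes alt is the first attribute after '<img'.
my $naive = qr/<img alt="([^"]*)"/;

my @found;
for my $tag (@tags) {
    # Record the captured alt text, or undef when the pattern fails.
    push @found, ($tag =~ $naive ? $1 : undef);
}

for my $i (0 .. $#tags) {
    printf "%-55s => %s\n", $tags[$i], $found[$i] // '(no match)';
}
```

Only the first tag matches. The second and third do have an alt attribute, but it isn't where the pattern expects it; patching the regex to cope with every permitted variation quickly gets out of hand.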
I strongly recommend you take a look at "Parsing HTML/XML with Regular Expressions". This expands on the issues and provides many alternatives: you'd do well to choose one of these.
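As a rough illustration of the parser-based approach, here's a sketch using HTML::TokeParser from CPAN. That module is my pick for the example and is only one of several options; the input HTML is invented, and I'm assuming you want the alt text.

```perl
use strict;
use warnings;
use HTML::TokeParser;    # CPAN module, part of the HTML-Parser distribution

# Hypothetical input; the attribute values are invented for this demo.
my $html = <<'HTML';
<p><img alt="logo" src="logo.png">
<img src="photo.jpg" class="wide" alt="photo" />
<img src="spacer.gif"></p>
HTML

my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my @alts;
# get_tag('img') skips ahead to the next <img ...> start tag and returns
# an array ref; element [1] is a hash ref of that tag's attributes.
while (my $tag = $parser->get_tag('img')) {
    my $attr = $tag->[1];
    push @alts, $attr->{alt} // '(no alt)';
}

print "$_\n" for @alts;
```

Attribute order, extra attributes and a trailing '/>' make no difference here, because the parser hands you the attributes by name.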
— Ken