in reply to How to extract a pattern in Perl regex?

G'day SergioQ,

If the data you're dealing with is HTML, then '<alt img="....">' is invalid. I suspect this is meant to be the img element, which may look like any of these:

<img alt="..." src="...">
<img src="..." alt="...">
<img src="...">

or any number of other variations including a variety of other attributes (id="...", class="...", and so on) which could appear anywhere between '<img' and '>'; it may have '/>', instead of '>', at the end.
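
To make the ordering problem concrete, here's a minimal sketch using only core Perl. The sample tags, the naive_alt() helper and the pattern inside it are my own illustrations, not taken from your post; the pattern simply assumes alt comes straight after '<img'.

```perl
use strict;
use warnings;

# Hypothetical sample tags, modelled on the variations described above.
my @tags = (
    '<img alt="a cat" src="cat.png">',
    '<img src="cat.png" alt="a cat">',
    '<img src="cat.png">',
    '<img id="x" class="y" alt="a cat" src="cat.png" />',
);

# Hypothetical helper: returns the alt text only if alt is the
# first attribute after '<img'; otherwise returns undef.
sub naive_alt {
    my ($tag) = @_;
    return $tag =~ /<img alt="([^"]*)"/ ? $1 : undef;
}

for my $tag (@tags) {
    my $alt = naive_alt($tag);
    printf "%-9s %s\n", (defined $alt ? 'matched:' : 'no match:'), $tag;
}
```

Only the first tag matches. The second and fourth both carry an alt attribute, but the pattern misses them because the attribute isn't where the regex expects it.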

Even if it's not HTML (perhaps it's XML), you'll likely have the same problem: you can't rely on attributes appearing in any expected order. This is why you've been advised against using regular expressions for this type of work.

I strongly recommend you take a look at "Parsing HTML/XML with Regular Expressions". This expands on the issues and provides many alternatives: you'd do well to choose one of these.
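
As a rough illustration of what a parser-based approach can look like, here's a minimal sketch using HTML::TokeParser from the CPAN HTML-Parser distribution. That's my own pick for the example and it isn't a core module, so you'd need it installed. The img_alts() helper and the sample HTML are hypothetical; adapt them to your real data.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: returns the alt text of every img element
# that has one, wherever the attribute appears within the tag.
sub img_alts {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser for HTML string";
    my @alts;
    # get_tag('img') skips ahead to each <img ...> start tag in turn.
    while (my $token = $parser->get_tag('img')) {
        my $attr = $token->[1];    # hash ref: attribute name => value
        push @alts, $attr->{alt} if defined $attr->{alt};
    }
    return @alts;
}

# Assumed sample input covering the variations discussed above.
my $html = join ' ',
    '<img alt="first" src="a.png">',
    '<img src="b.png" alt="second">',
    '<img src="c.png">',
    '<img id="x" class="y" alt="third" src="d.png" />';

print "$_\n" for img_alts($html);
```

This prints first, second and third, one per line, and skips the img that has no alt. The attribute order and the trailing '/>' make no difference here because the parser, not a pattern, works out where each attribute begins and ends.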

— Ken