in reply to Moving sub-strings to the start of a line
First of all, those anchors are broken. They are missing an opening quote and need either to be self-closed or explicitly closed.
Second, it is a bad idea in general to try to parse HTML with regexes. HTML is not necessarily regular markup and cannot be reliably parsed with a regular expression.
However, within those constraints, something like the following should do what you ask, with either self-closed or explicitly closed anchors.
use warnings;
use strict;

while (my $line = <DATA>){
    # Collect every empty anchor on the line (self-closed or explicitly closed).
    my @anchors = $line =~ m#<a name="[^>]+?(?:/>|></a>)#g;
    # Strip those same anchors from the line.
    $line =~ s#<a name="[^>]+?(?:/>|></a>)##g;
    # Print the anchors first, then the rest of the line.
    print @anchors,$line;
}
__DATA__
Self: This is an anchor <a name="_Toc00123998" />and these are another <a name="_Toc00123999" /><a name="_Toc00124000" /> couple.
Explicit: This is an anchor <a name="_Toc00123998">no cap</a>and these are another<a name="_Toc00123999"></a><a name="_Toc00124000" /></a>couple.
Update: modified to not capture anchors with enclosed text.
Update2: Arggh. Try that one more time
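To make the caveat about regexes concrete, here is a small sketch of inputs that the pattern from the snippet quietly skips. The helper name count_anchors and the sample strings are made up for illustration; only the pattern itself comes from the code above.

```perl
use strict;
use warnings;

# Same pattern as in the snippet, kept in a variable so it can be reused.
my $anchor = qr#<a name="[^>]+?(?:/>|></a>)#;

# Hypothetical helper: how many anchors does the pattern find in a string?
sub count_anchors {
    my ($text) = @_;
    my @found = $text =~ /$anchor/g;
    return scalar @found;
}

# Made-up sample inputs, not taken from the original question.
my @samples = (
    '<a name="_Toc1" />plain',              # the form the pattern expects
    '<a id="x" name="_Toc2" />reordered',   # another attribute comes first
    '<A NAME="_Toc3" />upper case',         # tag and attribute in upper case
);

printf "%d match(es): %s\n", count_anchors($_), $_ for @samples;
```

Only the first sample is counted. The other two are anchors any browser would accept, but the pattern requires the literal text <a name=" and so passes over them. If your input is machine-generated and always has the same shape, that may be fine; otherwise a real HTML parser is the safer route.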