in reply to reg expression needs a tweek

The character immediately preceding could be anything

If that were true, how could anyone tell where the real string begins? Surely some characters are not allowed there, probably digits and/or a-z.
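If some characters really are excluded there, a negative lookbehind is one way to say so in the regex. This is only a sketch: the token pattern `ABC\d+` and the sample lines are made up for illustration, and the excluded class `[0-9a-z]` is my guess from above, not something taken from your data.

```perl
use strict;
use warnings;

# Hypothetical input: we want tokens like "ABC123", but only when they
# are not glued onto a preceding digit or lowercase letter.
my @lines = (
    'id=ABC123;',        # preceded by "=", so a match is allowed
    'xABC123',           # preceded by "x", so the match is rejected
    'ABC123 at start',   # nothing precedes it, so a match is allowed
);

# (?<![0-9a-z]) is a negative lookbehind: it asserts that the character
# just before the match is not a digit or a-z, without consuming it.
my $re = qr/(?<![0-9a-z])(ABC\d+)/;

my @found;
for my $line (@lines) {
    push @found, $1 if $line =~ $re;
}

print "$_\n" for @found;
```

Because the lookbehind does not consume anything, it also succeeds at the very start of the string, so you do not need a separate `^` alternative.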

If you have problems installing XML::Simple, maybe try XML::Twig or some other XML parser.