in reply to Re: parsing reserved chars with xml::simple
in thread parsing reserved chars with xml::simple

s/&(?!amp|quot|apos|lt|gt)/&amp;/g; may work for you in some simple cases. This replaces each ampersand that is not immediately followed by something that looks like a predefined entity name with an escaped version.
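As a rough sketch of the same idea, the lookahead can be tightened so that it also requires the trailing semicolon and skips numeric character references. The sub name escape_bare_ampersands and the sample string are made up for illustration, not taken from the original post.

```perl
use strict;
use warnings;

# Escape bare ampersands, leaving existing named or numeric references alone.
# Requiring the trailing ';' keeps text such as "&amplifier" from being
# mistaken for the entity "&amp;".
sub escape_bare_ampersands {
    my ($text) = @_;
    $text =~ s/&(?!(?:amp|quot|apos|lt|gt|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $text;
}

# Hypothetical sample input
print escape_bare_ampersands('pass&word &amp; R&D &#38; &amplifier'), "\n";
```

This still only handles ampersands; it does nothing about stray < or > in the data.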

Re: Re: Re: parsing reserved chars with xml::simple
by bear0053 (Hermit) on Feb 17, 2004 at 19:28 UTC
    that works for & but it won't work for <,>,',"
    regex are:
    $text =~ s/&(?!amp|quot|apos|lt|gt)/!38/g;
    $text =~ s/"(?!amp|quot|apos|lt|gt)/!34/g;
    $text =~ s/<(?!amp|quot|apos|lt|gt)/!40/g;
    $text =~ s/>(?!amp|quot|apos|lt|gt)/!41/g;
    $text =~ s/'(?!amp|quot|apos|lt|gt)/!39/g;
    when i run that on my xml i get:
    !40?xml version=!341.0!34 encoding=!34UTF-8!34?!41 !40TRANSACTION!41 !40FIELDS!41 !40FIELD KEY=!34user!34!41name!40/FIELD!41 !40FIELD KEY=!34password!34!41pass!38word!40/FIELD!41 !40FIELD KEY=!34operation_type!34!41!40do_what!41!40/FIELD!41 !40/FIELDS!41 !40/TRANSACTION!41
    so i am still in the same spot: i need to be able to manipulate only the data in between > and <

    thanks again for your help
      What is !41? If you are going to escape the characters, you should use the right escape values. You can use either the entities &amp;, &gt;, &lt;, &quot;, or the numeric character references &#38;, &#62;, &#60;, &#34;.
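One way to touch only the data and not the markup, sketched here under the assumption that every value sits inside a <FIELD ...>...</FIELD> pair (as in the sample above) and never contains the literal string </FIELD>. The sub names escape_text and escape_field_values are hypothetical. A regex like this is a workaround for malformed input, not a general XML parser.

```perl
use strict;
use warnings;

my %ENTITY = ('<' => '&lt;', '>' => '&gt;');

# Escape markup characters in a piece of element text.
# Ampersands go first so the '&' in a freshly added &lt; is not escaped again.
sub escape_text {
    my ($text) = @_;
    $text =~ s/&(?!(?:amp|quot|apos|lt|gt|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    $text =~ s/([<>])/$ENTITY{$1}/g;
    return $text;
}

# Rewrite only the text between <FIELD ...> and </FIELD>; the tags stay intact.
sub escape_field_values {
    my ($xml) = @_;
    $xml =~ s{(<FIELD\b[^>]*>)(.*?)(</FIELD>)}{
        # Copy the captures before calling a sub that runs its own regexes.
        my ($open, $body, $close) = ($1, $2, $3);
        $open . escape_text($body) . $close;
    }gse;
    return $xml;
}

my $xml = '<FIELD KEY="password">pass&word</FIELD> '
        . '<FIELD KEY="operation_type"><do_what></FIELD>';
print escape_field_values($xml), "\n";
```

With the sample string, the password value becomes pass&amp;word and the operation_type value becomes &lt;do_what&gt;, while the FIELD tags and their KEY attributes are left as they were. The result can then be handed to XML::Simple as usual.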