Your current parser is dying because it is trying to sew jeans with a knitting needle.

Regexen are not the right tool for parsing languages in which the meaning of a token depends heavily on context, or in which tokens come in recursively nested pairs. XML has both of those features. Your regex isn't working anymore because it is having difficulty determining the context of the greater-than and less-than signs it is trying to replace.
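To make the context problem concrete, here is a minimal sketch in Python rather than Perl, using only the standard library. The sample document, the `item` element name, and the helper functions are all made up for illustration; they are not taken from your code. The point is only that a ">" inside an attribute value and same-named nested elements are enough to derail a plausible-looking regex, while a real parser keeps track of both.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: a ">" inside an attribute value, plus same-named nested elements.
SAMPLE = '<item note="a > b"><item>inner</item> tail</item>'

def regex_body(xml_text):
    """Naive approach: text between the first <item ...> and the next </item>."""
    match = re.search(r'<item[^>]*>(.*?)</item>', xml_text)
    return match.group(1)

def parsed_view(xml_text):
    """Parser approach: the parser tracks quoting context and nesting for us."""
    root = ET.fromstring(xml_text)
    inner = root[0]
    return root.get('note'), inner.text, inner.tail

def replace_inner_text(xml_text, new_text):
    """Replace the nested element's text; the serializer escapes it on output."""
    root = ET.fromstring(xml_text)
    root[0].text = new_text
    return ET.tostring(root, encoding='unicode')

# The regex stops at the ">" in the attribute and at the wrong closing tag.
print(repr(regex_body(SAMPLE)))
# The parser reports the attribute value, the inner text, and the trailing text.
print(parsed_view(SAMPLE))
# Replacement text containing "<" is escaped when the tree is written back out.
print(replace_inner_text(SAMPLE, 'x < y'))
```

The regex version treats the first ">" it sees as the end of the opening tag and the first closing tag as the matching one, which is exactly the sort of context it has no way to know. The same idea carries over to whichever Perl XML parser you settle on.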

I will grant you that you can get a regex-based parser to work for a controlled set of input, but no matter how hard you try it will be fragile. And the more you try to make it work, the more difficult it will be to explain and maintain those regexen.

Rewriting something that you trust is never fun, especially if you are fighting deadlines, but the problems you are having signal that you have outgrown the capacity of your old tools and it is time to move on to better tools.

Best, beth


In reply to Re^3: Regular expression to replace xml data by ELISHEVA
in thread Regular expression to replace xml data by dalegribble
