anirudhkumar_r has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
How are you doing today? I am in a little trouble and need your help with regex.
I am replacing some strings in JavaScript. However, I believe the same regex will work for both JavaScript and Perl.

With my current method, I run into a problem of recursive replacement.
I want to do the following:
1. Replace &amp; with &
2. Replace &amp;lt; with &lt;
3. Replace &amp;gt; with &gt;
4. Replace &lt; with <
5. Replace &gt; with >

I tried to include the following code snippets in my JS code.

new_json_string = new_json_string.replace(/&amp;/g, '&');
new_json_string = new_json_string.replace(/&lt;/g, '<');
new_json_string = new_json_string.replace(/&gt;/g, '>');
The problem with the above snippet is that both [&lt;] and [&amp;lt;] ultimately end up as the [<] symbol. Is it possible to combine all three replacements into a single statement to avoid recursive replacement?
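To make the failure concrete, here is a minimal sketch that wraps the three replacements from the snippet in a function. The name decodeAmpFirst is hypothetical and exists only for this illustration.

```javascript
// Same three replacements as in the question, with '&amp;' handled first.
// decodeAmpFirst is a hypothetical name used only for this illustration.
function decodeAmpFirst(text) {
  return text
    .replace(/&amp;/g, '&')  // '&amp;lt;' becomes '&lt;' here ...
    .replace(/&lt;/g, '<')   // ... and is then decoded a second time here
    .replace(/&gt;/g, '>');
}

console.log(decodeAmpFirst('&lt;'));      // prints: <
console.log(decodeAmpFirst('&amp;lt;'));  // prints: <  (wanted: &lt;)
```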
Your help is highly appreciated. Thank you very much in advance.

Re: Avoid recursive replacement using regex
by parv (Parson) on Nov 25, 2014 at 11:34 UTC

    You have the wrong order of replacements. Do replacement 4|5 first; then 1 (2|3 are obtained for free).
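A minimal JavaScript sketch of this reordering, assuming only the three entities from the question need decoding. The function name decodeOrdered is hypothetical.

```javascript
// Decode '&lt;' and '&gt;' first and '&amp;' last, so that the '&' produced
// by the final step is never looked at again by an earlier replacement.
// decodeOrdered is a hypothetical name used only for this sketch.
function decodeOrdered(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&');
}

console.log(decodeOrdered('&amp;lt; &lt; &amp;gt; &gt; me&amp;you'));
// prints: &lt; < &gt; > me&you
```

This works because '&amp;lt;' does not contain the substring '&lt;', so the first two replacements leave it alone and only the last one touches it.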

Re: Avoid recursive replacement using regex
by Athanasius (Archbishop) on Nov 25, 2014 at 12:09 UTC

    Hello anirudhkumar_r,

    Update: parv’s solution is simpler, therefore better!

    To prevent recursive replacement, you can use negative look-behind assertions as follows:

    #! perl
    use strict;
    use warnings;

    my $s = '&amp;lt; &lt; &amp;xyz &amp;gt;abc 123&gt;456 me&amp;you';

    $s =~ s/ (?<!&amp;) &lt; /</gx;
    $s =~ s/ (?<!&amp;) &gt; />/gx;
    $s =~ s/ &amp;lt; /&lt;/gx;
    $s =~ s/ &amp;gt; /&gt;/gx;
    $s =~ s/ &amp; /&/gx;

    print $s;

    Output:

    22:03 >perl 1084_SoPW.pl
    &lt; < &xyz &gt;abc 123>456 me&you
    22:03 >

    See “Look-Around Assertions” in perlre#Extended-Patterns.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Avoid recursive replacement using regex
by AnomalousMonk (Archbishop) on Nov 25, 2014 at 14:34 UTC

    Another way. This has the advantage of being completely independent of the ordering of operations, and more replacement string mappings can easily be added to the %replace hash.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = '&amp; &amp;lt; &amp;gt; &lt; &gt; &amp;&lt; &amp;&amp;';
     print qq{'$s'};
     ;;
     my %replace = ( amp => '&', lt => '<', gt => '>', );
     my ($find) = map qr{ & ($_) ; }xms, join '|', keys %replace ;
     $s =~ s{$find}{$replace{$1}}xmsg;
     print qq{'$s'};
    "
    '&amp; &amp;lt; &amp;gt; &lt; &gt; &amp;&lt; &amp;&amp;'
    '& &lt; &gt; < > &< &&'
    (But I'm not sure how you would translate this to Java! (Maybe ask on a Java site?))
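For the JavaScript that the original question asks about, the same idea can be sketched roughly as follows. The names ENTITY_TABLE, ENTITY_PATTERN and decodeEntitiesOnce are hypothetical; the only API relied on is String.prototype.replace with a function as the replacement.

```javascript
// Hypothetical lookup table: entity name -> literal character.
const ENTITY_TABLE = { amp: '&', lt: '<', gt: '>' };

// Build /&(amp|lt|gt);/g from the table keys, so adding a mapping only
// means adding an entry to ENTITY_TABLE.
const ENTITY_PATTERN = new RegExp(
  '&(' + Object.keys(ENTITY_TABLE).join('|') + ');',
  'g'
);

// A single pass over the input: text produced by one replacement is never
// rescanned, so '&amp;lt;' stops at '&lt;'.
function decodeEntitiesOnce(text) {
  return text.replace(ENTITY_PATTERN, (match, name) => ENTITY_TABLE[name]);
}

console.log(decodeEntitiesOnce('&amp; &amp;lt; &amp;gt; &lt; &gt; &amp;&lt; &amp;&amp;'));
// prints: & &lt; &gt; < > &< &&
```

The keys here contain no regex metacharacters; if a key with special characters were ever added, it would need escaping before being joined into the pattern.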

    Update: Here's a variation that has the advantage of avoiding what I think of as 'hidden' capture groups in qr// objects: captures that can confuse group counting when these objects are used to compose larger regexes. In the following, all captures are at the 'top' level in the m// match, so counting's easy.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = '&amp; &amp;lt; &amp;gt; &lt; &gt; &amp;&lt; &amp;&amp;';
     print qq{'$s'};
     ;;
     my %replace = ( amp => '&', lt => '<', gt => '>', );
     my ($entity) = map qr{ $_ }xms, join ' | ', keys %replace ;
     print $entity;
     ;;
     $s =~ s{ & ($entity) ; }{$replace{$1}}xmsg;
     print qq{'$s'};
    "
    '&amp; &amp;lt; &amp;gt; &lt; &gt; &amp;&lt; &amp;&amp;'
    (?^msx: lt | gt | amp )
    '& &lt; &gt; < > &< &&'

Re: Avoid recursive replacement using regex
by Loops (Curate) on Nov 25, 2014 at 12:39 UTC

    Sure, just translate this into JavaScript:

    my %k = qw(l < g > & &);
    print "$_ to ".s/(&)amp;|&(l|g)t;/$k{$+}/reg.$/ for qw( &amp; &amp;lt; &amp;gt; &lt; &gt; );
    Prints:
    &amp; to &
    &amp;lt; to &lt;
    &amp;gt; to &gt;
    &lt; to <
    &gt; to >
    (sorry) :P
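Taken at face value, a rough JavaScript rendering of that one-liner might look like the sketch below. JavaScript has no direct equivalent of Perl's $+ (the last group that matched), so this version assumes it is acceptable to pick whichever of the two capture groups is defined. CHAR_FOR and decodeWithAlternation are hypothetical names.

```javascript
// Hypothetical table keyed by what the regex captures:
// '&' from '&amp;', 'l' from '&lt;', 'g' from '&gt;'.
const CHAR_FOR = { '&': '&', l: '<', g: '>' };

// Exactly one of the two groups participates in any match; the other is
// undefined, so (amp || lg) stands in for Perl's $+.
function decodeWithAlternation(text) {
  return text.replace(/(&)amp;|&(l|g)t;/g, (match, amp, lg) => CHAR_FOR[amp || lg]);
}

for (const sample of ['&amp;', '&amp;lt;', '&amp;gt;', '&lt;', '&gt;']) {
  console.log(sample + ' to ' + decodeWithAlternation(sample));
}
// prints:
// &amp; to &
// &amp;lt; to &lt;
// &amp;gt; to &gt;
// &lt; to <
// &gt; to >
```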