in reply to RE greedyness

The \d* is greedy, but the regex engine will backtrack as far as it needs to in order to make the whole regular expression match. To satisfy the [^;]+, which needs at least one character that is not a semicolon, the \d* has to give back at least one of the digits.
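A quick way to see the giving back is to capture the two parts separately. This is only a sketch: I am assuming your first pattern looked roughly like &(#\d*[^;]+), and the extra capture groups are added purely for the demonstration.

```perl
use strict;
use warnings;

# Assumed shape of the first attempt (not quoted in this thread),
# with extra groups so we can see what each part matched.
my $greedy = qr/&(#(\d*)([^;]+))/;

my $in = '&#22;';
my ($digits, $rest) = ('', '');
if ($in =~ $greedy) {
    # \d* first grabs "22", then [^;]+ fails on ";", so the engine
    # backtracks and \d* hands its last digit over to [^;]+.
    ($digits, $rest) = ($2, $3);
}
print "digits='$digits' rest='$rest'\n";   # digits='2' rest='2'
```

So a properly terminated &#22; still matches, which is why it was being rewritten as well.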

While trying to fix your first approach using possessive matching, it became clear to me that I don't really know what your end goal is. Is it to fix just &#22 to &amp;#22, leaving everything else alone? Then my approach below does that, by making the \d* never give anything back:

my $content1='&#22 x'; $content1=~s/\&(\#(?>\d*)[^;]+)/\&amp;$1/gs;
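To check the behaviour on a few inputs, here is a small test harness. The names fix_atomic and fix_lookahead are made up for this demo. fix_atomic wraps the substitution with the atomic group (?>\d*); fix_lookahead is an alternative I am only suggesting as a sketch, using a possessive \d++ and a negative lookahead, which needs Perl 5.10 or later and requires at least one digit.

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution with the atomic group (?>\d*).
sub fix_atomic {
    my ($s) = @_;
    $s =~ s/&(#(?>\d*)[^;]+)/&amp;$1/gs;
    return $s;
}

# Hypothetical alternative (an assumption, not something from this thread):
# possessive digits, then assert that no ";" follows.
sub fix_lookahead {
    my ($s) = @_;
    $s =~ s/&(#\d++)(?!;)/&amp;$1/g;
    return $s;
}

for my $in ('&#22 x', '&#22;', '&#22', '&#22 x &#33 y') {
    print "in:        $in\n";
    print "atomic:    ", fix_atomic($in), "\n";
    print "lookahead: ", fix_lookahead($in), "\n\n";
}
```

Both versions leave &#22; alone. Note that the atomic version needs at least one character after the digits, so a bare &#22 at the very end of the string is not touched, and because [^;]+ runs on to the next semicolon, a second unterminated entity inside that stretch is skipped. The lookahead variant handles both of those cases, if that is what you are after.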

If you wanted something else, I have misunderstood you - please explain with some more examples of input and output what should happen.

Re^2: RE greedyness
by huck (Prior) on Nov 03, 2016 at 09:22 UTC

    Thank you, I didn't understand that part about possessive matching and giving back.

    To better explain: the HTML &#dd; numeric entity should always terminate with the ; (as I understand it). I wanted to change any sequence of &#dd that did not terminate with the ; to &amp;#dd. My next step was to run $content through decode_entities from HTML::Entities and then decode from JSON; it seems decode_entities was OK with &#22, and then decode gave me

    JSON:invalid character encountered while parsing JSON string, at character offset 7279 (before "\x{22}