While new to PerlMonks, I'm not new to Perl, having used even bigperl on Win95. Yet I still find regular expressions hard to understand sometimes. Below is a case that I would like some input about:

my $content1='&#39;&#22 x';
$content1=~s/\&(\#\d*[^;\d]+)/\&amp;$1/gs;
print '1:'.$content1."\n";
my $content2='&#39;&#22 x';
$content2=~s/\&(\#\d*[^;]+)/\&amp;$1/gs;
print '2:'.$content2."\n";

Result:
1:&#39;&amp;#22 x
2:&amp;#39;&amp;#22 x

I first wrote the content2 code, expecting \d* to be greedy, and was surprised to find it was not greedy enough. My fix is the content1 code, and I am happy that it works and that I figured it out. But I still can't understand why the \d* in content2 did not capture all of the '39', instead leaving the '9' to be matched as NOT ';'. Can someone enlighten me, please?
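
To make the difference visible, here is a minimal sketch that records what each pattern actually matched. The helper name escape_and_trace is made up for this post, and the third pattern, using the possessive quantifier \d*+ (available since perl 5.10), is an extra added only for comparison; it is not part of my original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): apply one pattern,
# print every span it matched, and return the escaped string.
sub escape_and_trace {
    my ($label, $re, $text) = @_;
    my @matched;
    # /e runs the replacement as code, so each match can be recorded
    # before the "&amp;" + captured text is substituted in.
    (my $out = $text) =~ s/$re/push @matched, $&; "&amp;$1"/ge;
    print "$label matched: ", join(' | ', map { "[$_]" } @matched), "\n";
    print "$label result:  $out\n";
    return $out;
}

my $input = '&#39;&#22 x';

# content2: [^;]+ is allowed to match a digit
my $r2 = escape_and_trace('content2',   qr/&(#\d*[^;]+)/,   $input);
# content1: [^;\d]+ is not allowed to match a digit
my $r1 = escape_and_trace('content1',   qr/&(#\d*[^;\d]+)/, $input);
# comparison only: possessive \d*+ never gives digits back
my $r3 = escape_and_trace('possessive', qr/&(#\d*+[^;]+)/,  $input);
```

On this input, content2 reports two matches, [&#39] and [&#22 x], while content1 and the possessive variant report only [&#22 x]. As far as I can tell, that suggests \d* does take '39' first, but when [^;]+ then fails on the ';', the engine gives the '9' back so that the overall match can still succeed.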

This is perl 5, version 20, subversion 1 (v5.20.1) built for MSWin32-x86-multi-thread-64int
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2014, Larry Wall

Binary build 2000 [298557] provided by ActiveState http://www.ActiveState.com
Built Oct 15 2014 22:10:49

Now in all reality the &#22 shouldn't be there; this is from text that has been HTML-encoded by the Yahoo Groups API and returned within JSON content. The user did expect to see &#22 on the page, and for some reason Yahoo did not encode the & to &amp;. I'm just trying to work around it.

Thank you for any explanations


In reply to RE greedyness by huck
