huck has asked for the wisdom of the Perl Monks concerning the following question:
While new to PerlMonks, I'm not new to Perl, having used even bigperl on Win95. Yet I still find regular expressions hard to understand sometimes. Below is a case that I would like some input about:
    my $content1='&#39;&#22 x';
    $content1=~s/\&(\#\d*[^;\d]+)/\&amp;$1/gs;
    print '1:'.$content1."\n";

    my $content2='&#39;&#22 x';
    $content2=~s/\&(\#\d*[^;]+)/\&amp;$1/gs;
    print '2:'.$content2."\n";

Result:

    1:&#39;&amp;#22 x
    2:&amp;#39;&amp;#22 x
I first wrote the content2 code, expecting \d* to be greedy, and was surprised to find it was not greedy enough. My fix is the content1 code, and I am happy that it works and that I figured it out. But I still can't understand why the \d* in content2 was not greedy enough to capture all of the '39', instead leaving the '9' to be matched as NOT ';'. Can someone enlighten me, please?
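To make the behaviour concrete, here is a minimal sketch that prints what each pattern captures. It assumes the test string is `&#39;&#22 x`; the `captures` helper is hypothetical and exists only for this illustration. What it suggests is backtracking: `\d*` does take `39` first, but `[^;]+` must match at least one character and the next one is `;`, so the engine makes `\d*` give back the `9` to let the overall match succeed.

```perl
use strict;
use warnings;

# Assumed test string: a well-formed entity followed by a malformed one.
my $input = '&#39;&#22 x';

# Hypothetical helper: collect every $1 the pattern captures in $text.
sub captures {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, $1;
    }
    return @found;
}

# content2 pattern: [^;]+ may consume a digit that \d* gives back.
my @loose  = captures(qr/&(#\d*[^;]+)/,   $input);

# content1 pattern: [^;\d]+ cannot consume digits, so backtracking
# into \d* never helps and '&#39;' is left alone.
my @strict = captures(qr/&(#\d*[^;\d]+)/, $input);

print "loose:  [$_]\n" for @loose;    # [#39] then [#22 x]
print "strict: [$_]\n" for @strict;   # [#22 x]
```

Greedy here means "take as much as possible while still allowing the whole pattern to match", not "never give anything back".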
    This is perl 5, version 20, subversion 1 (v5.20.1) built for MSWin32-x86-multi-thread-64int
    (with 1 registered patch, see perl -V for more detail)
    Copyright 1987-2014, Larry Wall
    Binary build 2000 [298557] provided by ActiveState http://www.ActiveState.com
    Built Oct 15 2014 22:10:49
Now in all reality the &#22 shouldn't be there. This is from text that has been HTML-encoded by the Yahoo Groups API and returned within JSON content. The user did expect to see &#22 on the page, and for some reason Yahoo did not encode the & to &amp;. I'm just trying to work around it.
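One possible alternative for the workaround, offered only as a sketch: escape an `&` unless it starts a complete numeric entity, using a negative lookahead. The sample input is an assumption, and this version would also re-escape named entities such as `&amp;` that are already present, so it only fits data known to contain numeric entities.

```perl
use strict;
use warnings;

# Assumed sample input: one well-formed entity, one bare '&#22'.
my $text = '&#39;&#22 x';

# Escape '&' only when it is NOT followed by '#', digits, and ';'.
(my $fixed = $text) =~ s/&(?!#\d+;)/&amp;/g;

print "$fixed\n";   # &#39;&amp;#22 x
```

Because the lookahead requires at least one digit and the closing `;`, there is no shorter alternative for the engine to backtrack into, which avoids the surprise with `\d*`.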
Thank you for any explanations
Replies are listed 'Best First'.

Re: RE greediness
by Athanasius (Archbishop) on Nov 03, 2016 at 09:02 UTC

Re: RE greedyness
by Corion (Patriarch) on Nov 03, 2016 at 08:56 UTC
by huck (Prior) on Nov 03, 2016 at 09:22 UTC