lmocsi has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I'd like to shorten every decimal number in a file, keeping its structure. How can I do that?
use strict;
my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, 47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, 47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }';
while ($content =~ /(\d\d\.\d*)/g){
my $num = substr($1,0,9);
# number is shortened, but how do I use it?
}
Re: In-place regex substitution
by toolic (Bishop) on Dec 09, 2016 at 21:39 UTC
If you want to change the string in place, you can use s///ge:
use warnings;
use strict;
my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, 47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, 47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }';
$content =~ s/(\d\d\.\d*)/substr($1,0,9)/ge;
print "$content\n";
__END__
{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804, 47.485785 ], [ 19.057857, 47.487322 ], [ 19.060255, 47.488491 ], [ 19.060347, 47.488539 ], [ 19.060463, 47.484578 ], [ 19.054804, 47.485785 ] ] ] } }
c:\@Work\Perl\monks>perl -wMstrict -le
"use 5.010;
;;
my $content
= qq/{ \"geometry\": { \"type\": \"Polygon\", \"coordinates\": [ [ [ 19.054804912278406, \n/
. qq/47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, \n/
. qq/47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, \n/
. qq/47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } } \n/
;
print qq{[[$content]]};
;;
my $n = 7;
$content =~ s{ \d [.] \d{$n} \K \d+ }{}xmsg;
;;
print qq{<<$content>>};
"
[[{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406,
47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397,
47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543,
47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }
]]
<<{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.0548049,
47.4857855 ], [ 19.0578578, 47.4873225 ], [ 19.0602559,
47.4884915 ], [ 19.0603472, 47.4885396 ], [ 19.0604633,
47.4845782 ], [ 19.0548049, 47.4857855 ] ] ] } }
>>
Give a man a fish: <%-{-{-{-<
Re: In-place regex substitution
by Marshall (Canon) on Dec 10, 2016 at 07:54 UTC
To demonstrate one idea from LanX: with the /e (evaluate) modifier on the substitution, the replacement part is executed as Perl code, so arbitrary code can run there. I used that to call sprintf(). The code below rounds to 6 decimal places instead of truncating at 6 decimal places.
Example:
47.488539642204628
truncates to: 47.488539
rounds to: 47.488540
use warnings;
use strict;
my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, 47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, 47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }';
$content =~ s/(\d\d\.\d*)/sprintf("%.6f",$1);/ge; # round to 6 decimal digits
print "$content\n";
__END__
Truncated:
{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804, 47.485785 ], [ 19.057857, 47.487322 ], [ 19.060255, 47.488491 ], [ 19.060347, 47.488539 ], [ 19.060463, 47.484578 ], [ 19.054804, 47.485785 ] ] ] } }
Rounded:
{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054805, 47.485786 ], [ 19.057858, 47.487323 ], [ 19.060256, 47.488492 ], [ 19.060347, 47.488540 ], [ 19.060463, 47.484578 ], [ 19.054805, 47.485786 ] ] ] } }
$content =~ s/(\d\d\.\d*)/sprintf("%.6f",$1);/ge; # round to 6 decimal digits
One can avoid hard-coding the rounding precision by using either
my $n = $some_integer;
... sprintf("%.${n}f", $1) ...
or the "wildcard" (if that's the right term (update: maybe "placeholder"?)) format specifier
... sprintf('%.*f', $n, $1) ...
Update: In the code above, I've implied that $n must be an integer. Interestingly (for some definition of "interesting"), if it is not, then the first code example ($n interpolated) screws up, but the second (* wildcard specifier) behaves "properly", DWIMishly ignoring the fractional part of the numeric value. (I haven't checked yet to see if this behavior is the same as in C/C++/D/etc.)
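To make the difference described in the update easy to try, here is a minimal sketch that runs both forms side by side. The helper names round_interpolated and round_star are made up for this illustration and are not from the posts above. With warnings enabled, the interpolated form should complain about an invalid conversion when $n is not an integer.

```perl
use strict;
use warnings;

# Hypothetical helper: the precision is interpolated into the format string.
sub round_interpolated {
    my ($value, $n) = @_;
    return sprintf("%.${n}f", $value);
}

# Hypothetical helper: the precision is passed as an argument via the * specifier.
sub round_star {
    my ($value, $n) = @_;
    return sprintf('%.*f', $n, $value);
}

# Try an integer precision and a non-integer one.
for my $n (6, 2.7) {
    print "n=$n interpolated: ", round_interpolated(47.488539642204628, $n), "\n";
    print "n=$n star:         ", round_star(47.488539642204628, $n), "\n";
}
```

If $n can come from outside the program, wrapping it in int($n) before interpolating is one way to keep the first form safe.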
Give a man a fish: <%-{-{-{-<
Thanks, but a minor nitpick. :)
> s/(\d\d\.\d*)/
this will exclude numbers smaller than ten (already so in marto's post)
Supposing all numbers have a decimal point, I'd say s/(\d+\.\d*)/
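As a quick check of the widened pattern, here is a small sketch, assuming every number in the text has a decimal point. The helper name round_decimals is made up for this example; it combines the \d+\.\d* pattern with the sprintf('%.*f', ...) form mentioned earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: round every decimal number found in $text to $places digits.
# \d+ before the dot also catches numbers smaller than ten.
sub round_decimals {
    my ($text, $places) = @_;
    $text =~ s/(\d+\.\d*)/sprintf('%.*f', $places, $1)/ge;
    return $text;
}

print round_decimals('[ 9.123456789, 19.054804912278406 ]', 6), "\n";
# prints "[ 9.123457, 19.054805 ]"
```

A leading minus sign stays outside the match and is left in place, so negative coordinates keep their sign.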
It's really hard to guess which formats are possible and where they'll appear.
In the end the safest way is to use a JSON parser.
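A rough sketch of what the parser route could look like, assuming the core JSON::PP module is acceptable. The helper name round_numbers is made up for this example. Two caveats: the output is re-encoded, so the original whitespace and key order are not preserved and trailing zeros are dropped, and looks_like_number also treats numeric-looking strings such as "1.5" as numbers.

```perl
use strict;
use warnings;
use JSON::PP;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: walk a decoded JSON structure and round every number.
sub round_numbers {
    my ($data, $places) = @_;
    if (ref $data eq 'ARRAY') {
        return [ map { round_numbers($_, $places) } @$data ];
    }
    if (ref $data eq 'HASH') {
        my %copy;
        $copy{$_} = round_numbers($data->{$_}, $places) for keys %$data;
        return \%copy;
    }
    # Leave undef, booleans and any other references untouched.
    return $data if !defined $data || ref $data;
    # 0 + ... turns the sprintf result back into a number for the encoder.
    return looks_like_number($data) ? 0 + sprintf('%.*f', $places, $data) : $data;
}

my $json    = JSON::PP->new->canonical;
my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ] ] ] } }';
print $json->encode(round_numbers($json->decode($content), 6)), "\n";
```

Unlike the regex approaches, this only touches values the parser has identified as scalars, so digits inside keys are never rewritten.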