PerlMonks  

In-place regex substitution

by lmocsi (Novice)
on Dec 09, 2016 at 21:29 UTC ( [id://1177570]=perlquestion )

lmocsi has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'd like to shorten every decimal number in a file, keeping its structure. How can I do that?

use strict;

my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, 47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, 47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }';

while ($content =~ /(\d\d\.\d*)/g){
    my $num = substr($1,0,9);
    # number is shortened, but how do I use it?
}

Replies are listed 'Best First'.
Re: In-place regex substitution
by toolic (Bishop) on Dec 09, 2016 at 21:39 UTC
    If you want to change the string in place, you can use s///ge:
    use warnings;
    use strict;

    my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, 47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, 47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }';

    $content =~ s/(\d\d\.\d*)/substr($1,0,9)/ge;
    print "$content\n";

    __END__
    { "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804, 47.485785 ], [ 19.057857, 47.487322 ], [ 19.060255, 47.488491 ], [ 19.060347, 47.488539 ], [ 19.060463, 47.484578 ], [ 19.054804, 47.485785 ] ] ] } }

      Or without explicit capturing or evaluation (or rounding!), and (update) with dynamic control of the truncation threshold. This needs Perl version 5.10+ for the \K operator:

      c:\@Work\Perl\monks>perl -wMstrict -le
      "use 5.010;
       ;;
       my $content =
         qq/{ \"geometry\": { \"type\": \"Polygon\", \"coordinates\": [ [ [ 19.054804912278406, \n/ .
         qq/47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, \n/ .
         qq/47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, \n/ .
         qq/47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } } \n/
         ;
       print qq{[[$content]]};
       ;;
       my $n = 7;
       $content =~ s{ \d [.] \d{$n} \K \d+ }{}xmsg;
       ;;
       print qq{<<$content>>};
      "
      [[{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406,
      47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397,
      47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543,
      47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }
      ]]
      <<{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.0548049,
      47.4857855 ], [ 19.0578578, 47.4873225 ], [ 19.0602559,
      47.4884915 ], [ 19.0603472, 47.4885396 ], [ 19.0604633,
      47.4845782 ], [ 19.0548049, 47.4857855 ] ] ] } }
      >>


      Give a man a fish:  <%-{-{-{-<

      Great, thx! :)
        Are you sure that a number can't have more than 9 digits before the point?

        You might prefer sprintf instead of substr in order to just shorten the digits after the point.

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!
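To make the concern about the digits before the point concrete: `substr($1,0,9)` counts characters from the start of the match, so a number such as 123.4567891234 would keep only five decimals. A minimal sketch of a truncation that depends only on the digits after the point follows. The helper name `truncate_decimals` and the sample input are made up for illustration, and like `substr` it truncates rather than rounds.

```perl
use strict;
use warnings;

# Hypothetical helper: keep at most $places digits after the decimal point,
# however many digits precede it. Truncates, does not round.
sub truncate_decimals {
    my ($text, $places) = @_;
    # Capture the integer part, the point and the first $places decimals,
    # then match (and thereby drop) any digits that follow.
    $text =~ s/(\d+\.\d{$places})\d+/$1/g;
    return $text;
}

print truncate_decimals('[ 5.123456789, 19.054804912278406, 123.4567891234 ]', 6), "\n";
# prints: [ 5.123456, 19.054804, 123.456789 ]
```

Numbers that already have `$places` or fewer decimals are left untouched, because the trailing `\d+` then has nothing to match.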

Re: In-place regex substitution
by Marshall (Canon) on Dec 10, 2016 at 07:54 UTC
    To demo one idea from LanX: with the /e (evaluate) modifier on the substitution, the replacement part is evaluated as Perl code, so arbitrary code can be executed. I coded the idea of using sprintf(). The code below will round to 6 decimal places instead of truncating at 6 decimal places.

    Example:

    47.488539642204628
    truncates to: 47.488539
    rounds to: 47.488540

    use warnings;
    use strict;

    my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.057857836771483, 47.487322542030711 ], [ 19.06025597925397, 47.488491565765209 ], [ 19.060347248086835, 47.488539642204628 ], [ 19.060463310421543, 47.484578287406251 ], [ 19.054804912278406, 47.485785556135411 ] ] ] } }';

    $content =~ s/(\d\d\.\d*)/sprintf("%.6f",$1);/ge; #round to 6 decimal digits
    print "$content\n";

    __END__
    Truncated:
    { "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804, 47.485785 ], [ 19.057857, 47.487322 ], [ 19.060255, 47.488491 ], [ 19.060347, 47.488539 ], [ 19.060463, 47.484578 ], [ 19.054804, 47.485785 ] ] ] } }
    Rounded:
    { "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054805, 47.485786 ], [ 19.057858, 47.487323 ], [ 19.060256, 47.488492 ], [ 19.060347, 47.488540 ], [ 19.060463, 47.484578 ], [ 19.054805, 47.485786 ] ] ] } }
      $content =~ s/(\d\d\.\d*)/sprintf("%.6f",$1);/ge; #round to 6 decimal digits

      One can avoid hard-coding the rounding precision by using either
          my $n = $some_integer;
          ... sprintf("%.${n}f", $1) ...
      or the "wildcard" (if that's the right term (update: maybe "placeholder"?)) format specifier
          ... sprintf('%.*f', $n, $1) ...

      Update: In the code above, I've implied that  $n must be an integer. Interestingly (for some definition of "interesting"), if it is not, then the first code example ($n interpolated) screws up, but the second (* wildcard specifier) behaves "properly", DWIMishly ignoring the fractional part of the numeric value. (I haven't checked yet to see if this behavior is the same as in C/C++/D/etc.)
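The two ways of passing the precision can be put side by side in a small sketch. The helper names `round_to` and `round_to_interpolated` are hypothetical, and the integer check in the second one is an added safeguard, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: round $value to $places decimals, passing the
# precision through the '*' placeholder of the format specifier.
sub round_to {
    my ($value, $places) = @_;
    return sprintf('%.*f', $places, $value);
}

# Same idea with the precision interpolated into the format string.
# This only yields a valid format when $places is a plain non-negative
# integer, so (as an added assumption) that is checked first.
sub round_to_interpolated {
    my ($value, $places) = @_;
    die "precision must be a non-negative integer\n"
        unless $places =~ /\A\d+\z/;
    return sprintf("%.${places}f", $value);
}

print round_to(47.488539642204628, 6), "\n";              # prints: 47.488540
print round_to_interpolated(47.488539642204628, 6), "\n"; # prints: 47.488540
```

With an integer precision both forms give the same string. The check in the interpolated variant turns the "screws up" case into an explicit error instead of a malformed format.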


      Give a man a fish:  <%-{-{-{-<

      Thanks, but a minor nitpick. :)

      > s/(\d\d\.\d*)/

      this will exclude numbers smaller than ten (already so in marto's post)

      Supposing all numbers have a decimal point, I'd say s/(\d+\.\d*)/

      It's really hard to guess which formats are possible and where they'll appear.

      In the end the safest way is use a JSON parser.

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Je suis Charlie!
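For the JSON parser route, here is a minimal sketch using the core module JSON::PP. It assumes the layout from the original question, where `geometry.coordinates` is a nested array whose leaves are all numbers. The helper names `round_coords` and `shorten_geometry` are made up, and the sample string is a shortened version of the posted data.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: round every leaf of a nested array of numbers in place.
sub round_coords {
    my ($node, $places) = @_;
    for my $item (@$node) {          # $item aliases the array element
        if (ref $item eq 'ARRAY') {
            round_coords($item, $places);
        }
        else {
            # "0 +" turns the sprintf string back into a number,
            # so it is encoded without quotes.
            $item = 0 + sprintf('%.*f', $places, $item);
        }
    }
    return;
}

# Hypothetical helper: decode, round the coordinates, encode again.
sub shorten_geometry {
    my ($json_text, $places) = @_;
    my $json = JSON::PP->new->canonical;   # canonical: stable key order
    my $data = $json->decode($json_text);
    round_coords($data->{geometry}{coordinates}, $places);
    return $json->encode($data);
}

my $content = '{ "geometry": { "type": "Polygon", "coordinates": [ [ [ 19.054804912278406, 47.485785556135411 ], [ 19.06025597925397, 47.488491565765209 ] ] ] } }';
print shorten_geometry($content, 6), "\n";
```

Note the trade-off: the data is re-encoded, so the original whitespace is not preserved, and because the values are numbers again, trailing zeros are dropped (47.488540 would come out as 47.48854). If the exact text layout matters, one of the regex approaches above may be the better fit.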
