Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

perl -nle 'print if (!/([0-9]?\d\d?|2[0-4]\d|25[0-5])\.([0-9]?\d\d?|2[0-4]\d|25[0-5])\.([0-9]?\d\d?|2[0-4]\d|25[0-5])\.([0-9]?\d\d?|2[0-4]\d|25[0-5])/);' filename

Hi, I am trying to tweak this code on Unix so that it masks the numbers in an IP address with x's instead of removing the line completely. For example, 194.66.82.11 should come out as xxx.xx.xx.11. How would I do this?

Re: Filtering out IP addresses
by atcroft (Abbot) on Jul 22, 2014 at 22:14 UTC

    This appears to do what you expect, although it replaces all octets but the last with '###' (rather than based on the number of digits in the octet):

    s/(([0-2]?\d{1,2}\.){3})([0-2]?\d{1,2})/###.###.###.$3/g;

    Code to test regex:

    $ perl -le 'foreach my $c ( q{123.45.67.89 123.45.67.98}, qw/ 0.0.0.0 10.1.2.3 172.001.002.003 192.168.128.1 255.255.255.255 / ) { my $d = $c; $d =~ s/(([0-2]?\d{1,2}\.){3})([0-2]?\d{1,2})/###.###.###.$3/g; print $c, q{ -> }, $d; }'

    Test output:

    123.45.67.89 123.45.67.98 -> ###.###.###.89 ###.###.###.98
    0.0.0.0 -> ###.###.###.0
    10.1.2.3 -> ###.###.###.3
    172.001.002.003 -> ###.###.###.003
    192.168.128.1 -> ###.###.###.1
    255.255.255.255 -> ###.###.###.255

    Hope that helps.

Re: Filtering out IP addresses
by 1s44c (Scribe) on Jul 22, 2014 at 23:27 UTC

    It might be fun to write regular expressions but it's also reinventing the wheel. Regular expressions for IPs can be found in Regexp::Common.

    perl -MRegexp::Common='net' -n -e 'm/$RE{net}{IPv4}{-keep}/ && print "xxx.xx.xx.$5\n"'
      Great idea. Let's generalize to respect the length of each field and make the substitutions in the original text.
      use strict;
      use warnings;
      use Regexp::Common qw /net/;
      my $string ='194.66.82.11';
      $string =~ s/$RE{net}{IPv4}{-keep}
                  /'x'x(length $2) . '.' . 'x'x(length $3) . '.' . 'x'x(length $4) . '.' . $5
                  /xge;
      print $string;
      Bill

        Great idea. Let's generalize to respect the length of each field and make the substitutions in the original text.

        Not matching the original field lengths may actually be a feature here - you'd be providing extra information if you did, which may or may not be what you want.
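To make the trade-off concrete, here is a minimal sketch comparing the two masking styles. The helper names are made up for illustration, and the pattern is a simplified stand-in for $RE{net}{IPv4} that does not range-check the octets, so treat it as an assumption rather than a drop-in replacement.

```perl
use strict;
use warnings;

# Simplified dotted-quad pattern (assumption: no 0-255 range check).
# The lookarounds keep it from matching inside a longer run of digits.
my $ipv4 = qr/(?<!\d)(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?!\d)/;

# Hypothetical helper: one 'x' per masked digit, so the digit counts leak.
sub mask_keep_length {
    my ($text) = @_;
    $text =~ s{$ipv4}{
        join '.', 'x' x length($1), 'x' x length($2), 'x' x length($3), $4
    }ge;
    return $text;
}

# Hypothetical helper: always three x's, so the field lengths stay hidden.
sub mask_fixed {
    my ($text) = @_;
    $text =~ s/$ipv4/xxx.xxx.xxx.$4/g;
    return $text;
}

print mask_keep_length('194.66.82.11'), "\n";   # xxx.xx.xx.11
print mask_fixed('194.66.82.11'), "\n";         # xxx.xxx.xxx.11
```

With the first form a reader can still tell that the second octet had two digits; with the second form only the last octet survives.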

Re: Filtering out IP addresses
by AppleFritter (Vicar) on Jul 22, 2014 at 21:33 UTC

    This'll do it:

    $ perl -nle 's/([0-9]?\d\d?|2[0-4]\d|25[0-5])\.([0-9]?\d\d?|2[0-4]\d|25[0-5])\.([0-9]?\d\d?|2[0-4]\d|25[0-5])\.([0-9]?\d\d?|2[0-4]\d|25[0-5])/xxx.xxx.xxx.$4/; print' filename

    Depending on what your file contains, this can be simplified a lot, though. If it's e.g. one IP address per line, the following will also be fine:

    $ perl -nle 's/\d+\./xxx./g; print' filename

    But in the presence of other data, it may or may not work as intended, and also note it neither cares about IP addresses having four fields, nor about each field ranging from 0 to 255, or in fact consisting of no more than three decimal digits.
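A small sketch of what "may or may not work as intended" looks like in practice. The helper name and the sample lines are invented for illustration; the substitution itself is the simple one shown above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the simple substitution.
sub mask_simple {
    my ($line) = @_;
    $line =~ s/\d+\./xxx./g;   # any run of digits followed by a dot
    return $line;
}

print mask_simple('10.0.0.1'), "\n";               # xxx.xxx.xxx.1
print mask_simple('perl 5.14 on 10.0.0.1'), "\n";  # perl xxx.14 on xxx.xxx.xxx.1
print mask_simple('Logged in 2014.'), "\n";        # Logged in xxx.
```

The version number and the year at the end of a sentence get masked along with the address, which is why the stricter pattern is the safer choice for mixed input.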

    When in doubt, I'd suggest using the first one, even if it's a bit unwieldy.

      perl -nle 's/\d+\./xxx./g; print' filename
      As long as we're writing one-liners, that should probably be
      perl -ple 's/\d+\./xxx./g' filename
      See -p in perlrun.
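For anyone who has not met -p yet: it wraps the code in a read loop and prints $_ after every iteration, while -l chomps each input line and puts the newline back on output. Roughly spelled out as a script, the one-liner behaves like the sketch below. The sub name and the explicit filehandles are assumptions made so the loop can be tried on an in-memory file; the real switches read from the files named on the command line or from standard input.

```perl
use strict;
use warnings;

# Hypothetical helper: approximately what `perl -ple 's/\d+\./xxx./g'` does,
# written against explicit input and output handles.
sub mask_stream {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;                 # -l: strip the line ending
        $line =~ s/\d+\./xxx./g;     # the one-liner's body
        print {$out} "$line\n";      # -p plus -l: print with a newline
    }
    return;
}

# Try it on in-memory filehandles.
my $input  = "194.66.82.11\n10.0.0.1\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$output or die "cannot open output: $!";
mask_stream($in, $out);
close $out;
print $output;
```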

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        Oh, nice, I'd not been aware of -p yet. Thanks for the pointer!
      Thank you, that was a really helpful answer. We need people like you who take time out of their day to help a stranger. Perl Monks is the best!
        Awww shucks, sugarcube. *tips hat* You're very welcome!
Re: Filtering out IP addresses
by AnomalousMonk (Archbishop) on Jul 22, 2014 at 23:26 UTC

    Another approach. Identifies 'real' decimal octet IP addresses. Matches 'x' replacements to actual octet digits if that's really what you want to do.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use Regexp::Common qw(net);
     ;;
     for my $s (
       '1.2.3.4', '1.22.111.222', 'a1.2.3.4a b1.22.111.221b',
       '999.9.9.9', '9.9.9.999',
       ) {
       printf qq{'$s' -> };
       (my $t = $s) =~
           s{ (?<! \d) ($RE{net}{IPv4}) (?! \d) }
            { (my $ip = $1) =~ s{ \d (?= \d* [.]) }{x}xmsg; $ip; }xmsge;
       print qq{'$t'};
       }
    "
    '1.2.3.4' -> 'x.x.x.4'
    '1.22.111.222' -> 'x.xx.xxx.222'
    'a1.2.3.4a b1.22.111.221b' -> 'ax.x.x.4a bx.xx.xxx.221b'
    '999.9.9.9' -> '999.9.9.9'
    '9.9.9.999' -> '9.9.9.999'

    Note that if you're using Perl 5.14+, the /r substitution modifier makes the expression a bit simpler.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use 5.014;
     ;;
     use Regexp::Common qw(net);
     ;;
     for my $s (
       '1.2.3.4', '1.22.111.222', 'a111.22.3.4a b1.22.111.221b',
       '999.9.9.9', '9.9.9.999',
       ) {
       printf qq{'$s' -> };
       my $t = $s =~
           s{ (?<! \d) ($RE{net}{IPv4}) (?! \d) }
            { $1 =~ s{ \d (?= \d* [.]) }{x}xmsgr }xmsger;
       print qq{'$t'};
       }
    "
    '1.2.3.4' -> 'x.x.x.4'
    '1.22.111.222' -> 'x.xx.xxx.222'
    'a111.22.3.4a b1.22.111.221b' -> 'axxx.xx.x.4a bx.xx.xxx.221b'
    '999.9.9.9' -> '999.9.9.9'
    '9.9.9.999' -> '9.9.9.999'
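Stripped of the IP matching, the difference between the two versions comes down to how the masked copy is produced. A minimal sketch, with hypothetical helper names and an input that is assumed to already be a single dotted quad:

```perl
use strict;
use warnings;
use 5.014;   # /r needs Perl 5.14 or later

# Copy-then-modify idiom: works on any Perl 5.
sub mask_octets_copy {
    my ($ip) = @_;
    (my $copy = $ip) =~ s/\d(?=\d*\.)/x/g;   # mask digits that precede a dot
    return $copy;
}

# With /r the substitution returns the new string and leaves $ip untouched.
sub mask_octets_r {
    my ($ip) = @_;
    return $ip =~ s/\d(?=\d*\.)/x/gr;
}

say mask_octets_copy('1.22.111.222');   # x.xx.xxx.222
say mask_octets_r('1.22.111.222');      # x.xx.xxx.222
```

The lookahead `(?=\d*\.)` only succeeds for digits that still have a dot somewhere after the rest of their octet, which is why the final octet is left alone.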

Re: Filtering out IP addresses
by kennethk (Abbot) on Jul 22, 2014 at 22:15 UTC
    In general, I agree with AppleFritter's comment; a regular expression is developed according to the environment in which it must operate. With regard to the code you've posted, I wouldn't think
    ([0-9]?\d\d?|2[0-4]\d|25[0-5])
    would be a good way to describe an octet of an IP address, because [0-9] is (barring Unicode) equivalent to \d. Your first alternative therefore accepts any one to three digits and swallows the two alternatives that follow it. You probably want something more like
    (1?\d\d?|2[0-4]\d|25[0-5])
    but this still allows 00, which you may or may not care about.
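A quick way to see the difference is to anchor each octet pattern and try a few values. This is only a sketch: the variable and sub names are made up, and the anchors (^ and $) are added here so that a single octet is tested in isolation.

```perl
use strict;
use warnings;

# Octet pattern from the original post, anchored for testing.
my $original = qr/^(?:[0-9]?\d\d?|2[0-4]\d|25[0-5])$/;
# The tightened alternative suggested above, anchored the same way.
my $stricter = qr/^(?:1?\d\d?|2[0-4]\d|25[0-5])$/;

# Hypothetical helper: returns 1 or 0 for a given pattern and octet.
sub octet_ok {
    my ($re, $octet) = @_;
    return $octet =~ $re ? 1 : 0;
}

for my $octet (qw(0 00 99 199 255 256 999)) {
    printf "%-3s original=%d stricter=%d\n",
        $octet, octet_ok($original, $octet), octet_ok($stricter, $octet);
}
```

The original pattern accepts 256 and 999; the tightened one rejects both while still accepting 00.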

    So how about

    perl -nle 'print if s/((1?\d\d?|2[0-4]\d|25[0-5])\.){3}(?=\d+)/xxx.xxx.xxx./g'
    It does not match the number of x's to the number of digits, as your spec asks, but if you are scrubbing addresses you should probably scrub that information too.


Re: Filtering out IP addresses
by wjw (Priest) on Jul 23, 2014 at 08:34 UTC

    This seems to work as well, though it is obviously not a one-liner.

    #!/usr/bin/perl
    use Modern::Perl qw(2014);

    while (<DATA>) {
        my @octets = split('\.',$_);
        chomp @octets;
        for (0..2) {
            $octets[$_] =~ s/\d/x/g;
        }
        say join(".", @octets);
    }
    __DATA__
    1.2.3.4
    192.168.0.1
    255.255.255.128
    23.65.98.101

    Outputs:

    x.x.x.4
    xxx.xxx.x.1
    xxx.xxx.xxx.128
    xx.xx.xx.101

    I suppose for further obfuscation one could just stick 3 x's in each of the first three octets and append the last octet of the IP to that string, thus hiding the format...

    ...just a thought...

    ...the majority is always wrong, and always the last to know about it...

    Insanity: Doing the same thing over and over again and expecting different results...

    A solution is nothing more than a clearly stated problem...otherwise, the problem is not a problem, it is a fact