
Yes, you can totally do that in Perl, and with much better performance. Here's the gist of how it can be done:
use strict;
use warnings;

my ( %masked, @results );

while ( my $line = <STDIN> ) {
    my ($ip) = $line =~ / (\d+ \. \d+ \. \d+ \. \d+) /x or next;
    my $mask = pack 'C3', split /\./, $ip;
    if ( $masked{$mask} ) {
        if ( not $masked{$mask}{repeat} ) {
            $masked{$mask}{ip} =~ s{ \d+ \z }{0/24}x;
            $masked{$mask}{repeat} = 1;
        }
    }
    else {
        $masked{$mask} = { ip => $ip, };
        push @results, $masked{$mask};
    }
}

print $_->{ip}, "\n" for @results;
output:
1.2.3.0/24
1.4.3.5
2.3.1.2
2.3.2.0/24
This way you also don't need to depend on the ordering of the source file (it works with any order). As for dealing with comments and different CIDRs, I'll "leave it as an exercise for the reader" :)

Re^3: perl quicker than bash?
by Anonymous Monk on Jan 08, 2015 at 02:35 UTC
    Okay, that was a different anonymonk explaining pack and recommending Modern Perl, but now it's me again :) I had nothing better to do and decided to actually write this thing for you, seeing your enthusiasm (to give you a taste of Perl). Here it goes. The program accepts command line options '-i' and '-o', meaning input and output. Otherwise it operates on stdin and stdout. Usage:
    $ perl squeeze_ips.pl -i input.list -o output.list
    It preserves biggest found cidr and the first encountered comment (changing it to the last found comment is easy enough). Processing a file of one million ips takes about 15 seconds on my laptop.
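For anyone curious how the '-i' / '-o' handling with a stdin/stdout fallback might look, here is a minimal sketch using the core Getopt::Long module. It is not the code of squeeze_ips.pl; the helper name open_io and the error messages are made up for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not from squeeze_ips.pl): parse -i/-o from the
# given argument list and return (input handle, output handle).
sub open_io {
    my @args = @_;
    my ( $in_file, $out_file );

    GetOptionsFromArray( \@args, 'i=s' => \$in_file, 'o=s' => \$out_file )
        or die "usage: $0 [-i input] [-o output]\n";

    # Default to the standard streams when an option is absent.
    my ( $in, $out ) = ( \*STDIN, \*STDOUT );

    if ( defined $in_file ) {
        open my $fh, '<', $in_file or die "cannot read $in_file: $!\n";
        $in = $fh;
    }
    if ( defined $out_file ) {
        open my $fh, '>', $out_file or die "cannot write $out_file: $!\n";
        $out = $fh;
    }
    return ( $in, $out );
}
```

A script would then call my ( $in, $out ) = open_io(@ARGV); and read from $in and print to $out, whichever source they ended up pointing at.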
      elsif ( $old_cidr > $new_cidr ) {
      should be
      elsif ( $new_cidr and $old_cidr > $new_cidr ) {
      come to think of it.
      It preserves biggest found cidr
      s/biggest/smallest/ :)
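To show why the extra "$new_cidr and" guard probably matters, here is a small sketch of the "keep the smallest CIDR" rule as a standalone function. It assumes a line without a CIDR leaves the new value undefined; the function name keep_smallest_cidr and the handling of a missing old value are my guesses, not taken from the actual program.

```perl
use strict;
use warnings;

# Hypothetical helper: given the stored prefix length and a newly seen one,
# return the one to keep (the numerically smaller prefix length).
sub keep_smallest_cidr {
    my ( $old_cidr, $new_cidr ) = @_;

    # No prefix on the new line: keep what is stored. Without this guard,
    # undef would compare as 0 and look "smaller" than any real prefix.
    return $old_cidr unless $new_cidr;

    # Nothing stored yet (assumption): take the new prefix.
    return $new_cidr unless $old_cidr;

    return $old_cidr > $new_cidr ? $new_cidr : $old_cidr;
}

print keep_smallest_cidr( 24, 16 ), "\n";       # 16
print keep_smallest_cidr( 24, undef ), "\n";    # 24
```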

        Awesome! Thanks very much for your hard work! :)

        I'll give it a go. This all helps me to learn the syntax.

        Yes - it's hard to tell one Anon Monk from the next - you all look alike to me! (haha)

        :)

Re^3: perl quicker than bash?
by TiffanyButterfly (Novice) on Jan 07, 2015 at 19:56 UTC

    Wow! That is so much shorter than what I wrote... lol

    I've been reading it through and re-reading... parts of the code I can understand but others are completely new to me (pack, push, my). The bits that I can understand also show how I really could have written that bash script better. :)

    I'll keep researching the bits that I don't understand and let the learning begin!

    Thank you for your great solution!

      my declares a lexical variable scoped to the current block, and push appends values to the end of an array. pack can get a little complicated: in this case it's turning the first three octets of the IP address into a three-byte string, so e.g. pack "C3", split /\./, "80.114.108.33" returns the bytes/chars "Prl" (the last octet is ignored because the template is "C3" and not "C4").
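If it helps, the pack behaviour described above can be tried in isolation. This is just a small demo; the variable names are arbitrary and the unpack step is only there to show that the key holds the first three octets.

```perl
use strict;
use warnings;

# Key built from the first three octets; the fourth is dropped
# because the template is "C3" rather than "C4".
my $key = pack 'C3', split /\./, '80.114.108.33';
print "$key\n";    # the bytes 80, 114, 108, i.e. "Prl"

# Two addresses in the same /24 give the same key, which is what
# lets a hash group them together.
my $same = pack( 'C3', split /\./, '1.2.3.4' ) eq pack( 'C3', split /\./, '1.2.3.99' );
print $same ? "same /24\n" : "different /24\n";

# unpack turns the key back into numbers; push appends the final 0.
my @octets = unpack 'C3', $key;
push @octets, 0;
print join( '.', @octets ), "/24\n";    # 80.114.108.0/24
```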

      In addition to the general documentation such as perlsyn and perldata (Syntax and Data Types), to understand that code perldsc (Data Structures) will probably be useful; an introduction to the references used to create those data structures is in perlreftut. And a regular expression tutorial is at perlretut.

      See also Modern Perl 2014 (free online edition), Learning Perl, and some of the links on this site: Getting Started with Perl

        Thanks Anonymous Monk, your instructions are well received! Time to grab a coffee and start reading...

        btw: updated my script with an idea from your script. Reduced execution time by 1 second (post has been updated) :)