Hello Perl monks, I need guidance on improving the performance of a Perl script. Please see the script below:
use strict;
use warnings;

open(OUTNOMATCH, ">nomatch.out") or die "Couldn't write to file $!";
open(OUTMATCH, ">match.out") or die "Couldn't write to file $!";

sub match_internal { #Separate internal/external addresses
    use Net::IP::Match::Regexp qw( create_iprange_regexp match_ip );
    my $my_ip = $_[0];
    my $regexp = create_iprange_regexp(
        qw( 192.168.0.0/16 10.10.0.0/16 192.3.3.0/23 192.168.24.0/21 10.0.0.0/8 )
    );
    if (match_ip($my_ip, $regexp)) {
        print OUTMATCH "$my_ip\n";
    }
    else {
        print OUTNOMATCH "$my_ip\n";
    }
}

sub uniq_ip { # locate and remove duplicate addresses
    my @list = @_;
    my @uniq_ip = keys %{{ map { $_ => 1 } @list }};
}

sub sortme { # sort all addresses
    my @array = @_;
    my %hashTemp = map { $_ => 1 } @array;
    my @array_out = sort keys %hashTemp;
}

sub main_loop { #main loop that performs all logic and munging
    my @item;
    my @uniq_out;
    while (<>) {
        (my $field1, my $field2) = split /DST=/, $_;
        if ($field2 =~ m/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) {
            push (@item, $1);
        }
    }
    @uniq_out = uniq_ip(@item);
    foreach (@uniq_out) {
        match_internal($_);
    }
}

main_loop();
Here is a sample of the data that I am feeding it:
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.3.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10523 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.3.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10525 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.4.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10527 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.43.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10529 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.43.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10531 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.43.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10533 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
Nov 17 11:09:25 proxy02 kernel: OUTPUT LOGIN= OUT=eth0 SRC=11.11.11.0 DST=192.168.43.1 LEN=1420 TOS=0x00 PREC=0x00 TTL=64 ID=10535 DF PROTO=TCP SPT=3128 DPT=1921 WINDOW=16659 RES=0x00 ACK URGP=0
I am munging log files of around 5.5 GB in size and getting out-of-memory errors. Any advice on the script logic would be great. Thanks.
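One direction that may help, offered as a minimal sketch rather than a tested fix: the script above pushes every matched address onto @item before removing duplicates, so memory grows with the number of log lines. Tracking addresses in a hash while reading keeps only one entry per distinct address. The helper name unique_dst_addresses is hypothetical, and the sketch assumes the address always follows "DST=" directly, as it does in the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): read log lines
# from a filehandle and return the distinct DST addresses in first-seen
# order. Memory use grows with the number of distinct addresses rather
# than with the number of log lines.
sub unique_dst_addresses {
    my ($fh) = @_;
    my %seen;      # one key per distinct address
    my @unique;    # distinct addresses, in the order first encountered
    while ( my $line = <$fh> ) {
        # Assumption: the dotted quad directly follows "DST=".
        next unless $line =~ /DST=(\d{1,3}(?:\.\d{1,3}){3})/;
        push @unique, $1 unless $seen{$1}++;
    }
    return @unique;
}
```

Called as unique_dst_addresses(\*STDIN), this would read a log piped in on standard input, and each returned address could then be passed to match_internal as before. It would probably also help to build the range regexp once, outside match_internal, instead of on every call.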

In reply to Out of memory inefficient code? by tuxtutorials
