in reply to Should this be so slow?
Yes, it should be slow; you're reading 180 MB files 3000 or so times. Is there some ordering relationship you can exploit so that you only have to read each 180 MB file once? My guess is that the files are ordered by date+time. If that's the case, you could do something like this (obviously untested):
I'm making some gross assumptions about the formats of the date and time, though. If they're not in a format that's directly comparable, you'll need to convert them to a form that is (possibly using something like Date::Parse).

    #!/usr/bin/perl
    use warnings;
    use strict;

    $\ = "\n";    # append a newline to every print

    # Field indices below are 0-based (the original used "$[ = 1",
    # which modern Perls no longer support, so each index is shifted
    # down by one).
    my ($last_xdrfile, $xline) = ('');

    open CDR_DETAILS, '<', 'cdr_details.1'
        or die "couldn't open cdr_details.1: $!\n";
    open CDR_TDM, '>', 'cdr_tdm.csv'
        or die "couldn't write cdr_tdm.csv: $!\n";

    while (<CDR_DETAILS>) {
        chomp;
        my ($cdrdate, $cdrtime, $cdrcli1, $cdrcli2, $xdrfile)
            = (split /,/)[0, 1, 2, 6, 7];
        $cdrtime += 0;    # force numeric context; assumes the XDR
                          # files store times the same way

        # Progress/debug output
        print "Record number: $.";
        print "CDRDATE: $cdrdate";
        print "CDRTIME: $cdrtime";
        print "CLI1: $cdrcli1";
        print "CLI2: $cdrcli2";

        # Only (re)open the XDR file when its name changes; reopening
        # the same handle implicitly closes the previous file.
        if ($last_xdrfile ne $xdrfile) {
            print "$xdrfile != $last_xdrfile, opening file!";
            open XDR, '<', $xdrfile or die "couldn't open $xdrfile: $!\n";
            chomp($xline = <XDR>);
        }
        $last_xdrfile = $xdrfile;

        # Walk the XDR file forward until its records pass the current
        # CDR record; $xline is held over between outer iterations so
        # no line is lost or re-read.
        while (defined $xline) {
            my ($xdrdate, $xdrtime, $xdrcli1, $xdrcli2)
                = (split /,/, $xline)[1, 2, 4, 6];
            last if $xdrdate gt $cdrdate;
            last if $xdrdate eq $cdrdate && $xdrtime gt $cdrtime;
            next unless $xdrdate eq $cdrdate
                     && $xdrtime eq $cdrtime
                     && $xdrcli1 eq $cdrcli1
                     && $xdrcli2 eq $cdrcli2;
            print CDR_TDM $xline;
        }
        continue {
            chomp($xline = <XDR>);
        }
    }

    close CDR_TDM;
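If the date and time fields aren't directly comparable as strings, one option is to normalize them to epoch seconds up front. Here's a minimal sketch using Date::Parse's str2time; the sample field values and the HHMMSS time layout are assumptions, so adjust the parsing to your actual data:

    use strict;
    use warnings;
    use Date::Parse qw(str2time);

    # Hypothetical field values as they might come out of one CSV
    # record; the actual formats in your files may differ.
    my $cdrdate = '2007-03-15';
    my $cdrtime = '142530';    # assumed HHMMSS

    # Split the packed time and hand Date::Parse one combined string;
    # str2time returns epoch seconds (or undef on failure), which
    # compare correctly with plain numeric <, ==, >.
    my ($h, $m, $s) = $cdrtime =~ /^(\d\d)(\d\d)(\d\d)$/
        or die "unexpected time format: $cdrtime\n";
    my $epoch = str2time("$cdrdate $h:$m:$s")
        // die "unparseable date/time: $cdrdate $cdrtime\n";
    print "epoch: $epoch\n";

Once both sides are epoch seconds, the string comparisons (gt/eq) in the main loop become plain numeric ones.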
Rather than read the entire 180 MB file into memory, my version reads it a line at a time, assuming both this file and the one generating the "searches" are in date+time order. There are a couple of gyrations to get the XDR file to read properly: it is only (re)opened when the filename changes from one CDR record to the next, and the last line read is kept in $xline across iterations of the outer loop (the continue block does the read-ahead), so no XDR line is lost or read twice.
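The whole approach stands or falls on both files really being sorted by date+time, so if that's in doubt, a one-pass check is cheap compared to the merge itself. This is only a sketch: the field positions (date and time at 0-based indices 1 and 2) are assumptions to adjust to your layout:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Verify a CSV file is ordered by its date and time columns.
    my $file = shift or die "usage: $0 file.csv\n";
    open my $fh, '<', $file or die "couldn't open $file: $!\n";

    my $prev = '';
    while (<$fh>) {
        chomp;
        my ($date, $time) = (split /,/)[1, 2];    # assumed field positions
        my $key = "$date $time";
        die "out of order at line $.: '$key' after '$prev'\n" if $key lt $prev;
        $prev = $key;
    }
    print "$file is sorted by date+time\n";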