There are several options that may help solve your problem. For a start, if it is the same file every 15 minutes, you can record (possibly in a configuration file) how far you got last time and continue from that point on the next run - no searching required at all!
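A minimal sketch of that "remember where you got to" idea, assuming the file only ever grows by whole lines being appended. The sub name processNewLines, the state file and the handler callback are all hypothetical names made up for illustration; only open, seek, tell and -s are standard Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: hand every line added to $dataFile since the last run
# to $handler, using a byte offset saved in $stateFile between runs.
sub processNewLines {
    my ($dataFile, $stateFile, $handler) = @_;

    # Read the offset saved by the previous run (0 if there was none)
    my $offset = 0;
    if (open my $state, '<', $stateFile) {
        my $saved = <$state>;
        close $state;
        $offset = $1 if defined $saved && $saved =~ /^(\d+)/;
    }

    open my $in, '<', $dataFile or die "Can't open $dataFile: $!";

    # Assumption: a file shorter than the saved offset has been replaced,
    # so start again from the beginning
    my $size = -s $in;
    $offset = 0 if $offset > $size;
    seek $in, $offset, 0 or die "Can't seek in $dataFile: $!";

    while (my $line = <$in>) {
        chomp $line;
        $handler->($line);
    }

    # Remember where this run stopped
    $offset = tell $in;
    close $in;

    open my $state, '>', $stateFile or die "Can't write $stateFile: $!";
    print $state "$offset\n";
    close $state;

    return $offset;
}
```

You would call it as something like processNewLines('payments.txt', 'payments.offset', sub { print "$_[0]\n" }), with file names of your own choosing. If the file is rewritten from scratch each time rather than appended to, this approach doesn't apply and the hash lookup is the way to go.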
The standard fix for your immediate problem is to store your payment numbers as the keys of a hash, then use the hash's very fast, effectively constant-time lookup (that's what hashes do when you give them a key and ask for a value) for your match check. Consider:
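use strict;
use warnings;

# Stand-in for the file of payment numbers that have already been processed
my $old = <<OLD;
UTRIR8709990166
ZPHLHLKJ87
OLD

# Load the known payment numbers as hash keys (blank lines are skipped)
my %oldPayments;
open my $payments, '<', \$old or die "Can't read old payments: $!";
%oldPayments = map {$_ => undef} grep {chomp; length} <$payments>;
close $payments;

print "Reading UTR Payment numbers \n";
while (<DATA>) {
    chomp;
    my @data = split /~/, $_;
    (my $utr = uc $data[1]) =~ s/\s+//g;    # normalise: upper case, no spaces
    next if exists $oldPayments{$utr};      # seen before, so skip it

    # New payments get a defined value so they can be picked out below
    $oldPayments{$utr} = $data[1];
    print "Payment $utr received of $data[2]\n";
}

# Write the updated list of payment numbers back, one per line
open $payments, '>', \$old or die "Can't write old payments: $!";
print $payments "$_\n" for sort keys %oldPayments;
close $payments;

print "New payments are:\n ";
print join "\n ", grep {defined $oldPayments{$_}} sort keys %oldPayments;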
__DATA__
0906928472847292INR~UTRIR8709990166~ 700000~INR~20080623~RC425484~IFSCSEND001 ~Remiter Details ~1000007 ~TEST RTGS TRF7 ~ ~ ~ ~RTGS~REVOSN OIL CORPORATION ~IOCL ~09065010889~0906501088900122INR~ 7~ 1~ 1
0906472983472834HJR~UTRIN9080980866~ 1222706~INR~20080623~NI209960~AMEX0888888 ~FRAGNOS EXPRESS - TRS CARD S DIVISI~4578962 ~/BNF/9822644928 ~ ~ ~ ~NEFT~REVOSN OIL CORPORATION ~IO CL ~09065010889~0906501088900122INR~ 7~ 1~ 1
0906568946748922INR~ZP HLHLKJ87 ~ 1437865.95~INR~20080623~NI209969~HSBC0560002 ~MOTOSPECT UNILEVER LIMITED ~1234567 ~/INFO/ATTN: ~//REF 1104210 PLEASE FIND THE DET ~ ~ ~NEFT~REVOSN OIL CORPORATION ~IOCL ~09065010889~0906501088900122INR~ 7~ 1~ 1
0906506749056822INR~Q08709798905745~ 5960.74~INR~20080623~NI209987~ ~SDV AIR LINK REVOS LIMITED ~458ss453 ~ ~ ~ ~ ~NEFT~REVOSN OIL CORPORATION ~IOCL ~09065010889~0906501088900122INR~ 7~ 1~ 1
0906503389054302INR~UTRI790898U0166~ 2414~INR~20080623~NI209976~ ~FRAGNOS EXPRESS - TRS CARD S DIVISI~ ~/BNF/9826805798 ~ ~ ~ ~NEFT~REVOSN OIL CORPORATION ~IOCL ~09065010889~0906501088900122INR~ 7~ 1~ 1
Prints:
Reading UTR Payment numbers
Payment UTRIN9080980866 received of 1222706
Payment Q08709798905745 received of 5960.74
Payment UTRI790898U0166 received of 2414
New payments are:
Q08709798905745
UTRI790898U0166
UTRIN9080980866
I've used a variable as a file here to avoid needing a disk-based file for the example; in practice you would of course use a real file on disk.
However, if your data set gets very large (millions of entries perhaps) you should seriously consider using a database instead of a flat file if at all possible.
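As one small step in that direction, here is a sketch that keeps the set of seen payment numbers in an on-disk DBM file by tying a hash to SDBM_File, which ships with Perl. This is not a full database, just the same hash lookup backed by a file instead of memory. The sub name recordNewPayments and the $dbPath argument are hypothetical:

```perl
use strict;
use warnings;
use Fcntl;        # for O_RDWR and O_CREAT
use SDBM_File;    # core module

# Hypothetical helper: record each payment number in the DBM file at $dbPath
# and return only those that had not been seen on any earlier call.
sub recordNewPayments {
    my ($dbPath, @utrs) = @_;

    # The tied hash reads and writes the DBM file instead of memory
    tie my %seen, 'SDBM_File', $dbPath, O_RDWR | O_CREAT, 0666
        or die "Can't tie $dbPath: $!";

    my @new;
    for my $utr (@utrs) {
        next if exists $seen{$utr};
        $seen{$utr} = 1;
        push @new, $utr;
    }

    untie %seen;
    return @new;
}
```

The lookup code is the same exists test as in the in-memory version; only the storage changes. For a real relational database you would more likely go through DBI, which I haven't shown here.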
Perl reduces RSI - it saves typing