in reply to Re^2: Dealing with large files in Perl
in thread Dealing with large files in Perl
But now that you have provided more information about your data -- that the value you want to match is the first token on each data line, and that it is a long hex number -- you can make the search both faster and more trustworthy by using "substr" and "eq" instead of a regex match:
    use strict;
    use warnings;

    my $Usage = "Usage: $0 value file1 file2\n";
    die $Usage unless ( @ARGV == 3 and -f $ARGV[1] and -f $ARGV[2] );

    my $value  = shift;             # removes first element from @ARGV
    my $chklen = length( $value );
    my @match;                      # will hold matching line from each file

    for my $file ( @ARGV ) {        # loop over remaining two ARG's
        # three-arg open with a lexical handle, so odd file names are safe
        open( my $in, '<', $file ) or die "$file: $!";
        while (<$in>) {
            # compare only the leading $chklen characters of the line
            if ( substr( $_, 0, $chklen ) eq $value ) {
                chomp;
                push @match, $_;
                last;               # stop at the first match in this file
            }
        }
        close $in;                  # (this was implicit in the earlier version)
    }
    print join( " ", @match ), "\n";
Note that in either version, if the value you provide on the command line turns out to be shorter than the initial hex number on each line of the input files, there's a chance that you'll get a "false alarm" match.
For example, in the initial regex version, if the search value on the command line was just "6b" or "00", this could explain why the record from the second file was not right -- "6b" and "00" are found in both records.
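If you want to rule out that kind of false alarm, one option is to compare the whole first token rather than just a prefix. The following is only a sketch: it assumes the tokens on each line are separated by whitespace, and the helper name "first_token_is" and the sample line are made up for illustration, not taken from your data.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the first whitespace-delimited
# token of $line is exactly $value, not merely something that starts with it.
sub first_token_is {
    my ( $line, $value ) = @_;
    my ($token) = split ' ', $line;    # split ' ' skips leading whitespace
    return defined $token && $token eq $value;
}

# Made-up sample line, for illustration only.
my $line = "6b00a1ff some other fields\n";

print first_token_is( $line, '6b' )       ? "match\n" : "no match\n";
print first_token_is( $line, '6b00a1ff' ) ? "match\n" : "no match\n";
```

With the sample line above, the short value "6b" no longer matches, while the full token does. The trade-off is that split does a bit more work per line than substr, so this may be somewhat slower on very large files.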