in reply to Match a list of numbers from a list of files

First, you can read an entire transaction at a time by setting the input record separator ($/, see perlvar). Second, you can use a regular expression with alternation to check, in one fell swoop, whether a transaction contains any of the numbers.

use strict;
use warnings;

open my $numberFH, q{<}, \ <<EOD or die qq{open: $!\n};
123456789
567898760
154216722
763498126
EOD

chomp( my @numbers = <$numberFH> );
close $numberFH or die qq{close: $!\n};

my $rxFindTrans = do{ local $" = q{|}; qr{(@numbers)}; };

open my $transFH, q{<}, \ <<EOD or die qq{open: $!\n};
some rubbish lines
<BEGIN Transaction>
blurfl 154876543
<END Transaction>
more rubbish
<BEGIN Transaction>
the one we want 154216722 with more stuff
<END Transaction>
<BEGIN Transaction>
blargh 54211548
<END Transaction>
EOD

{
    local $/ = qq{<END Transaction>\n};
    while( <$transFH> ) {
        s{.*(?=<BEGIN Transaction>)}{}s;
        next unless m{$rxFindTrans};
        print qq{Found $1 in:\n$_};
        print qq{==================\n};
    }
}

close $transFH or die qq{close: $!\n};

The output.

Found 154216722 in:
<BEGIN Transaction>
the one we want 154216722 with more stuff
<END Transaction>
==================
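For anyone wondering how the alternation is put together, here is a minimal sketch. Interpolating an array into a string or pattern joins its elements with $" (the list separator), so localising $" to a pipe yields an alternation. The second pattern, built with join and wrapped in \b anchors, is my own addition and an assumption about what you might want: it stops a number from matching inside a longer run of digits. The sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Sample numbers, taken from the example above.
my @numbers = qw{ 123456789 567898760 154216722 763498126 };

# Interpolating @numbers joins the elements with $", so setting $" to
# a pipe turns the list into an alternation, as in the original code.
my $rxInterpolated = do { local $" = q{|}; qr{(@numbers)} };

# The same alternation built explicitly with join. The \b anchors are
# an addition (not in the original): they reject a number that is
# embedded in a longer run of digits.
my $alternation = join q{|}, map { quotemeta($_) } @numbers;
my $rxBounded   = qr{\b($alternation)\b};

# Hypothetical test strings: the second hides 154216722 inside a
# longer number.
for my $text ( q{id 154216722 ok}, q{id 91542167220 no} ) {
    my $loose  = $text =~ $rxInterpolated ? q{match} : q{no match};
    my $strict = $text =~ $rxBounded      ? q{match} : q{no match};
    print qq{$text: interpolated $loose, bounded $strict\n};
}
```

Whether the stricter pattern is wanted depends on the data; if the numbers always stand alone in the transaction, the original pattern is enough.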

I hope this is helpful.

Cheers,

JohnGG

Re^2: Match a list of numbers from a list of files
by shawshankred (Sexton) on Dec 18, 2008 at 00:49 UTC
    Thanks a lot JohnGG, I will try this out and let you know. Appreciate your help.
Re^2: Match a list of numbers from a list of files
by shawshankred (Sexton) on Dec 18, 2008 at 20:36 UTC
    I am not sure I am loading the numberFH correctly. I need to get that from a file and also the transaction files. Here is my code, not sure if this is right.
    open IN1,"cat $DATA_DIR/$INFILE_NAME |" or die "Can't open $INFILE_NAME: $!\n";
    while ($line1 = <IN1>) {
        chomp(my @numbers = $line1);
    }
    close(IN1);

    my $rxFindTrans = do{ local $" = q{|}; qr{(@numbers)}; };

    open IN2,"cat $TransLogs/$FILE_NAME |" or die "Can't open $FILE_NAME: $!\n";
    local $/ = qq{</DU>\n};
    while( <IN2> ) {
        s{.*(?=<DU>)}{}s;
        next unless m{$rxFindTrans};
        print qq{Found $1 in:\n$_};
        print qq{==================\n};
    }
    close(IN2);

      There's no need to pipe cat into your filehandles as you can open files directly; the three-argument form with lexical filehandles is recommended practice.

      Instead of

      open IN1,"cat $DATA_DIR/$INFILE_NAME |" or die "Can't open $INFILE_NAME: $!\n";

      do

      open my $in1FH, q{<}, qq{$DATA_DIR/$INFILE_NAME}
          or die qq{Can't open $INFILE_NAME: $!\n};

      As you suspected, you are not reading the numbers file correctly. From your original post it looks like you have 5000 or so numbers in a file, one per line. If you assign the readline into an array rather than a scalar then the whole file is read into the array, one line per element. Furthermore, chomping an array will remove the line terminator from every element in the array. You could read the file line by line in a while loop instead if you like but then you would have to push each line onto the array. These two bits of code are equivalent.

      Using a loop

      my @numbers = ();
      while( <$in1FH> ) {
          chomp;
          push @numbers, $_;
      }

      Reading directly into an array.

      chomp( my @numbers = <$in1FH> );
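      To see the equivalence for yourself, here is a small self-contained sketch. It reads from an in-memory filehandle (a made-up stand-in for your numbers file) so it runs without any files on disk; the variable names are mine. Note also why the loop you posted did not work: my @numbers was declared inside the while block, so it held only the current line and went out of scope when the loop ended.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the numbers file, one number per line.
my $data = qq{123456789\n567898760\n154216722\n};

# Line by line: chomp each line and push it onto an array that is
# declared OUTSIDE the loop so it survives after the loop ends.
open my $loopFH, q{<}, \ $data or die qq{open: $!\n};
my @viaLoop;
while ( my $line = <$loopFH> ) {
    chomp $line;
    push @viaLoop, $line;
}
close $loopFH or die qq{close: $!\n};

# All at once: readline in list context returns every line, and
# chomp on an array strips the terminator from each element.
open my $listFH, q{<}, \ $data or die qq{open: $!\n};
chomp( my @viaList = <$listFH> );
close $listFH or die qq{close: $!\n};

print qq{Loop read: @viaLoop\n};
print qq{List read: @viaList\n};
```

      Both print statements should show the same three numbers.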

      You have removed the bare code block around the reading of the second file. It was there so that the local $/ ... really was localised to that scope to avoid possible side effects later in your script. Since you have a lot of files to read you could perhaps do something like

      my @filesToRead = ( populate this list somehow );
      ...
      foreach my $file ( @filesToRead ) {
          open my $in2FH, q{<}, $file
              or die qq{Can't open $file: $!\n};
          local $/ = qq{</DU>\n};
          while( <$in2FH> ) {
              ...
          }
          close $in2FH
              or die qq{Can't close $file: $!\n};
      }
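      The point about scoping can be checked with a short sketch. The data and the <END> marker below are made up for illustration. Inside the bare block the same data is read as two records; once the block is left, $/ reverts to a newline and the data is read as four lines.

```perl
use strict;
use warnings;

# Hypothetical data: two records, each terminated by an <END> line.
my $data = qq{a\n<END>\nb\n<END>\n};

my @records;
{
    # The change to $/ lasts only until the end of this block.
    local $/ = qq{<END>\n};
    open my $recFH, q{<}, \ $data or die qq{open: $!\n};
    @records = <$recFH>;
    close $recFH or die qq{close: $!\n};
}

# Outside the block $/ is back to its default, so reads are by line.
open my $lineFH, q{<}, \ $data or die qq{open: $!\n};
my @lines = <$lineFH>;
close $lineFH or die qq{close: $!\n};

# Prints "2 records, 4 lines"
printf qq{%d records, %d lines\n}, scalar @records, scalar @lines;
```

      The body of a foreach loop is a scope of its own as well, which is why the local in the loop above does not leak into the rest of the script.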

      I hope this is helpful.

      Cheers,

      JohnGG

        This worked. Thanks again for all your help. This script is saving me a lot of time.