great_riyaz has asked for the wisdom of the Perl Monks concerning the following question:

Hi Guys,
I am a QA trying to use Perl to achieve the following:
I have 2 files. Both contain test names (each name contains .tst), plus some more lines with other details for each test. I want to find the test name entries common to both files and store them in an array to be used later.
I can do this on Unix using grep -Fx, but it fails on Windows because Hamilton grep does not have this option. I want to do this in Perl so that it runs everywhere.
code snippet:
$file1 = "./data/list1.txt";
$file2 = "./data/list2.txt";
@common_lines = `grep -Fx -f $file1 $file2 | grep .tst`;
I need help with perl code to fill in @common_lines.

contents of list1 could be like :
test1.tst
not_run
test3.tst
time:2 sec
test6.tst
time:2 sec

contents of list2 could be like :
test2.tst
time:3 sec
test3.tst
time:6 sec
test1.tst
time:5 sec
...up to 100000 entries.
so then @common_lines should contain {test1.tst, test3.tst, ...}
Any help / suggestion is appreciated!

Replies are listed 'Best First'.
Re: Extract common lines from 2 files
by jwkrahn (Abbot) on May 10, 2012 at 08:09 UTC
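    my @common_lines = do {
        local @ARGV = ( 'list1.txt', 'list2.txt' );
        my %data;
        while ( <> ) {
            next unless /\.tst$/;
            if ( @ARGV ) {
                # still reading the first file: set bit 1
                $data{ $_ } |= 1;
            }
            else {
                # reading the last file: set bit 2
                $data{ $_ } |= 2;
            }
        }
        # both bits set means the line was seen in both files
        grep $data{ $_ } == 3, keys %data;
    };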
Re: Extract common lines from 2 files
by Marshall (Canon) on May 10, 2012 at 10:05 UTC
    A bit different....
    #!/usr/bin/perl -w
    use strict;

    my $file1=<<END;
    test1.tst
    not_run
    test3.tst
    time:2 sec
    test6.tst
    time:2 sec
    test10009.tst
    timeL 39 sec
    END

    my $file2=<<END;
    test2.tst
    time:3 sec
    test3.tst
    time:6 sec
    test1.tst
    time:5 sec
    END

    open FILE1, '<', \$file1 or die "cannot open file1 $!";
    open FILE2, '<', \$file2 or die "cannot open file2 $!";

    my %seenFile1;
    while (<FILE1>)
    {
       my ($file_name) = $_ =~ /^\s*(test\d+\.tst)\s*/;
       $seenFile1{$file_name} = 1 if $file_name;
    }
    close FILE1;

    my @common;
    while (<FILE2>)
    {
       my ($file_name) = $_ =~ /^\s*(test\d+\.tst)\s*/;
       #as OP wants, save common names for other uses....
       #
       push @common, $file_name if $file_name and $seenFile1{$file_name};
    }
    close FILE2;

    #one use is sort
    #
    foreach (sort {my ($Anum) = $a =~ /(\d+)/;
                   my ($Bnum) = $b =~ /(\d+)/;
                   $Anum <=> $Bnum
                  }@common )
    {
       print "$_\n";
    }
    __END__
    test1.tst
    test3.tst
Re: Extract common lines from 2 files
by great_riyaz (Initiate) on May 11, 2012 at 09:38 UTC
    Solution by jwkrahn worked for me.
    Thanks to everyone for the replies!
Re: Extract common lines from 2 files
by Anonymous Monk on May 10, 2012 at 20:16 UTC

        I had a very similar question the other day. You can find my complete code on the last post of this thread.

        http://www.perlmonks.org/?node_id=968493

        The CPAN module I used is here (though there is also a way to do it with grep):

        http://search.cpan.org/dist/Array-Utils/Utils.pm

        Basically, you need to open each file and read each one into its own array, then cycle through them and find the matches.
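
A minimal sketch of that read-then-match approach, using only core Perl. The helper names read_test_names and common_names are made up for this illustration, and the in-memory strings stand in for list1.txt and list2.txt so the snippet runs anywhere. It assumes each test name sits on its own line ending in .tst, as in the sample data. Instead of a nested loop over both arrays, it uses a hash lookup, which should scale better to the 100000 entries mentioned in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines ending in .tst from a file.
# $source may be a file path or a reference to a string (in-memory file).
sub read_test_names {
    my ($source) = @_;
    open my $fh, '<', $source or die "Cannot open $source: $!";
    my @names;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;                     # strip LF or CRLF endings
        push @names, $line if $line =~ /\.tst\z/;
    }
    close $fh;
    return @names;
}

# Hypothetical helper: names present in both lists, without duplicates,
# in the order they appear in the second list.
sub common_names {
    my ( $first, $second ) = @_;                  # two array references
    my %in_first = map { $_ => 1 } @$first;
    my %emitted;
    return grep { $in_first{$_} && !$emitted{$_}++ } @$second;
}

# Sample data taken from the question; replace the string references
# with real paths such as './data/list1.txt' to read actual files.
my $list1 = "test1.tst\nnot_run\ntest3.tst\ntime:2 sec\ntest6.tst\ntime:2 sec\n";
my $list2 = "test2.tst\ntime:3 sec\ntest3.tst\ntime:6 sec\ntest1.tst\ntime:5 sec\n";

my @first        = read_test_names( \$list1 );
my @second       = read_test_names( \$list2 );
my @common_lines = common_names( \@first, \@second );

print "$_\n" for @common_lines;
```

With the sample data this prints test3.tst and then test1.tst, following the order of the second list. Unlike the backtick version in the question, the entries in @common_lines carry no trailing newline.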