OK, thanks for posting some data. In future, please post your data inside <code></code> tags, just as you would code.
The following solution uses a few lightweight modules (File::Find::Rule, Path::Tiny, Test::More and Test::Differences) to do a lot of the work. Since you are testing scientific results, I thought it would be appropriate to use part of Perl's testing framework. Test::Differences compares two data structures and, if they are not identical, reports where they differ.
This script writes the test results (each test is named for the file it tests) to a composite log file, and also writes test failure diagnostics (a diff of the two files) to an individual log for each file. (The only thing I don't like is that you are left with zero-byte failure logs for the files that had no failures.)
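If the zero-byte failure logs bother you, one option is to clean them up as a separate step once the test script has finished. This is only a minimal sketch: `remove_empty_logs` is a hypothetical helper that is not part of the script in this post, and it assumes the logs follow the `test_failure.*.txt` naming used here and that the pattern contains no spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): delete every
# zero-byte regular file matching the glob pattern and return the number
# of files removed.
sub remove_empty_logs {
    my ($pattern) = @_;
    my @empty = grep { -f $_ && -z _ } glob $pattern;
    return unlink @empty;
}

# Assumes it is run in the directory where the failure logs were written.
my $removed = remove_empty_logs('test_failure.*.txt');
print "Removed $removed empty failure log(s)\n";
```

Running it separately, rather than at the end of the test script, avoids touching logs that Test::Builder may still have open.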
It assumes you want to strip the filenames as shown; change the regexp to suit. It also assumes a directory called 'Recip' for the reciprocal files and one called 'Lab' for the original blast files -- change to suit.
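To try the filename handling on its own before adapting the regexp, something like the following can help. It is a sketch under the assumptions stated above ('Recip' and 'Lab' directories, a `.Recip.blast.top` suffix); `lab_path_for` is a hypothetical name used only for illustration, and the suffix pattern here has its dots escaped and is anchored to the end of the name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map the path of a reciprocal file to the path of
# the matching lab file by swapping the directory and dropping the suffix.
sub lab_path_for {
    my ($rcp_file) = @_;
    ( my $org_file = $rcp_file ) =~ s/^Recip/Lab/;   # Recip/... -> Lab/...
    $org_file =~ s/\.Recip\.blast\.top\z//;          # drop the trailing suffix
    return $org_file;
}

print lab_path_for('Recip/do.re.mi.fa.so.la.ti.1.Recip.blast.top'), "\n";
# prints: Lab/do.re.mi.fa.so.la.ti.1
```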
<code>
#!/usr/bin/perl
use strict;
use warnings;

use File::Find::Rule;
use Path::Tiny qw/ path /;
use Test::More;
use Test::Differences;

# log of all tests
Test::More->builder->output( 'test_results.txt' );

# Get all the files we want to compare
my $rule = File::Find::Rule->new;
$rule->file->name('*.Recip.blast.top');
my @files = $rule->in( 'Recip' );

foreach my $rcp_file ( @files ) {

    # make a new path for the original (lab results) file and
    # strip the unwanted string from the end of the filename
    # (dots escaped and pattern anchored so only the literal suffix matches)
    ( my $org_file = $rcp_file ) =~ s/^Recip/Lab/;
    $org_file =~ s/\.Recip\.blast\.top$//;

    # designate an individual test failure log
    ( my $err_log = "test_failure.$org_file.txt" ) =~ s/Lab\///;
    Test::More->builder->failure_output( $err_log );

    # Get the content of the two files, extract the wanted strings
    # to be compared, and store in arrays
    my @rcp_lines = path( $rcp_file )->lines({ chomp => 1 });
    @rcp_lines = map { join(' ', (split '\|')[1,5]) } @rcp_lines;

    my @org_lines = path( $org_file )->lines({ chomp => 1 });
    @org_lines = map { join(' ', (split '\|')[5,1]) } @org_lines;

    # run the tests
    eq_or_diff( \@rcp_lines, \@org_lines, $org_file );
}

done_testing;
__END__
</code>
My data files for testing:
<code>
$ cat Recip/do.re.mi.fa.so.la.ti.1.Recip.blast.top
gi|110123922|gb|EC817325.1|EC817325 gi|110095377|gb|EC788780.1|EC788780
gi|110123921|gb|EC817324.1|EC817324 gi|110105430|gb|EC798833.1|EC798833 6
gi|110123920|gb|EC817323.1|EC817323 gi|110106464|gb|EC799867.1|EC799867
</code>
<code>
$ cat Recip/do.re.mi.fa.so.la.ti.2.Recip.blast.top
gi|110123922|gb|EC817325.1|EC817325 gi|110095377|gb|EC788780.1|EC788780
gi|110123921|gb|EC817324.1|EC817324 gi|110105430|gb|EC798833.1|EC798833 6
gi|110123920|gb|EC817323.1|EC817323 gi|110106464|gb|EC799867.1|EC799867
</code>
<code>
$ cat Lab/do.re.mi.fa.so.la.ti.1
gi|110095377|gb|EC788780.1|EC788780 gi|110123922|gb|EC817325.1|EC817325
gi|110105430|gb|EC798833.1|EC798833 6 gi|110123921|gb|EC817324.1|EC817324
gi|110106464|gb|EC799867.1|EC799867 gi|110123920|gb|EC817323.1|EC817323
</code>
<code>
$ cat Lab/do.re.mi.fa.so.la.ti.2
gi|110095377|gb|EC788780.1|EC788780 gi|110123922|gb|EC817325.1|EC817325
gi|210105430|gb|EC798833.1|EC798833 6 gi|110123921|gb|EC817324.1|EC817324
gi|110106464|gb|EC799867.1|EC799867 gi|110123920|gb|EC817323.1|EC817323
</code>
And the output:
<code>
$ perl 1141288.pl
$
</code>
<code>
$ cat test_results.txt
ok 1 - Lab/do.re.mi.fa.so.la.ti.1
not ok 2 - Lab/do.re.mi.fa.so.la.ti.2
1..2
</code>
<code>
$ cat test_failure.do.re.mi.fa.so.la.ti.2.txt
#   Failed test 'Lab/do.re.mi.fa.so.la.ti.2'
#   at 1141288.pl line 40.
# +----+--------------------------+--------------------------+
# | Elt|Got                       |Expected                  |
# +----+--------------------------+--------------------------+
# |   0|[                         |[                         |
# |   1|  '110123922 110095377',  |  '110123922 110095377',  |
# *   2|  '110123921 110105430',  |  '110123921 210105430',  *
# |   3|  '110123920 110106464'   |  '110123920 110106464'   |
# |   4|]                         |]                         |
# +----+--------------------------+--------------------------+
# Looks like you failed 1 test of 2.
</code>
Hope this helps!
In reply to Re: How to do a reciprocal matching statement by 1nickt, in thread How to do a reciprocal matching statement by ajl412860