Hi Perl monks, I am trying to compare two files with similar data. For each line in File 1, I want to check whether it appears in File 2. If it does not, I want to print the line from File 1 with an additional column containing 0; if it does, I want to print the matching line from File 2.
E.g.
File 1:
133-1452_chromosomal_replication_initiation_protein_
1457-2557_DNA_polymerase_III_subunit_beta_
2579-3670_recombination_protein_F_
3687-6104_DNA_gyrase_subunit_B_
c8268-7159_aspartate-semialdehyde_dehydrogenase_
c8692-8471_prophage_p2_ogr_protein_
File 2:
c8268-7159_aspartate-semialdehyde_dehydrogenase_ 33
c8692-8471_prophage_p2_ogr_protein_ 574
1457-2557_DNA_polymerase_III_subunit_beta_ 123
Output file:
133-1452_chromosomal_replication_initiation_protein_ 0
1457-2557_DNA_polymerase_III_subunit_beta_ 123
2579-3670_recombination_protein_F_ 0
3687-6104_DNA_gyrase_subunit_B_ 0
c8268-7159_aspartate-semialdehyde_dehydrogenase_ 33
c8692-8471_prophage_p2_ogr_protein_ 574
The files are not sorted in any way, so the lines are not consecutive in either file. I have tried adapting a number of bits of code that I have found on the web, but none of these are working properly. So far I have this:
#!/usr/bin/perl
use strict;
use warnings;

open (OUT,">outputfile.txt");
open my $fh1, '<', 'file1.txt';
open my $fh2, '<', 'file2.txt';

while( defined( my $line1 = <$fh1> ) and defined( my $line2 = <$fh2> ) ){
    chomp $line1;
    chomp $line2;
    my $string = $line2;
    $string =~ m{^*\t};
    print $string."\n";
    if( $line1 eq $string ){
        print OUT $line2."\n";
    }else{
        print OUT $line1."\n";
    }
}

close $fh1;
close $fh2;
close OUT;
But this just prints out a list of lines from the second file. Any help would be appreciated!
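For what it's worth, the loop above reads the two files in lockstep, so line N of File 1 is only ever compared with line N of File 2, and the bare match on $string does not change it, which is why the print shows unmodified File 2 lines. One possible way around this, offered as a minimal sketch rather than a definitive answer, is to load File 2 into a hash keyed on the first column and then walk File 1 in order. The sub names merge_lines and read_lines are made up for illustration, and the sketch assumes the key is everything before the first whitespace and that a single space before the added 0 is acceptable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: for each entry in @$keys, return the matching line
# from @$scored if there is one, otherwise the key with " 0" appended.
sub merge_lines {
    my ($keys, $scored) = @_;
    my %line_for;
    for my $line (@$scored) {
        # Assumption: the key is the text before the first whitespace.
        my ($key) = split ' ', $line;
        next unless defined $key;    # skip blank lines
        $line_for{$key} = $line;
    }
    my @out;
    for my $line (@$keys) {
        my ($key) = split ' ', $line;    # also trims stray whitespace
        next unless defined $key;
        push @out, exists $line_for{$key} ? $line_for{$key} : "$key 0";
    }
    return @out;
}

# Hypothetical helper: read a file into a list of chomped lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return @lines;
}

# Only touch the file system when file names are given on the command line.
if (@ARGV) {
    die "Usage: $0 file1.txt file2.txt outputfile.txt\n" unless @ARGV == 3;
    my ($file1, $file2, $outfile) = @ARGV;
    my @file1_lines = read_lines($file1);
    my @file2_lines = read_lines($file2);
    open my $out, '>', $outfile or die "Cannot open $outfile: $!";
    print {$out} "$_\n" for merge_lines(\@file1_lines, \@file2_lines);
    close $out;
}
```

Run as, say, perl merge.pl file1.txt file2.txt outputfile.txt (merge.pl is just a placeholder script name). The output follows the order of File 1, and neither file needs to be sorted because the lookup goes through the hash.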
In reply to comparing 2 files by garyboyd