in reply to match two files

The script below produces the same output as your example. Run it as follows (assuming the script is stored as z):
perl z tmp01 tmp02

The code is spread over many lines for better readability.
#!/usr/bin/env perl

use strict;
use warnings;

my $seen;

while ( <> ) {
    next unless /^\d/;
    s/\s*$//;

    next unless m/
        #           [file1]     [file2]
        ^
        (\S+)     # PeptideID   PeptideID
        \s+
        (\S+)     # ProteinID   SpectrumID
        (?:
            \s+
            (\S+) # -----       Sequence
        )?
        $
    /x;

    if ( $3 ) {
        $seen->{$1}->{SpectrumID} = $2;
        $seen->{$1}->{Sequence} = $3;
    } else {
        push @{ $seen->{$1}->{ProteinID} }, $2;
    }
}

sub frmt {
    print join("\t", @_) . "\n";
}

frmt qw( PeptideID ProteinID SpectrumID Sequence );

foreach my $k ( sort { $a <=> $b } keys %{ $seen } ) {
    foreach my $p ( @{ $seen->{$k}->{ProteinID} } ) {
        frmt $k, $p, $seen->{$k}->{SpectrumID}, $seen->{$k}->{Sequence};
    }
}
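
The idea behind the script is a join on PeptideID through a hash of hashes. Here is a minimal sketch of only that step. The records in @file1 and @file2 are made up for illustration and are not taken from your files, and I use a plain hash %seen instead of a hash reference, so treat it as a rough outline rather than a replacement for the script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample records, invented for illustration only.
my @file1 = ( [ 1, 'P001' ], [ 1, 'P002' ], [ 2, 'P003' ] );  # PeptideID, ProteinID
my @file2 = ( [ 1, 'S10', 'AAK' ], [ 2, 'S20', 'GGR' ] );     # PeptideID, SpectrumID, Sequence

my %seen;

# One peptide may belong to several proteins, so collect them in an array.
push @{ $seen{ $_->[0] }{ProteinID} }, $_->[1] for @file1;

# Each peptide has exactly one spectrum and one sequence.
for my $r (@file2) {
    $seen{ $r->[0] }{SpectrumID} = $r->[1];
    $seen{ $r->[0] }{Sequence}   = $r->[2];
}

# Build one joined row per (peptide, protein) pair, sorted by PeptideID.
my @rows;
for my $k ( sort { $a <=> $b } keys %seen ) {
    # "|| []" skips peptides that have no protein entry.
    for my $p ( @{ $seen{$k}{ProteinID} || [] } ) {
        push @rows, join "\t", $k, $p, $seen{$k}{SpectrumID}, $seen{$k}{Sequence};
    }
}

print "$_\n" for @rows;
```

Because the protein IDs are stored in an array per peptide, a peptide that appears in several proteins yields several output rows, each repeating the same SpectrumID and Sequence.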

Re^2: match two files
by yueli711 (Sexton) on Dec 09, 2020 at 16:35 UTC

    Hello siberia-man, Thank you so much for your great help! There are still some errors. Thank you again, really appreciated! Best, Yue

    perl match_quick02.pl tmp01_quick tmp02_quick
    syntax error at match_quick02.pl line 42, near "+}"
    syntax error at match_quick02.pl line 44, near "}"
    Execution of match_quick02.pl aborted due to compilation errors.

      I suspect you used copy and paste rather than the download link, because the "+}" in your error message does not appear anywhere in the actual code.

        Hello huck, Thank you so much for your great help! Really appreciated! Best, Yue