G'day yueli711,
This uses the same principle as ++GrandFather described.
See Notes below for differences and other features.
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;

my ($in1, $in2, $out) = qw{tmp01 tmp02 tmp11_quick};
my (%data, @headings);

# Pass 1: map each PeptideID to the list of its ProteinIDs.
{
    open my $fh, '<', $in1;
    while (<$fh>) {
        if ($. == 1) {
            # First line holds the column headings.
            push @headings, split;
        }
        else {
            my ($pep, $prot) = split;
            push @{$data{$pep}}, $prot;
        }
    }
}

# Pass 2: write one output row per (PeptideID, ProteinID) pair.
{
    my $fmt = "%-9s %-9s %-10s %-8s\n";
    open my $fh_in, '<', $in2;
    open my $fh_out, '>', $out;
    while (<$fh_in>) {
        my ($id, @rest) = split;
        if ($. == 1) {
            # Headings from the first file, then the remaining headings from this one.
            printf $fh_out $fmt, @headings, @rest;
        }
        else {
            for (@{$data{$id}}) {
                printf $fh_out $fmt, $id, $_, @rest;
            }
        }
    }
}
Output:
PeptideID ProteinID SpectrumID Sequence
6         109521    53663      KMGEGR
7         741       53663      KPPSGK
11        681       144492     NNDALR
11        780       144492     NNDALR
20        2352      15547      SPAKPK
27        1490      55547      LHKPPK
27        1491      55547      LHKPPK
27        1492      55547      LHKPPK
28        51996     55547      LFVGRK
29        1490      55504      LHKPPK
29        1491      55504      LHKPPK
29        1492      55504      LHKPPK
30        1490      55602      LHKPPK
30        1491      55602      LHKPPK
30        1492      55602      LHKPPK
Notes:
- This code deals with real files, not in-memory files.
- All I/O is performed in anonymous blocks.
  Files are only open while needed.
  Filehandles close automatically at the end of these blocks.
- autodie removes the need to hand-craft your own I/O exception messages.
  It also avoids mistakes like the one in your later post, where the file with the I/O problem is tmp01_quick
  but the message refers to a different file, donor_82_01.csv. (You have three errors like that.)
- When I copied your data to files on my system, the tabs became spaces.
  I only needed split without arguments; you should continue to use split /\t+/.
  You should also take note of the chomp used in GrandFather's code.
- I've used printf to improve output formatting; however, that may not be what you want.
  You should also take a look at Text::CSV.
  (Note: it runs faster if you also have Text::CSV_XS installed.)
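To illustrate two of the notes above, here is a minimal sketch. It is not taken from your code or GrandFather's: the helper fields_from_line and the path in $missing are hypothetical names made up for the example, and the path is assumed not to exist. The first part shows why chomp matters once you split on /\t+/: unlike split without arguments, that pattern leaves the trailing newline attached to the last field. The second part shows the kind of message autodie generates by itself when an open fails.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;

# Hypothetical helper (not from the original posts): split one
# tab-separated record into its fields.
sub fields_from_line {
    my ($line) = @_;
    chomp $line;                  # drop the trailing newline first
    return split /\t+/, $line;    # fields are separated by one or more tabs
}

# A sample record with real tabs, modelled on the data above.
my ($id, @rest) = fields_from_line("11\t144492\tNNDALR\n");
print join('|', $id, @rest), "\n";    # prints 11|144492|NNDALR

# Hypothetical path, assumed not to exist, used only to provoke a failure.
my $missing = 'no_such_dir_for_demo/tmp01';

# No "or die ..." is needed: autodie throws an exception whose message
# names the file that open was actually given.
my $opened = eval { open my $fh, '<', $missing; 1 };
my $error  = $opened ? '' : "$@";
print "open failed: $error\n" if $error;
```

Because autodie builds the message from the arguments passed to open, the file name in the message cannot drift away from the file that failed, which is the mistake described in the autodie note.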