
Yes, probably it was not well-written. Imagine this script:
use strict;
use warnings;

my %res;
my $id='';
my $rest='';
my $seq='';
while (<DATA>) {
    chomp;
    if($_=~/^>(.*?)\|(.*)/) {
        $id=$1;
        $rest=$2;
        $seq=<DATA>;
        chomp $seq;
    }
}
_DATA_
>id1|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
>id2|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
>id3|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE

and I want to get to:
>id1|id2|id3|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE

And I wonder if I can use some of Perl's data structures, or the map command?

Re^3: How do I use the map command for this?
by hippo (Archbishop) on Jun 19, 2022 at 15:19 UTC

    As usual, TIMTOWTDI. However, keeping your initial code structure for better or worse:

    use strict;
    use warnings;
    use Test::More tests => 1;

    my $want = <<EOT;
    >id1|id2|id3|Q51487|P-474-4|86-98,113-126,297-310,322-335
    CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
    EOT

    my %res;
    while (<DATA>) {
        if (/^>(.*?)\|(.*)/s) {
            my $id   = $1;
            my $rest = $2;
            if (exists $res{$rest}) {
                $res{$rest}{keys} .= "|$id";
            }
            else {
                my $seq = <DATA>;
                $res{$rest} = {
                    keys => ">$id",
                    seq  => $seq,
                }
            }
        }
    }

    my ($k) = keys %res; # You would loop over all keys in your real prog
    my $have = "$res{$k}{keys}|$k$res{$k}{seq}";
    is $have, $want;

    __DATA__
    >id1|Q51487|P-474-4|86-98,113-126,297-310,322-335
    CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
    >id2|Q51487|P-474-4|86-98,113-126,297-310,322-335
    CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
    >id3|Q51487|P-474-4|86-98,113-126,297-310,322-335
    CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE

    🦛

Re^3: How do I use the map command for this?
by LanX (Saint) on Jun 19, 2022 at 14:18 UTC
    > _DATA_

    I can tell immediately that this doesn't run! 👎

    You should check before posting, that's the "Correct" part in SSCCE!

    And it's not clear to me which part has to be unique...

    This

    |Q51487|P-474-4|86-98,113-126,297-310,322-335

    or this

    CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE

    or both.

    This doesn't have much to do with map; a HoA (hash of arrays) with unique keys (well, which ones?) is enough.

    Or possibly a HoHoA (hash of hashes of arrays), depending on what you consider unique.

    And of course also depending on your desired output order.

    Please show more effort!

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
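
For illustration, the hash-of-arrays idea from the reply above might look like the sketch below. It assumes that a record counts as a duplicate only when both the header remainder and the sequence match, and that output should follow first-seen order; neither point was settled in the thread. The name merge_records and the @input list are hypothetical, and map is used only at the end to turn each unique key back into one merged record.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: takes chomped lines in
# header/sequence pairs and returns one merged record per unique key.
sub merge_records {
    my @lines = @_;
    my ( %ids_for, @order );
    while (@lines) {
        my $header = shift @lines;
        next unless $header =~ /^>([^|]*)\|(.*)/;    # skip non-header lines
        my ( $id, $rest ) = ( $1, $2 );
        my $seq = shift @lines;
        $seq = '' unless defined $seq;

        # Assumption: "unique" means same header remainder AND same sequence.
        my $key = "$rest\n$seq";
        push @order, $key unless exists $ids_for{$key};    # remember first-seen order
        push @{ $ids_for{$key} }, $id;                      # HoA: key => [ ids ]
    }

    # One merged record per key: ">id1|id2|...|rest", newline, sequence.
    return map { '>' . join( '|', @{ $ids_for{$_} }, $_ ) } @order;
}

my @input = (
    '>id1|Q51487|P-474-4|86-98,113-126,297-310,322-335',
    'CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE',
    '>id2|Q51487|P-474-4|86-98,113-126,297-310,322-335',
    'CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE',
    '>id3|Q51487|P-474-4|86-98,113-126,297-310,322-335',
    'CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE',
);
print "$_\n" for merge_records(@input);
```

If only the header remainder should decide uniqueness, the key could be reduced to $rest and the sequence stored separately, which is closer to the nested structure mentioned above.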

Re^3: How do I use the map command for this?
by Anonymous Monk on Jun 19, 2022 at 13:45 UTC
    I had this older script that did something similar, but with two values separated by a tab:

    use strict;
    use warnings;

    my %res;
    while (<>) {
        chomp;
        my ( $name, $rest ) = split /\t/;
        push @{ $res{$name} }, $rest;
    }
    for ( sort keys %res ) {
        print "$_:", join( ",", @{ $res{$_} } );
        print "\n";
    }

    but I do not know how to adapt it.
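
One possible adaptation of that tab-separated script is sketched here. It keeps the same push-into-a-hash idea, but reads two lines per record and splits the header on the first '|' only (the third argument to split limits it to two fields). The function name group_ids, the in-memory filehandle, and the choice to key on remainder plus sequence are assumptions made for the example, not something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical adaptation: returns a hash ref mapping
# "remainder\nsequence" => [ ids ], read from any filehandle.
sub group_ids {
    my ($fh) = @_;
    my %res;
    while ( my $header = <$fh> ) {
        chomp $header;
        next unless $header =~ s/^>//;    # only header lines start a record
        my ( $id, $rest ) = split /\|/, $header, 2;    # limit 2 keeps the remainder intact
        next unless defined $rest;        # header without any '|'
        my $seq = <$fh>;                  # the sequence is the following line
        $seq = '' unless defined $seq;
        chomp $seq;
        push @{ $res{"$rest\n$seq"} }, $id;
    }
    return \%res;
}

my $text = <<'END';
>id1|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
>id2|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
>id3|Q51487|P-474-4|86-98,113-126,297-310,322-335
CSLIPDYQRPEAPVAAAYPQGQAYGQNTGAAAVPAADIGWREFFRDPQLQQLIGVALE
END

# In-memory filehandle so the sketch runs without an input file.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
my $res = group_ids($fh);
for my $key ( sort keys %$res ) {
    print '>', join( '|', @{ $res->{$key} }, $key ), "\n";
}
```

As in the original script, sort keys gives a stable but alphabetical output order; a separate array of first-seen keys would be needed to preserve input order.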