bsherkhane has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, I am new to Perl. I have a tab-delimited text file where each line has the following:

1 A 1 B 2 G 2 H 2 V

I want the output as 1 A:B 2 G:H:V. How do I do it using a hash?


Replies are listed 'Best First'.
Re: print identical keys once along with their values
by kennethk (Abbot) on Nov 04, 2015 at 17:57 UTC
    What have you tried? What didn't work? Without seeing what you've tried, it's very hard to help you debug your approach. See How do I post a question effectively?

    Rather than just using a hash, you want to use a hash of arrays, because you want to associate each key with multiple values. In this case, you might even want to use an Array of Arrays, since your keys are numeric. See HASHES OF ARRAYS in perldsc, and maybe perllol, perldata, perlreftut, and/or perlref for additional info.
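A minimal sketch of the hash-of-arrays idea, assuming the record is a flat run of alternating keys and values as in the sample line. The helper name `group_pairs` is hypothetical, not something from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: group values under their key in a hash of arrays.
# Assumes the arguments alternate key, value, key, value, ...
sub group_pairs {
    my @fields = @_;
    my %values_for;
    # Take two fields at a time until the list is exhausted.
    while (my ($key, $value) = splice @fields, 0, 2) {
        push @{ $values_for{$key} }, $value;   # autovivifies the array ref
    }
    return \%values_for;
}

# split ' ' splits on any run of whitespace, tabs included.
my $grouped = group_pairs(split ' ', "1\tA\t1\tB\t2\tG\t2\tH\t2\tV");
for my $key (sort { $a <=> $b } keys %$grouped) {
    print "$key => @{ $grouped->{$key} }\n";
}
```

Each key ends up holding a reference to an array of every value seen for it, in input order.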


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      As this signature is one of the best pieces of advice in this field: is there a more complete collection than this newsgroup posting, apparently from dominus - perhaps even a canonical one?
        My Google skills are as extensive as yours. When I adopted the quote, I did a lot of digging to see if there were longer versions of the list, and found none.

        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: print identical keys once along with their values
by MidLifeXis (Monsignor) on Nov 04, 2015 at 18:03 UTC

    For your output, you may also want to look at join and keys. A HoH or HoA could work for collecting the data for your results.
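As a rough illustration of how join and keys could produce the requested output, here is a sketch that assumes the data has already been collected into a hash of arrays. The helper name `format_groups` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: render a hash of arrays as "key v1:v2 key v1:v2 ...".
sub format_groups {
    my ($values_for) = @_;
    return join ' ',
        map  { "$_ " . join(':', @{ $values_for->{$_} }) }
        sort { $a <=> $b } keys %$values_for;   # keys come back unordered
}

my %values_for = (1 => [qw(A B)], 2 => [qw(G H V)]);
print format_groups(\%values_for), "\n";   # prints: 1 A:B 2 G:H:V
```

The numeric sort is there because keys returns the hash keys in no particular order.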

    You may also want to put <code>...</code> around both your input and output data in order to preserve your intended formatting.

    --MidLifeXis

Re: print identical keys once along with their values
by GotToBTru (Prior) on Nov 04, 2015 at 18:04 UTC

    You might want to take kennethk's last bit of advice first. Work out the mechanism first by hand. That will inform your choice of control structure and data structure.

    Dum Spiro Spero
Re: print identical keys once along with their values
by AnomalousMonk (Archbishop) on Nov 04, 2015 at 18:40 UTC

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my $pair = qr{ (\d+) \s+ ([[:alpha:]]+) }xms;
     ;;
     my $record = '1 A 1 B 2 G 2 H 2 V';
     ;;
     my %hash;
     while ($record =~ m{ \G \s* $pair }xmsg) {
         push @{ $hash{$1} }, $2;
     }
     dd \%hash;
     ;;
     $hash{$_} = join ':', @{$hash{$_}} for keys %hash;
     dd \%hash;
    "
    { 1 => ["A", "B"], 2 => ["G", "H", "V"] }
    { 1 => "A:B", 2 => "G:H:V" }

Re: print identical keys once along with their values
by Laurent_R (Canon) on Nov 04, 2015 at 22:02 UTC
    Actually, if you only want to output the data the way you've shown, then you could do it without a nested data structure and use a simple hash (or a simple array), provided you concatenate the values into the hash as you go (re-using part of AnomalousMonk's code):
    $ perl -wMstrict -MData::Dumper -e 'my $pair = qr{ (\d+) \s+ ([[:alpha:]]+) }xms;
    > my $record = "1 A 1 B 2 G 2 H 2 V";
    > my %hash;
    > while ($record =~ m{ \G \s* $pair }xmsg) {
    >     $hash{$1} .= "$2;";
    > }
    > print Dumper \%hash;
    > '
    $VAR1 = {
              '1' => 'A;B;',
              '2' => 'G;H;V;'
            };
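The concatenation above leaves a trailing separator on every value. If that is unwanted, one possible variation is to add the separator only when the key has been seen before. This is a sketch using ':' as in the requested output, and `concat_pairs` is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: build "A:B"-style strings directly in a flat hash.
sub concat_pairs {
    my ($record) = @_;
    my %joined;
    while ($record =~ m{ \G \s* (\d+) \s+ ([[:alpha:]]+) }xmsg) {
        my ($key, $value) = ($1, $2);
        # Prefix the separator only for the second and later values.
        $joined{$key} = exists $joined{$key}
            ? "$joined{$key}:$value"
            : $value;
    }
    return \%joined;
}

my $joined = concat_pairs('1 A 1 B 2 G 2 H 2 V');
print join(' ', map { "$_ $joined->{$_}" } sort { $a <=> $b } keys %$joined), "\n";
# prints: 1 A:B 2 G:H:V
```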
Re: print identical keys once along with their values
by Anonymous Monk on Nov 05, 2015 at 16:00 UTC

    Why go to the trouble of using a hash?

    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1146923
    use strict;
    use warnings;

    $_ = join "\t", qw(1 A 1 B 2 G 2 H 2 V);
    1 while s/\b((\d+)\s\S+)(.*?)\s\2\s(\S+)/$1:$4$3/;
    print "$_\n";
      Thank you very much. Can you explain the regex in detail, for example what \2 is and what the special variables $1, $4 and $3 are for? Regards, Umesh

        Hello bsherkhane,

        The escape sequences \1, \2, \3, etc., are backreferences to captures in the current regex. The special variables $1, $2, $3, etc., likewise refer to the captures of the most recent successful match. $1 refers to the first capture, $2 to the second capture, and so on. Captures are numbered by counting left parentheses from the left. See perlre#Capture-groups.
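To make the numbering concrete, here is a hand-annotated sketch of the substitution from the script above, rewritten with the /x flag so each capture can carry a comment. The wrapper name `merge_repeated_keys` is hypothetical, and the comments are my own reading of the regex rather than tool output.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution, applied until nothing changes.
sub merge_repeated_keys {
    my ($text) = @_;
    1 while $text =~ s{
        \b ( (\d+) \s \S+ )   # capture 1: key plus value(s); capture 2: the key alone
        (.*?)                 # capture 3: shortest stretch before the key repeats
        \s \2 \s              # the text of capture 2 again, i.e. the same key
        (\S+)                 # capture 4: the value that follows the repeated key
    }{$1:$4$3}x;              # append capture 4 to capture 1, keep capture 3 after it
    return $text;
}

print merge_repeated_keys('1 A 1 B 2 G'), "\n";   # prints: 1 A:B 2 G
```

Inside the pattern, \2 must match whatever capture 2 matched earlier in the same attempt; in the replacement, $1, $3 and $4 insert the captured text.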

        The module YAPE::Regex::Explain is a useful tool for understanding regular expressions. Here is the explanation it gives for the left-hand side (i.e., the regex part) of the substitution in question:

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,