rynntintin has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm new to Perl and trying to write a script to search through an array, pulling out and concatenating lines that share a motif.

My input looks like:

00001 Description1
00002 Description2
00003 Description1

My output should look like:

00001 00003 Description1
00002 Description2

So, if the same Description is recognised on multiple lines, the numbers preceding this Description will be clustered together on one line (separated by a tab). I think this should be a simple task, but since I'm starting out, I'm a bit lost! Any help would be great, thanks!

Replies are listed 'Best First'.
Re: Trying to write simple program
by kennethk (Abbot) on May 17, 2010 at 13:32 UTC
    What have you tried? What hasn't worked? See How (Not) To Ask A Question. See Writeup Formatting Tips for info on how to format expected input and output using <code> tags to remove possible ambiguity.

    The clustering you describe can be accomplished using the description as a key in a hash of arrays - see perllol. Something like this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my %hash;
    while (my $line = <DATA>) {
        chomp $line;
        my ($number, $description) = split /\s/, $line, 2;
        push @{$hash{$description}}, $number;
    }
    print Dumper \%hash;

    __DATA__
    00001 Description1
    00002 Description2
    00003 Description1

    See perlreftut if you are not familiar with references. Let us know if you have trouble changing this code into something that outputs your desired format.

Re: Trying to write simple program
by toolic (Bishop) on May 17, 2010 at 13:33 UTC
    Loop through your input array, splitting each line on whitespace into a number and a description. Store them in a hash-of-arrays (HoA) data structure, with descriptions as keys and arrays of numbers as values. Then loop through the HoA, printing as desired:
    use strict;
    use warnings;

    my @ins = (
        '00001 Description1',
        '00002 Description2',
        '00003 Description1'
    );
    my %data;
    for (@ins) {
        my ($num, $desc) = split;
        push @{ $data{$desc} }, $num;
    }
    for (sort keys %data) {
        print join("\t", @{ $data{$_} }), "\t$_\n";
    }

    __END__
    00001 00003 Description1
    00002 Description2

    See also perldsc
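
    One detail worth noting: sorting the keys happens to give the order shown in the question, but a Perl hash does not remember insertion order. If the output should instead follow the order in which each description first appears (an assumption, since the question does not say), a rough sketch is to keep a separate array of descriptions alongside the hash. The helper name cluster_lines and the variable names below are made up for illustration:

```perl
use strict;
use warnings;

# Group numbers by description, remembering the order in which each
# description was first seen. cluster_lines is a hypothetical helper name.
sub cluster_lines {
    my @lines = @_;
    my (%numbers_for, @order);
    for my $line (@lines) {
        # split ' ' skips leading whitespace; the limit of 2 keeps any
        # spaces inside the description intact
        my ($num, $desc) = split ' ', $line, 2;
        next unless defined $desc;    # skip blank or malformed lines
        push @order, $desc unless exists $numbers_for{$desc};
        push @{ $numbers_for{$desc} }, $num;
    }
    # One tab-separated string per description: numbers first, then the description
    return map { join("\t", @{ $numbers_for{$_} }, $_) } @order;
}

print "$_\n" for cluster_lines(
    '00002 Description2',
    '00001 Description1',
    '00003 Description1',
);
```

    With this sample input the Description2 line is printed first, because that description is seen first; with sort keys it would come second.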

Re: Trying to write simple program
by sierpinski (Chaplain) on May 17, 2010 at 13:33 UTC
    Please post what code you have so far, and what your input actually looks like (for example, are the numbers part of the description, or are they separate?)

    You'll get a much more helpful response if you can provide as much detail as possible.