dr_jgbn has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perlmonks,

I am having a problem figuring out how to use the sort function within the context of one of my scripts.
Here is the code to start off with:

use strict;
use vars qw($outfile);

open(DATA, $ARGV[0]) or die "Couldn't open $ARGV[0]: $!";
$outfile = "test.txt";
open (OUT, ">$outfile") || die "Can't open $outfile for creation: $!\n";

my (%hash,@order);
while(<DATA>) {
    chomp;
    my @bits = split /\t/;
    my $first = "$bits[0]";
    push @order, $first unless exists $hash{$first};
    $hash{$first} .= ' ' . "$bits[1] $bits[2]\t";
}
print OUT "$_\t$hash{$_}" for @order;

So for example, the script takes a file that looks like this:
text1 text-a text-w
text2 text-b text-y
text3 text-x text-t
text1 text-d text-n
text3 text-f text-z
text3 text-e text-w
Output:
text1 text-a text-w text-d text-n
text2 text-b text-y
text3 text-x text-t text-f text-z text-e text-w

The problem I am having is how to sort the concatenated data within each row.
That is, I would like the output to look like this:

text1 text-a text-d text-n text-w
text2 text-b text-y
text3 text-e text-f text-t text-w text-x text-z

I should mention that $bits[1] is always a word and $bits[2] is always a word followed by a space then a number (e.g. dehydration 1)

Any help would be greatly appreciated,
Thank-you in advance,
Dr.J

Re: Problems sorting
by Limbic~Region (Chancellor) on Jun 01, 2003 at 15:12 UTC
    dr_jgbn,
    See if this untested code isn't what you are looking for:
    #!/usr/bin/perl -w
    use strict;

    open(INPUT,"input.txt") or die "Unable to open input file : $!";
    open(OUTPUT,">output.txt") or die "Unable to open output file : $!";
    select OUTPUT;

    my %data;
    my @order;
    while (<INPUT>) {
        chomp;
        my @field = split /\t/;
        push @order , $field[0] unless (exists $data{$field[0]});
        push @{$data{$field[0]}} , @field[1 .. $#field];
    }
    foreach my $key (@order) {
        print join "\t" , $key , sort @{$data{$key}};
        print "\n";
    }
    You may want to look at this for ideas how to accomplish this without reading everything into memory.

    Cheers - L~R

    Update: Now tested and verified to be correct

Re: Problems sorting
by edoc (Chaplain) on Jun 01, 2003 at 15:33 UTC

    bah! L~R beat me to it again..

    in the spirit of TIMTOWTDI...

    ..note I've swapped '\t' for ',' to make it easily copy/pasted..

    #!/usr/bin/perl
    my %hash;
    while(<DATA>){
        chomp;
        my @line = split /,/;
        my $first = shift @line;
        push( @{$hash{$first}}, @line );
    }
    foreach( sort keys %hash ){
        print "$_\t".join(' ', sort @{$hash{$_}})."\n";
    }
    __DATA__
    text1,text-a,text-w
    text2,text-b,text-y
    text3,text-x,text-t
    text1,text-d,text-n
    text3,text-f,text-z
    text3,text-e,text-w
    text10,word,other 1
    text10,aword,nother 1
    text11,cword,tother 1
    text11,zword,bother 1
    __END__

    prints:

    text1 text-a text-d text-n text-w
    text10 aword nother 1 other 1 word
    text11 bother 1 cword tother 1 zword
    text2 text-b text-y
    text3 text-e text-f text-t text-w text-x text-z

    Update: hmm.. this version doesn't produce the lines in the order it found them (it sorts by the first word in the line), but that's easily changed if required, as per the other examples..

    cheers,

    J

      edoc,
      I don't think (sort keys %hash) is what is being asked for. It appears the original order is to be preserved, but the elements on the line sorted. I missed this my first, second, and third time reading it. I would argue that yours is easier to read though :-)

      Cheers - L~R
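For what it's worth, the two points above can be combined: keep edoc's compact hash-of-arrays style, but remember the keys in first-seen order instead of calling sort keys %hash. The following is only a sketch under assumptions. The sub name group_rows and the sample lines are made up for illustration, and it keeps edoc's comma separator rather than the tabs in the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# group_rows is a hypothetical helper, not code from the thread.
# It takes comma-separated lines and returns one output line per key,
# keys in the order they were first seen, values sorted as strings.
sub group_rows {
    my @lines = @_;
    my (%hash, @order);
    for my $line (@lines) {
        chomp $line;
        my ($first, @rest) = split /,/, $line;
        next unless defined $first;                       # skip empty lines
        push @order, $first unless exists $hash{$first};  # remember first-seen order
        push @{ $hash{$first} }, @rest;
    }
    return map { join(' ', $_, sort @{ $hash{$_} }) } @order;
}

# Sample input (made up): text3 appears before text1 on purpose.
my @sample = (
    "text3,text-x,text-t",
    "text1,text-a,text-w",
    "text3,text-f,text-z",
    "text1,text-d,text-n",
);
print "$_\n" for group_rows(@sample);
```

With this sample, the text3 row is printed before the text1 row, because that is the order in which the keys first appear, while the values inside each row come out sorted.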

Re: Problems sorting
by Skeeve (Parson) on Jun 02, 2003 at 07:14 UTC
    You said: I should mention that $bits[1] is always a word and $bits[2] is always a word followed by a space then a number (e.g. dehydration 1)
    How should it be sorted? Is the simple "sort" of L~R enough? Is it of any importance to the sort problem that there is a number following the second word?
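The thread does not say how the trailing number should be treated, so the following is only one possible reading: compare the word part as a string, and break ties by comparing the number numerically. The sub name sort_word_number and the sample values are hypothetical, and items without a number are assumed to sort before numbered items with the same word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# sort_word_number is a hypothetical helper, not code from the thread.
# Assumption: each item is either "word" or "word number" (e.g. "dehydration 1").
# Items are ordered by word (string compare), then by number (numeric compare).
sub sort_word_number {
    my @items = @_;
    return map  { $_->[0] }                                   # 3. unwrap the original string
           sort { $a->[1] cmp $b->[1] or $a->[2] <=> $b->[2] } # 2. compare word, then number
           map  {                                             # 1. split each item once
               my ($word, $num) = /^(.*?)(?:\s+(\d+))?$/;
               [ $_, $word, defined $num ? $num : -1 ];       # -1: assumed rank for "no number"
           } @items;
}

# Sample values (made up). A plain sort would place "fever 10" before "fever 2".
my @sorted = sort_word_number("fever 10", "dehydration 1", "fever 2", "cough");
print join(", ", @sorted), "\n";
```

If the number carries no meaning for the ordering, the plain sort in L~R's reply is simpler and gives the same result whenever the words themselves are distinct.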