Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have 2 tab-delimited flat text files. One contains all the major info that I want to search from, the other the info that I want to sort by. They have a common ID column (though the sets of IDs are not identical), so it would seem to be quite straightforward to do this.

In the following code I have put the main db into an array, and the sorting db into a hash. The idea is that on iterating through the array I check to see if the ID exists in the hash and, if so, put its value into the sort field. If it doesn't exist, the sort field is undefined.

The problem is: it doesn't work!! If anyone could help me achieve this that'd be really great.

Thanks.
my %sort_file;

open (FILE, "$data") or die $!;
@all=<FILE>;
close (FILE);

my $points_total= "/path/points_total.txt";
open FH, "$points_total" or die "Can't open $points_total: $!";
flock (FH, 1) or die "Can't lock $points_total for reading: $!";
my @fh=<FH>;
close FH;

#create hash of sort file
foreach (@fh) {
    chomp;
    ($checkIDNumber, $total_points) = split "\t",$_;
    $sort_file{$checkIDNumber} = $total_points;
}

foreach $line (@all){
    $line=~s/\n//g;
    ($undef,$undef,$IDNumber,$Email_address,$Page_Name)=split (/\t/,$line);
    if (exists ($sort_file{$IDNumber} )) {
        $sortfield = $total_points;
    }
    else {
        $sortfield = "";
    }
}

Replies are listed 'Best First'.
Re: sorting an array from two sources
by pzbagel (Chaplain) on Oct 21, 2003 at 22:18 UTC

    We're going to need more information. What are you doing with $sortfield? All the code does is keep changing it based on the existence of $IDNumber in %sort_file. Did you snip too much out of your loop? Did you mean to make $sortfield a hash or array of some sort? Your logic is sound up until that last if statement, which just keeps overwriting the value of $sortfield.

    Later

    P.S. Even if you are missing too much out of your loop, I think I spotted the logic error in your if statement. You assign $total_points when what you really want to assign is $sort_file{$IDNumber} since it is storing the value of $total_points for that hash key. Change it to:

    if (exists ($sort_file{$IDNumber} )) { $sortfield = $sort_file{$IDNumber}; } else { $sortfield = ""; }
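    To see why the original assignment misbehaves, here is a minimal sketch with made-up data (the IDs and point values are hypothetical, not taken from the real files). Once the loop that builds %sort_file has finished, $total_points still holds whatever the last line of the points file put there, so every matching ID would receive that same value:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lines read from the points file.
my @fh = ("10\t120.00", "11\t200.00");

my %sort_file;
my ($checkIDNumber, $total_points);
foreach (@fh) {
    ($checkIDNumber, $total_points) = split /\t/, $_;
    $sort_file{$checkIDNumber} = $total_points;
}

# $total_points now holds the value from the LAST line only.
my $IDNumber = 10;
my $stale    = $total_points;           # what the original code assigned
my $correct  = $sort_file{$IDNumber};   # the per-ID lookup

print "stale: $stale, correct: $correct\n";
```

    With this sample data the stale value is the total for ID 11, while the hash lookup returns the total that actually belongs to ID 10.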
      Thanks very much for your input, pzbagel.

      I'm afraid I was careless in putting together my sample code. $sortfield is the first field of the db as follows:

      ($sortfield,$undef,$IDNumber,$Email_address,$Page_Name)=split (/\t/,$line);
      What I want to achieve is that each time a line of the main db is read, the second db is searched for the IDNumber. If it exists, the $total_points will be stored in the $sortfield for that line; otherwise it is undefined. I can then sort the main db according to the values of $sortfield.
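      A minimal sketch of that approach, assuming the ID is the third tab-separated field and using inline sample data in place of the two files (every value here is made up for illustration). Each line is split into fields, the first field is overwritten with the looked-up total (or left undefined), and the rows are only sorted after the whole array has been processed:

```perl
use strict;
use warnings;

# Made-up sample data standing in for the two files.
my @all = ("x\ty\t12\ta\@example.com\tPageA",
           "x\ty\t11\tb\@example.com\tPageB",
           "x\ty\t10\tc\@example.com\tPageC");
my %sort_file = (10 => '120.00', 11 => '200.00');

# Split each line and store the total in the first (sort) field.
my @rows;
foreach my $line (@all) {
    chomp $line;
    my @fields   = split /\t/, $line;
    my $IDNumber = $fields[2];
    $fields[0] = exists $sort_file{$IDNumber} ? $sort_file{$IDNumber} : undef;
    push @rows, \@fields;   # keep the modified row, not just a scalar
}

# Sort descending on the sort field; rows without a total count as 0.
my @sorted = sort {
    (defined $b->[0] ? $b->[0] : 0) <=> (defined $a->[0] ? $a->[0] : 0)
} @rows;

for my $row (@sorted) {
    print join("\t", map { defined($_) ? $_ : '' } @$row), "\n";
}
```

      The point of the sketch is that the updated sort field has to be written back into something that survives the loop (here, an array of array references); a plain scalar is overwritten on every pass and never reaches the sort.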

      I tried your update above, and thought it made sense. However it didn't work so I think my logic must be awry somewhere along the line.

      Thanks once again

Re: sorting an array from two sources
by Roger (Parson) on Oct 21, 2003 at 23:57 UTC
    In the program below, I first read the main file into a hash of arrays with ID as the lookup key, and then I merge the total points in the sort file into the hash based on the ID's. A simple sort to get the sorted array of ID's, and then finally print the main data in sorted order.
    use strict;
    use IO::File;

    # load main file
    my %main_data;
    my $f = new IO::File "mainfile.txt", "r" or die "Can't open main db: $!";
    while(<$f>) {
        chomp;
        my @res = split /\t/;
        $main_data{$res[2]}{data} = \@res;  # 3rd field is the ID number
    }
    undef $f;

    # load sort reference file
    my $f = new IO::File "sort.txt", "r" or die "Can't open sort.txt: $!";
    while (<$f>) {
        chomp;
        my ($id, $total) = split /\t/;
        if (exists $main_data{$id}) {
            $main_data{$id}{total} = $total;  # insert additional values
        }
    }

    # sort the main db in descending order
    my @sorted_id = sort { $main_data{$b}{total} <=> $main_data{$a}{total} } keys %main_data;

    # print sorted main db
    print "@{$main_data{$_}{data}}\n" foreach @sorted_id;
    I have created the following simplified test data, note that as long as the 3rd field is the lookup id, it doesn't matter how many fields are in the main db.

    mainfile.txt
    ------------
    A B 12
    C D 11
    E F 10

    sort.txt
    --------
    10 120.00
    11 200.00
    Note that I deliberately left out ID:12 to demonstrate the sort of undef values. The output result is:
    C D 11
    E F 10
    A B 12
    Note that ID:12 appears at the end because its total is undef, which the numeric <=> comparison treats as 0, lower than the defined totals here.
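    If relying on the way undef compares is not desirable (under "use warnings" it produces "Use of uninitialized value" warnings, and a negative total would end up after the records with no total), the comparator can treat missing totals explicitly. A sketch, using made-up values in the same shape as %main_data; the sub name by_total_desc is my own invention:

```perl
use strict;
use warnings;

# Made-up data in the same shape as %main_data; ID 12 has no total.
my %main_data = (
    10 => { data => [qw(E F 10)], total => '120.00' },
    11 => { data => [qw(C D 11)], total => '200.00' },
    12 => { data => [qw(A B 12)] },
);

# Descending by total; records without a total always go last.
sub by_total_desc {
    my ($ta, $tb) = ($main_data{$a}{total}, $main_data{$b}{total});
    return  1 if !defined $ta &&  defined $tb;
    return -1 if  defined $ta && !defined $tb;
    return  0 if !defined $ta && !defined $tb;
    return $tb <=> $ta;
}

my @sorted_id = sort by_total_desc keys %main_data;
print "@{ $main_data{$_}{data} }\n" foreach @sorted_id;
```

    This keeps the same ordering for the test data but no longer depends on undef being numerically coerced.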

      Thanks for your input, Roger. I tested your solution and it works perfectly. I would still like to know where I'm falling down with my approach, though!
        Can you please post the full script? It seems you have only posted a fraction of the code, and it doesn't even include the sort.