I had a difficult time understanding your code and its intent. Posting a link to more data did help, as I was able to look at your whole data set (and THANKS for not trying to post that directly here in the forum). It took me a while to figure out what dtf, df, etc. meant in the context of your code. Starting with a description of the problem would have been even more helpful.
First, I think that you have selected the wrong type of data structure. A HoHoHoH (hash of hashes of hashes of hashes) does occur, but seldom. It appears to me that you want to search on the first column and access the records that apply to that key value. If I have understood your intent correctly, you will be happier with a HoAoH (hash of arrays of hashes). The first column (I never did figure out what physical thing it represents) is a key to 1..n records that pertain to it. I would encourage you to use longer names unless things like "tf" are so easy to understand in your environment that no further explanation is required.
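To make the HoAoH idea concrete, here is a minimal sketch of how such a structure is built and read. The hash name %records and the sample numbers are only placeholders of mine, not values from your data set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample values, only for illustration.
my %records;    # key => reference to an array of hash references

# push autovivifies the array the first time a key is seen.
push @{ $records{82} }, { doclen => 1141, tf => 10 };
push @{ $records{82} }, { doclen => 1141, tf => 30 };
push @{ $records{3} },  { doclen => 243,  tf => 7  };

# An array in scalar context gives the number of records for a key.
my $count = scalar @{ $records{82} };
print "key 82 has $count records\n";

# Reach one field directly: the second record stored under key 82.
print "tf of second record: $records{82}[1]{tf}\n";

# Walk every record that belongs to one key.
foreach my $record ( @{ $records{82} } ) {
    print "doclen=$record->{doclen} tf=$record->{tf}\n";
}
```

Each key owns an ordered list of records, so a repeated first-column value simply adds another record instead of overwriting the previous one.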
You will definitely find that using warnings and strict will help your code immensely.
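As a small illustration of what warnings buys you, the sketch below splits a line that has only two fields and then compares the missing third field to an empty string. The variable names and the sample line are my own assumptions. The $SIG{__WARN__} handler is there only so the warning can be collected and shown as normal output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect warnings instead of letting them go straight to STDERR.
my @warnings_seen;
local $SIG{__WARN__} = sub { push @warnings_seen, $_[0] };

# Hypothetical input line with two fields, so $third ends up undef.
my ($first, $second, $third) = split ' ', "904777 82810";

# Comparing undef with eq still "works", but warnings flags it.
my $is_empty = ($third eq '');

# Testing with defined says what is meant and stays silent.
my $is_missing = !defined $third;

print "warnings collected: ", scalar(@warnings_seen), "\n";
print $warnings_seen[0] if @warnings_seen;
```

Without warnings enabled, the eq comparison would pass silently and hide the fact that the field was never there.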
Try to break the code down into functional "blocks": open files, create the data structure, query the data structure. You will also find that 'C' style for(;;) loops are seldom needed, as the Perl foreach iterator can usually replace them in a far superior way. It will also run faster than using $array[$i] while avoiding potential "off-by-one" problems.
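A quick side-by-side of the two loop styles, as a sketch with made-up numbers (the @doclens array is a placeholder of mine). Both loops compute the same sum; the foreach version just has no index to get wrong.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @doclens = (1141, 243, 87);    # hypothetical sample values

# C-style loop: the index and the end condition are maintained by hand,
# and writing <= instead of < would run one element past the end.
my $sum_c_style = 0;
for (my $i = 0; $i < @doclens; $i++) {
    $sum_c_style += $doclens[$i];
}

# foreach: Perl hands over each element in turn, no index required.
my $sum_foreach = 0;
foreach my $doclen (@doclens) {
    $sum_foreach += $doclen;
}

# If the index really is needed, iterate over the index range instead.
my $last_index_seen;
foreach my $i (0 .. $#doclens) {
    $last_index_seen = $i;
}

print "C-style: $sum_c_style, foreach: $sum_foreach\n";
```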
Anyway, here is my take on what I think you needed. Have fun and modify as you wish.
#!/usr/bin/perl -w
use strict;
use Data::Dumper;

open (FILE, '<', "query63.txt") || die "cannot open query63.txt: $!";
my ($ctf, $df);    #pertain to many lines of input, reset occasionally
my %query;
while (<FILE>)
{
    next if /^\s*$/;    #skip blank lines
    my ($first_col, $doclen, $tf) = split;
    if (!defined $tf)   #initialization of a "new section of numbers"
    {                   #when the line has 2 instead of 3 fields
        $ctf = $first_col;
        $df  = $doclen;
        next;
    }
    push @{$query{$first_col}}, {'doclen' => $doclen,
                                 'tf'     => $tf,
                                 'df'     => $df,
                                 'ctf'    => $ctf,
                                };
}
close FILE;

sub print_query
{
    my $first_col = shift;
    print "*First_col = $first_col\n";
    if ( !exists($query{$first_col}) )
    {
        print " $first_col does not exist, query failed\n";
        return;
    }
    #the value of $query{$first_col} is a reference to an
    #anonymous array of hash references.
    my $total_records = @{$query{$first_col}};
    my $cur_record = 1;
    foreach my $href (@{$query{$first_col}})
    {
        print " Record $cur_record of $total_records\n";
        #each returns the keys in no particular order
        while ( my($key,$value) = each %$href )
        {
            printf " %-10s => %s\n", $key, $value;
        }
        $cur_record++;
        print "\n";
    }
}

print_query (82);
print_query (3);
print_query (100);
#print Dumper \%query; #uncomment to see what this does!
__END__
*First_col = 82
 Record 1 of 2
 ctf        => 104353
 df         => 42122
 doclen     => 1141
 tf         => 10

 Record 2 of 2
 ctf        => 904777
 df         => 82810
 doclen     => 1141
 tf         => 30

*First_col = 3
 Record 1 of 1
 ctf        => 904777
 df         => 82810
 doclen     => 243
 tf         => 7

*First_col = 100
 100 does not exist, query failed