Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all
I've been working on revamping a print quota script I put together a while ago. I'm reading an accounting file, and am looking to see what print jobs need charging. My accounting file looks like this:
START 'A=root@styx+999' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=999' 'D=1083876690' 'PG=2' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=debug' 'PD=laserjet 4 plus'
filestart '-q21004' '-p21838' '-t2004-05-06-14:51:32.488' '-Aroot@styx+999' '-nroot' '-PBlack_Hole'
fileend '-b2' '-T99' '-q21004' '-p21840' '-t2004-05-06-14:53:09.467' '-Aroot@styx+999' '-nroot' '-PBlack_Hole'
END 't=99' 'p=2' 's=21838' 'q=21840' 'D=1083876789' 'A=root@styx+999' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=999' 'S=1083876690' 'PG=2' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=debug' 'PD=laserjet 4 plus'
START 'A=root@styx+10' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=010' 'D=1083876799' 'PG=1' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=acct' 'PD=laserjet 4 plus'
filestart '-q21015' '-p21840' '-t2004-05-06-14:53:21.490' '-Aroot@styx+10' '-nroot' '-PBlack_Hole'
fileend '-b1' '-T22' '-q21015' '-p21841' '-t2004-05-06-14:53:41.467' '-Aroot@styx+10' '-nroot' '-PBlack_Hole'
END 't=22' 'p=1' 's=21840' 'q=21841' 'D=1083876821' 'A=root@styx+10' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=010' 'S=1083876799' 'PG=1' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=acct' 'PD=laserjet 4 plus'
START 'A=root@styx+20' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=020' 'D=1' 'PG=8' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=accounting.pl' 'PD=laserjet 4 plus'
START 'A=root@styx+150' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=150' 'D=1083881617' 'PG=1' 'CP=2' 'PGST=Duplex' 'PGSZ=Letter' 'DT=Microsoft Word - New Microsoft Word Document.doc' 'PD=laserjet 4 plus'
filestart '-q21155' '-p21849' '-t2004-05-06-16:13:38.562' '-Aroot@styx+150' '-nroot' '-PBlack_Hole'
START 'A=root@styx+233' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=233' 'D=1083885286' 'PG=1' 'CP=2' 'PGST=Duplex' 'PGSZ=Letter' 'DT=Microsoft Word - New Microsoft Word Document.doc' 'PD=laserjet 4 plus'
filestart '-q21238' '-p21851' '-t2004-05-06-17:14:47.616' '-Aroot@styx+233' '-nroot' '-PBlack_Hole'
fileend '-b2' '-T66' '-q21238' '-p21853' '-t2004-05-06-17:15:52.594' '-Aroot@styx+233' '-nroot' '-PBlack_Hole'
END 't=3669' 'p=2' 's=21849' 'q=21851' 'D=1083885286' 'A=root@styx+150' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=150' 'S=1083881617' 'PG=1' 'CP=2' 'PGST=Duplex' 'PGSZ=Letter' 'DT=Microsoft Word - New Microsoft Word Document.doc' 'PD=laserjet 4 plus'
END 't=66' 'p=2' 's=21851' 'q=21853' 'D=1083885352' 'A=root@styx+233' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=233' 'S=1083885286' 'PG=1' 'CP=2' 'PGST=Duplex' 'PGSZ=Letter' 'DT=Microsoft Word - New Microsoft Word Document.doc' 'PD=laserjet 4 plus'
START 'A=root@styx+174' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=174' 'D=2' 'PG=72' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=acct' 'PD=laserjet 4 plus'
filestart '-q30179' '-p21860' '-t2004-05-11-15:04:25.063' '-Aroot@styx+174' '-nroot' '-PBlack_Hole'
START 'A=root@styx+184' 'P=Black_Hole' 'n=root' 'H=styx.engr.colostate.edu' 'j=184' 'D=3' 'PG=1' 'CP=1' 'PGST=Simplex' 'PGSZ=Letter' 'DT=acct' 'PD=laserjet 4 plus'
filestart '-q30189' '-p21860' '-t2004-05-11-15:04:50.058' '-Aroot@styx+184' '-nroot' '-PBlack_Hole'
fileend '-b1' '-T42' '-q30189' '-p21861' '-t2004-05-11-15:05:30.041' '-Aroot@styx+184' '-nroot' '-PBlack_Hole'

Each successful print job gets four lines:

START blah blah blah
filestart blah blah blah
fileend blah blah blah
END blah blah blah

The START line will always be there; the others may or may not be.
I'm trying to build a hash of hashes that allows me to only charge for jobs that got printed, and not charge twice (or thrice) for jobs that tried 2-3 times and died. Here's my chunk of code I've been playing around with:
# Open the acct file, and build a hash of hashes to operate on
# the outer hash will be built on the job number to reconcile
# multiple tries on a single job. The inner hashes will contain
# job information.
open ACCT, "+<$ARGV[0]";
while (<ACCT>){
    chomp;
    if (/^START.*'j=0?0?(\d+)'.+'D=(\d+)'/){
        $jobs{$1}{start_string} = $_;
        $jobs{$1}{start_date} = $2;
    }
    if(/^[a-z]*start.*'-p(\d+)'.+'-A.+\+(\d+)'/){
        $jobs{$2}{start_counter} = $1 if $jobs{$2};
    }
    if(/^[a-z]*end.*'-p(\d+)'.+'-A.+\+(\d+)'/){
        $jobs{$2}{end_counter} = $1 if $jobs{$2};
    }
    if(/^END.*'j=0?0?(\d+)'/){
        delete $jobs{$1} if $jobs{$1};
    }
}
close ACCT;
#print Dumper(%jobs);

# Operate on the hash. reconcile jobs, and put appropriate entries
# into the acct file to forestall further charging.
#foreach $job (sort { $jobs{$job}{start_date}<=>$jobs{$job}{start_date} } keys %jobs){
foreach $job (sort {$jobs{$job}{start_date}<=>$jobs{$job}{start_date}} keys %jobs){
    print "Job: $job\n";
    print "  Start Date: $jobs{$job}{start_date}\n\n";
}

My output looks like this:
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in numeric comparison (<=>) at ./newaccounting.pl line 36.
Use of uninitialized value in numeric comparison (<=>) at ./newaccounting.pl line 36.
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in numeric comparison (<=>) at ./newaccounting.pl line 36.
Use of uninitialized value in numeric comparison (<=>) at ./newaccounting.pl line 36.
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in hash element at ./newaccounting.pl line 36.
Use of uninitialized value in numeric comparison (<=>) at ./newaccounting.pl line 36.
Use of uninitialized value in numeric comparison (<=>) at ./newaccounting.pl line 36.
Job: 174
  Start Date: 2

Job: 184
  Start Date: 3

Job: 20
  Start Date: 1

I've been unable to get the sort to work. To make this whole thing successful, I need to sort the jobs by date, which is $jobs{$job}{start_date}.

Clear as mud?

Any help would be appreciated. I'm hitting a brick wall.

Thanks Much!
Louie

Re: Sorting a nested hash on internal values
by dave_the_m (Monsignor) on May 11, 2004 at 22:04 UTC
    foreach $job (sort { $jobs{$job}{start_date} <=> $jobs{$job}{start_date}} keys %jobs) {
    I think you want
    foreach $job (sort { $jobs{$a}{start_date} <=> $jobs{$b}{start_date}} keys %jobs){
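
To make the difference concrete, here is a minimal sketch of sorting the keys of a nested hash by an inner value. The helper name jobs_by_start_date is hypothetical, and the sample %jobs is an assumption built from the three jobs left over in the posted output. Inside the sort block, Perl sets $a and $b to the two keys being compared; $job is not set until the loop body runs, which would explain the "uninitialized value" warnings.

```perl
use strict;
use warnings;

# Assumed sample data shaped like the %jobs hash in the question; the
# start_date values are those of the three jobs in the posted output.
my %jobs = (
    174 => { start_date => 2 },
    184 => { start_date => 3 },
    20  => { start_date => 1 },
);

# Hypothetical helper: returns the job numbers ordered by start_date.
# sort places the two keys being compared in $a and $b, so the inner
# values have to be looked up through them.
sub jobs_by_start_date {
    my ($jobs) = @_;
    return sort { $jobs->{$a}{start_date} <=> $jobs->{$b}{start_date} }
           keys %$jobs;
}

my @ordered = jobs_by_start_date(\%jobs);

# Prints jobs 20, 174, 184 in that order.
for my $job (@ordered) {
    print "Job: $job\n";
    print "  Start Date: $jobs{$job}{start_date}\n\n";
}
```

Passing a hash reference keeps the helper independent of any global %jobs; calling it in list context, as here, is what sort expects.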
    Also, I'm not sure the logic is correct in the /END/ section; do you really want to delete any job that has an END section? Surely instead you want, at the end, to delete any jobs that didn't have an END section?
      Yes, $jobs{$a}{start_date}<=>$jobs{$b}{start_date} is what I wanted. I woke up this morning, and that was in my head. I feel a little silly.

      As for the END logic: any job that doesn't have an /END/ has not been dealt with (no quota has been subtracted), so I want to keep non-/END/ jobs to be dealt with. I think I ought to just flag the /END/ers as dealt with, and not delete them, though.
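
One way the flagging idea might look, as a rough sketch only: the done key and the abbreviated sample records are assumptions for illustration, not part of the posted script. Jobs that have an END line are marked rather than deleted, and only the unmarked ones are selected and sorted for charging.

```perl
use strict;
use warnings;

# Abbreviated, hypothetical records in the same shape as the accounting file.
my @lines = (
    q{START 'A=root@styx+999' 'j=999' 'D=1083876690'},
    q{END 't=99' 'p=2' 'D=1083876789' 'A=root@styx+999' 'j=999'},
    q{START 'A=root@styx+20' 'j=020' 'D=1'},
    q{START 'A=root@styx+174' 'j=174' 'D=2'},
);

my %jobs;
for (@lines) {
    if (/^START.*'j=0?0?(\d+)'.+'D=(\d+)'/) {
        $jobs{$1}{start_date} = $2;
        $jobs{$1}{done}       = 0;    # assumed flag: not yet charged
    }
    elsif (/^END.*'j=0?0?(\d+)'/) {
        # Flag the job as dealt with instead of deleting it.
        $jobs{$1}{done} = 1 if exists $jobs{$1};
    }
}

# Keep only jobs without an END line, ordered by start date.
my @pending = sort { $jobs{$a}{start_date} <=> $jobs{$b}{start_date} }
              grep { !$jobs{$_}{done} } keys %jobs;

print "Pending: @pending\n";    # prints "Pending: 20 174"
```

Keeping finished jobs in the hash means their start_string and counters stay available if a later step needs them.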

      Thanks for your help. I really appreciate someone poring over my copious amounts of text to help out.

      Louie