Stamp_Guy has asked for the wisdom of the Perl Monks concerning the following question:

Hello,
If you've been reading my other nodes, you probably know that I've been learning how to use DBM and I'm pretty new at this. I currently have a database with the following structure:
CATALOG NUMBER (this is the hash key) | PIPE-DELIMITED DATA | SORT NUMBER
The database is accessed in several ways:

The catalog number structure makes it seemingly impossible to sort by it (please correct me if I'm wrong!). It contains plain numbers like 219 or 614, but also ones with a letter suffix, like 751a or 930c. A suffixed number always comes after its numerical counterpart, so 751a comes after 751. In addition, towards the end, there are numbers with a prefix, like C18 or E9; these always come at the end of the database. I can't do away with the catalog numbering system, but I can't figure out how to sort by it either.

My problem is this: I need to be able to add and delete keys from the database. When I add a key, how do I know what sort number I should use?

I would appreciate any suggestions of ways to fix this problem, or ways to simplify/optimize the way I currently do this. Thanks!

Stamp_Guy
Why is "abbreviated" such a long word?

Note: If further information would help, please /MSG me in the CB.

Replies are listed 'Best First'.
Re: Database Structure/Sorting/Displaying
by danger (Priest) on Jul 04, 2001 at 22:52 UTC

    Well, I'll just address your sorting question: you can sort on the catalogue number by dividing each one into 3 components (leading non-digits, if any; digits; trailing non-digits, if any) and then defining the sort routine to compare those components in that order. Here's an off-the-cuff Schwartzian Transform applied to the data:

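    #!/usr/bin/perl -w
    use strict;

    my @list = qw/715b E9 715a 614 715 C18 1 4 12a 12/;

    # Schwartzian Transform; read the three stages from the bottom up.
    my @sorted =
        map  { $_->[0] }                      # 3. recover the original string
        sort {    $a->[1] cmp $b->[1]         # 2. prefix, then number, then suffix
               || $a->[2] <=> $b->[2]
               || $a->[3] cmp $b->[3] }
        map  { [ $_, /(\D*)(\d+)(\D*)/ ] }    # 1. [original, prefix, digits, suffix]
        @list;

    print join("\n", @sorted), "\n";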

    Knowing how to sort on catalogue numbers should relieve you of the need to use a special sort N field in the data. Having said that, you might also want to look into the DB_File dbm system which has a BTREE mode for keeping entries in sorted order (you can provide a comparison routine), and it has facilities for matching partial keys which can be used to code for retrieving ranges.
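
    A rough sketch of the BTREE idea, assuming DB_File (and the Berkeley DB library underneath it) is installed. The helper names parts and by_catalogue, the sample data, and the in-memory database (undef as the filename) are only for illustration; a real catalogue would tie to a file. The comparison routine repeats the three-component ordering from the sort above, and seq() with R_CURSOR positions the cursor at the first key not less than the one supplied.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use DB_File;

    # Hypothetical helper: split a catalogue number into (prefix, digits, suffix).
    sub parts {
        my @p = $_[0] =~ /^(\D*)(\d+)(\D*)/;
        return @p ? @p : ($_[0], 0, '');    # keys without digits sort by text only
    }

    # Comparison routine handed to the BTREE; it receives two keys.
    sub by_catalogue {
        my @x = parts($_[0]);
        my @y = parts($_[1]);
        return $x[0] cmp $y[0] || $x[1] <=> $y[1] || $x[2] cmp $y[2];
    }

    my $btree = DB_File::BTREEINFO->new;
    $btree->{compare} = \&by_catalogue;

    # undef as the filename asks for an in-memory database (illustration only).
    my %db;
    my $handle = tie %db, 'DB_File', undef, O_RDWR | O_CREAT, 0666, $btree
        or die "Cannot create BTREE: $!";

    $db{$_} = "data for $_" for qw/715b E9 715a 614 715 C18 1 4 12a 12/;

    # Keys come back in the order defined by the comparison routine.
    my @in_order = keys %db;
    print "@in_order\n";

    # Walk the range 614 .. C18 with a cursor instead of loading every key.
    my @range;
    my ($key, $value) = ('614', '');
    for (my $status = $handle->seq($key, $value, R_CURSOR);
         $status == 0;
         $status = $handle->seq($key, $value, R_NEXT)) {
        last if by_catalogue($key, 'C18') > 0;
        push @range, $key;
    }
    print "@range\n";
    ```

    With this layout, adding or deleting a key is just a hash assignment or a delete; the tree keeps the ordering, so no separate sort number has to be maintained.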

      Hmmm. Ok, I didn't realize that was possible (but I shoulda figured... it's Perl :-) ). Here's a question though: if I don't have DB_File, and I want to display a range of numbers, say 614 - C18, how would I be able to do it without the sort field?

        Since you can sort on catalogue number, you can loop over the sorted keys and skip the ones not in the range:

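        foreach my $key (@sorted) {
            # In scalar context ".." is the flip-flop operator: true from the
            # key '614' through the key 'C18'; both must be present in @sorted.
            next unless $key eq '614' .. $key eq 'C18';
            print "$key: $db{$key} \n";
        }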

        This assumes the hash is %db. I suspect you were doing something similar but using the sort N field? The downside is that you need to get the whole list of keys into memory; with DB_File that could be avoided.
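
        One caveat with the ".." loop is that it only switches on when a key exactly equal to the lower endpoint turns up. A minimal sketch of an alternative, assuming the same three-component ordering as the sort: compare every key against both endpoints, so that neither endpoint has to exist as a key. The helpers parts and by_catalogue and the contents of %db are made up for illustration.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: split a catalogue number into (prefix, digits, suffix).
        sub parts {
            my @p = $_[0] =~ /^(\D*)(\d+)(\D*)/;
            return @p ? @p : ($_[0], 0, '');
        }

        # Three-component comparison: prefix, then number, then suffix.
        sub by_catalogue {
            my @x = parts($_[0]);
            my @y = parts($_[1]);
            return $x[0] cmp $y[0] || $x[1] <=> $y[1] || $x[2] cmp $y[2];
        }

        # Stand-in for the real database hash.
        my %db = map { $_ => "data for $_" }
                 qw/715b E9 715a 614 715 C18 1 4 12a 12/;

        # Neither endpoint is an actual key here.
        my ($from, $to) = ('600', 'C20');

        my @in_range =
            grep { by_catalogue($from, $_) <= 0 && by_catalogue($_, $to) <= 0 }
            sort { by_catalogue($a, $b) }
            keys %db;

        print "$_: $db{$_}\n" for @in_range;    # 614, 715, 715a, 715b, C18
        ```

        This still pulls every key into memory, so it is a fix for the endpoint problem only, not for the memory one.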

Re: Database Structure/Sorting/Displaying
by Stamp_Guy (Monk) on Jul 05, 2001 at 20:29 UTC
    The more I think about it, the less sorting by catalog number seems to work, because of yet another variable I forgot to mention: sometimes a group of catalog numbers is stored as a single key, as in 756-765 or 750/751. Where could I find some really good info about using DB_File?