Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Can someone link me to a place that shows examples of different ways to sort a DB_BTREE? I want things to print out in the order you put them in, and the docs only show one example of something totally different.

Re: changing the printing order of a db
by dga (Hermit) on Jul 18, 2003 at 22:47 UTC

    Your question is rather vague.

    Something in the data has to say how to sort it. To get records back in the order they were put in, store a sequentially incrementing value in the record (or in the key) and sort on that later. However, if all you want is to print the data in the order it was originally stored, a flat file may be more appropriate: append each record to the end, and reading the file returns them in insertion order with no database overhead at all. A database's access methods are optimized for retrieving particular records quickly, not for putting out an entire data set in a predefined order.
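The flat-file suggestion above can be sketched in a few lines. The file name `records.txt` and the helper `add_record` are hypothetical names picked for this illustration, not anything from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file name, chosen only for this sketch.
my $file = 'records.txt';
unlink $file;    # start clean so reruns don't accumulate records

# Append one record per line; new records always land at the end.
sub add_record {
    my ($record) = @_;
    open my $out, '>>', $file or die "Cannot append to $file: $!";
    print {$out} "$record\n";
    close $out or die "Cannot close $file: $!";
}

add_record($_) for 'zebra', 'apple', 'mango';

# Reading top to bottom returns the records in insertion order.
open my $in, '<', $file or die "Cannot read $file: $!";
chomp(my @records = <$in>);
close $in;

print "$_\n" for @records;
```

The records come back as they went in, unsorted, which is all the original question asked for.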

    If you use DB_File, there is a DB_RECNO storage method for things with a predefined (arbitrary) order that you fully control. If you are sorting by a particular algorithm, DB_File can instead use a DB_BTREE to store your records; that gives quick access to individual records and also returns the records in your user-defined sort order when you want all or part of the data set.
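As a rough sketch of the DB_RECNO route, the following ties an array to a record-number database and reads it back by index. The file name `records.db` is a hypothetical choice for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

# Hypothetical database file name, used only for this sketch.
my $file = 'records.db';
unlink $file;    # start with an empty database on every run

# DB_RECNO is tied to an array; the record number is the array index.
tie my @records, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "Cannot tie $file: $!";

push @records, 'zebra', 'apple', 'mango';

# Walking the indexes returns the records in the order they were pushed.
my @in_order = map { $records[$_] } 0 .. $#records;
print "$_: $in_order[$_]\n" for 0 .. $#in_order;

untie @records;
```

Because the position is the key, nothing has to be added to the data to remember the insertion order.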

    A possible answer to your question is that DB_BTREE will sort records in a user-defined order based on an algorithm. So you may only need to include, as part of the key, something the sorting algorithm can use to decide which records go where (the sequentially incrementing value mentioned before may work).

    sub Compare {
        my ($k1, $k2) = @_;
        $k1 cmp $k2;
    }
    ...
    $DB_BTREE->{'compare'} = \&Compare;

    Now DB_BTREE will use Compare to sort the records for storage and retrieval. You can put whatever checks you want into Compare to sort however you want, but it is only ever handed two keys, so it can only order on what is in the keys. One catch: if a sequence number is part of the key, you have to know it to get the record back, unless you search for the first key >= the value you supply. Note also that the compare sub cannot look records up in the tied hash: a new key is not in the database yet when it is compared, and the callback runs in the middle of a database operation. So the ordering value goes into the key, as in this example.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DB_File;

    # Order on the number in front of ' , ' in each key.
    $DB_BTREE->{'compare'} = sub {
        my ($k1, $k2) = @_;
        my ($n1) = split(' , ', $k1);
        my ($n2) = split(' , ', $k2);
        $n1 <=> $n2;
    };

    my %hash;
    tie %hash, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot tie: $!";

    # stored out of order on purpose
    $hash{'3 , key3'} = 'Third data';
    $hash{'1 , key1'} = 'First data';
    $hash{'2 , key2'} = 'Second data';

    print "A random record (#2): $hash{'2 , key2'}\n";

    # notice no sort on the next line as the records come back in order.
    foreach my $k ( keys %hash ) {
        my ($n, $name) = split(' , ', $k);
        print "$name: $n: $hash{$k}\n";
    }

    The compare sub has to be set before the tie, because DB_File reads it when the database is opened; changing $DB_BTREE->{'compare'} afterwards has no effect on a hash that is already tied. This uses undef in the spot for the filename, giving an in-memory-only copy of the database which lives only for the duration of the script.
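The "search for the next key >= the value I supply" lookup mentioned above can be sketched with DB_File's `seq` method and the `R_CURSOR` flag. The key layout here (a zero-padded sequence number, a colon, then a name) is an assumption made for the illustration; it relies on the default lexical comparison rather than a custom one.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

# Hypothetical key layout for this sketch: "<6-digit sequence>:<name>".
# Zero padding makes the default lexical order match numeric order.
my %hash;
my $db = tie %hash, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie: $!";

my $seq = 0;
for my $name (qw(zebra apple mango)) {
    my $key = sprintf '%06d:%s', ++$seq, $name;
    $hash{$key} = "data for $name";
}

my @order = keys %hash;    # insertion order, thanks to the sequence prefix

# Find record 2 without knowing the name half of its key. R_CURSOR
# positions on the smallest key >= the one supplied and overwrites
# $key and $value with what it found.
my ($key, $value) = (sprintf('%06d:', 2), '');
my $status = $db->seq($key, $value, R_CURSOR);
print "$key => $value\n" if $status == 0;

undef $db;
untie %hash;
```

Since `R_CURSOR` returns the first key at or after the probe, it is worth checking that the key it hands back still starts with the prefix you asked for.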

Re: changing the printing order of a db
by jsprat (Curate) on Jul 19, 2003 at 02:13 UTC
    The default order of $DB_BTREE is lexical. If you use a date/time stamp for the key, keys will return them in sorted order automatically and you won't need a custom sort order, e.g.:

    $tied_hash{20030718190355} = $value;
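One consequence of lexical ordering is that keys only sort chronologically (or numerically) when they all have the same width, which a 14-digit time stamp does. A small sketch with made-up numeric keys shows the difference padding makes:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

# Two in-memory BTREEs (undef filename) using the default comparison.
tie my %plain,  'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie: $!";
tie my %padded, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie: $!";

for my $n (1 .. 12) {
    $plain{$n}                  = "record $n";
    $padded{sprintf '%02d', $n} = "record $n";
}

my @plain_order  = keys %plain;     # string order: 1, 10, 11, 12, 2, ...
my @padded_order = keys %padded;    # 01, 02, ..., 12

print "plain:  @plain_order\n";
print "padded: @padded_order\n";
```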

    Another approach using $DB_RECNO can be found at Re: Re: Re: Re: Re: Re: DB_File, not saving.

    HTH...

      Ok, I tried your idea and am using $hash{localtime()} = $info, but it prints the data in a miscellaneous order. Any other possibilities?
        localtime in scalar context returns a string something like "Fri Jul 18 23:35:45 2003", which sorts by day name and month name rather than chronologically. Build the key like "YYYYMMDDHHMMSS" and the keys will be kept in sorted order on disk.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DB_File;

        unlink 'db';    # start with an empty database if this is run more than once
        tie my %hash, 'DB_File', 'db', O_CREAT|O_RDWR, 0644, $DB_BTREE
            or die "Cannot open db: $!";

        print "Building \%hash...\n";
        $|++;
        for my $count (1 .. 50) {
            my @time = localtime;
            $time[4]++;                   # localtime months are 0-based
            my $key = $time[5] + 1900;    # four-digit year
            # append month, day, hour, minute, second as two digits each
            $key .= sprintf "%02d", $time[$_] for reverse 0 .. 4;
            print "$count... " if $count % 10 == 0;
            $hash{$key} = "Message number $count";
            sleep 1;    # one record per second keeps the keys unique
        }

        print "Output:\n";
        print "$_: $hash{$_}\n" for keys %hash;

        Check the output and you'll see what I mean. The keys come out lexically sorted, and the messages run from number 1 to number 50.

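        Remember that $DB_BTREE can hold duplicate keys if the R_DUP flag is set; without it, a second value stored under the same time stamp replaces the first. Either way, two records in the same second are a problem: with duplicates enabled there's no guarantee which way they will be retrieved (which is why I suggested $DB_RECNO).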
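One possible way around same-second collisions, if you want to stay with time-stamp keys, is to append a counter to the stamp. This is a sketch of that idea rather than anything proposed in the thread: the `next_key` helper and the `-NNNNNN` suffix are hypothetical, it uses `gmtime` so a daylight-saving change can't repeat an hour, and the counter is only unique within a single run of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Build a key of the form YYYYMMDDHHMMSS-NNNNNN. The counter is a
# hypothetical tie-breaker added for this sketch; it keeps keys unique
# and ordered when several records arrive within the same second.
my $counter = 0;
sub next_key {
    my ($epoch) = @_;
    return sprintf '%s-%06d',
        strftime('%Y%m%d%H%M%S', gmtime($epoch)), ++$counter;
}

# Three records stamped in the same second, then one a second later.
my $now  = time;
my @keys = map { next_key($_) } $now, $now, $now, $now + 1;

# A plain string sort stands in for the BTREE's default lexical order.
my @sorted = sort @keys;
print "$_\n" for @sorted;
```

The sorted list matches the order the keys were generated in, so storing records under such keys would keep insertion order without needing duplicates.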

        HTH