in reply to Re^3: RFC: Is this the correct use of Unicode::Collate?
in thread RFC: Is this the correct use of Unicode::Collate?

Jim,

Thank you for your input. You seem to know quite a bit about Unicode.

What I tried to ask in the original post was why 'use Unicode::Collate;' changed the meaning of characters 0..31. Everything I have read talked about not changing the meaning of 7-bit ASCII.
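To make the question concrete, here is a minimal sketch comparing two keys that differ only by a trailing control character. It assumes the default collator uses the table bundled with Unicode::Collate, and that this table, as far as I can tell, gives most C0 control characters such as U+0001 no collation weight. If that assumption holds, the collator reports the two keys as equal even though a plain `cmp` does not. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::Collate;

# Two keys that differ only by a trailing control character (U+0001).
my $plain  = "Hello";
my $tagged = "Hello\x{01}";

# The built-in cmp compares code points, so the two keys are distinct.
my $bytewise = $plain cmp $tagged;

# Default collator. Assumption: its bundled table assigns U+0001 no
# weight, in which case this comparison treats the keys as equal.
my $collator = Unicode::Collate->new;
my $collated = $collator->cmp($plain, $tagged);

printf "cmp: %d, Unicode::Collate cmp: %d\n", $bytewise, $collated;
```

The collator does not change the characters themselves. It only decides which characters carry weight when ordering strings, which is why code that relies on control characters to tell keys apart can stop working once collation is used for comparison.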

History of the question:

I don't know if you are familiar with the NoSQL database engine BerkeleyDB (now owned by Oracle), but I have written a pure-Perl replacement that performs as well. In some cases where the data portion of the key/value pair is very large, it outperforms BerkeleyDB.

Most people on this forum believe that BerkeleyDB is free. Oracle has added some conditions that make it very expensive (according to our law firm's counsel). One example: if a company employee downloads BerkeleyDB and installs it, that's okay. But if I, as a software vendor, download it and install it, the company owes Oracle a fee based on the number of cores and the type of box. For a POWER7 IBM p-series with 32 cores, the license fee is $48,000 for the "free" BerkeleyDB.

Most of our products sell for under $5,000. It is hard to ask a company to pay an additional $48K.

Since PurePerlDB already exists, I was looking at adding a feature to use Unicode::Collate, but it broke other features of PurePerlDB. Unfortunately, my only solution for now is to put the burden on the software developer to handle Unicode and duplicates, which is the same as with BerkeleyDB.

Thanks again for your input...Ed

"Well done is better than well said." - Benjamin Franklin

Re^5: RFC: Is this the correct use of Unicode::Collate?
by Anonymous Monk on Jun 24, 2012 at 11:29 UTC

    Most people on this forum believe that BerkeleyDB is free. Oracle has added some conditions that make it very expensive (according to our law firm's counsel). One example: if a company employee downloads BerkeleyDB and installs it, that's okay. But if I, as a software vendor, download it and install it, the company owes Oracle a fee based on the number of cores and the type of box. For a POWER7 IBM p-series with 32 cores, the license fee is $48,000 for the "free" BerkeleyDB.

    Just in case anyone was wondering about it, see my take on it in Open Source License for Berkeley DB unchanged

    The situation hasn't changed with the latest Berkeley DB release, db-5.3.21. The license is essentially the same, though ASM for Java has been added (this only affects the Java bits and doesn't affect distribution or pricing).

    But I'm not a businessman or a lawyer, nor do I work for Oracle.


    Regarding http://www.flexbasedb.com/, I notice you provide only PDF and no HTML, which is a minor hassle.

    For anyone interested in PurePerlDB/FlexBaseDB, this example is from http://www.flexbasedb.com/FlexBaseDB_Introduction.pdf:

        use strict;
        use warnings;
        use FlexBaseDB;

        my $dirname = '/home/FlexBaseDB';
        unlink glob("$dirname/*");

        my $fbenv = FB_OpenENV ( EnvHome => $dirname );  ## Directory for database(s)
        if ( ! $fbenv ) { die "FB_OpenENV: Bad ENV\n"; }

        my $filename = "TestDB";                  ## Test file name in Environment!
        my $fb = FB_OpenDB ( FB_Name => $filename,   ## Unique name of database
                             FB_ENV  => $fbenv,      ## reference from FB_OpenENV
                           );
        if ( ! $fb ) { die "FB_OpenDB: Bad FILE\n"; }

        my $key  = "Hello";
        my $data = "World, we're here!";
        my $ret;
        for my $count ( 1..5 ) {
            $ret = FB_Write( $fb,\"$key-$count",\$data );
            if ( $ret==FALSE ) { die "Write failed $FB_Error \n"; }
        }

        if ( FB_Seek( $fb,\$key, FB_FIRST ) ) {
            print "\nOutput:\n\n";
            while( $ret ) {
                $key = ""; $data = "";
                $ret = FB_ReadNext( $fb,\$key,\$data );
                print "$key\t$data\n";
            }
        }
        print "\n","=" x 54, "\n";

        ## Will print statistics for your DB
        my @results = FB_Stat ( $fb );
        for my $no ( 0 .. $#results ) {
            if ( substr($results[$no],0,1) eq "=" ) { $results[$no] = "=" x 54; }
            print "$results[$no]\n";
        }
        print "=" x 54, "\n";

        $ret = FB_CloseDB( $fb );
        $ret = FB_CloseENV( $fbenv );

        __END__
        Output:

        Hello-1    World, we're here!
        Hello-2    World, we're here!
        Hello-3    World, we're here!
        Hello-4    World, we're here!
        Hello-5    World, we're here!

      Dear Monk,

      I am not a lawyer; however, if you do a web search on "The Sneaky Sleepycat License" you will find many legal opinions. Whether you are right or they are, I'm not the one to ask!

      Most of our clients have IBM *ix systems, and we have to be concerned about the legal use or misuse of our own or others' software. This isn't just about my opinion!

      YMMV!

      "Well done is better than well said." - Benjamin Franklin

Re^5: RFC: Is this the correct use of Unicode::Collate?
by Jim (Curate) on Jun 24, 2012 at 18:08 UTC
    I don't know if you are familiar with the NoSQL database engine BerkeleyDB (now owned by Oracle), but I have written a pure-Perl replacement that performs as well. In some cases where the data portion of the key/value pair is very large, it outperforms BerkeleyDB.

    I'm familiar with NoSQL and key-value stores such as Berkeley DB. But what I'd never heard of before reading your PerlMonks post is the idiom—the trick—of modifying data to disambiguate otherwise identical keys by appending control codes or invisible characters to them. This idiom seems "weirdo" to me, just as it did to Tom, who first invoked the word to describe it.

    Is my example Perl script a fair representation of the idiom your NoSQL database software uses to disambiguate like keys?

    I'm not a database theory guru or a database programming wizard, but my gut sense is that the idiom you describe of ornamenting data with invisible control codes or other characters is fraught with problems. I understand how data modified this way would ensure uniqueness and preserve insertion order. But how then do you match such modified strings? Isn't there a better way to achieve the same objectives without altering data? Do other NoSQL database engines besides yours use this same idiom? If so, which ones?
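To illustrate what I mean by achieving the same objectives without altering data, here is a minimal in-memory sketch in which each key maps to a list of values, so duplicates keep their insertion order and the key itself is never modified. The names `%store`, `put_dup`, and `get_all` are hypothetical; this is not how FlexBaseDB or Berkeley DB works internally, only one possible shape of the idea.

```perl
use strict;
use warnings;

# Hypothetical in-memory store: each key maps to an array of values,
# so duplicates are kept in insertion order and the key stays unchanged.
my %store;

# Append a value under a key; returns how many values the key now holds.
sub put_dup {
    my ($key, $value) = @_;
    push @{ $store{$key} }, $value;
    return scalar @{ $store{$key} };
}

# Return every value stored under a key, oldest first.
sub get_all {
    my ($key) = @_;
    return @{ $store{$key} || [] };
}

put_dup('Hello', 'first');
put_dup('Hello', 'second');
put_dup('World', 'third');

my @hello = get_all('Hello');
print join(', ', @hello), "\n";   # prints "first, second"
```

Because the key is stored exactly as supplied, matching it later needs no knowledge of any appended characters. An on-disk engine would presumably need an equivalent, such as a sequence number kept beside the key rather than inside it.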

    Jim