Re: Serialise to binary?
by davido (Cardinal) on Oct 26, 2015 at 01:26 UTC
|
BSON implements BSON - Binary JSON, which is "...a binary-encoded serialization of JSON-like documents. Like JSON, BSON supports the embedding of documents and arrays within other documents and arrays. BSON also contains extensions that allow representation of data types that are not part of the JSON spec. For example, BSON has a Date type and a BinData type."
Sounds like that could be a decent fit, particularly since it is probably more portable than Storable.
| [reply] |
|
I tried BSON, and found that it would throw a lot of warnings on basic structures. In particular, it seems to misidentify some scalars as floats and attempts to pack them as such, causing "Argument isn't numeric in pack" warnings. I also found floats taking wildly inconsistent values after an encode/decode round trip.
Because it's so heavily tied to MongoDB, they don't seem to really care about it being able to encode arbitrary data structures (evidenced by the fact that you have to pass a hash ref; no array ref allowed). They just want to decode their own binary data as they use it in Mongo and re-encode structures set up the same way.
So I wouldn't recommend it.
| [reply] |
|
they don't seem to really care about it being able to encode arbitrary data structures
I think that's an unkind assumption about intent. (N.B. I am the current maintainer.)
But like JSON, BSON is document-oriented, so is not designed to store raw arrays or scalars the way Storable or Sereal will. So in that sense, it might not be the right choice for your needs.
Beyond that, however, the goal of BSON is to handle whatever you can throw at it as well as possible, given the ambiguities of mapping data between a dynamic, largely untyped language like Perl and a typed data format like BSON. Knowing that some Perl scalar is binary data and not an arbitrary string is impossible without some hints from the programmer.
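To illustrate that ambiguity with core Perl only: a scalar's internal flags are essentially all a serializer has to go on, and those flags change as soon as a value has been used numerically. A minimal sketch using the core B module (sv_flags is a hypothetical helper name, not an existing API):

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK);

# Report which public value flags (integer/float/string) a scalar carries.
sub sv_flags {
    my $sv = svref_2object(\$_[0]);   # $_[0] aliases the caller's scalar
    my $f  = $sv->FLAGS;
    my @set;
    push @set, 'IOK' if $f & SVf_IOK;
    push @set, 'NOK' if $f & SVf_NOK;
    push @set, 'POK' if $f & SVf_POK;
    return "@set";
}

my $s = "42";
print sv_flags($s), "\n";   # POK - looks like a string
my $unused = $s + 0;        # numeric use caches an integer value on $s
print sv_flags($s), "\n";   # IOK POK - now also looks like an integer
```

A serializer that sees both IOK and POK has to guess which representation you meant, which is exactly why hints like wrapper types exist.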
The MongoDB::BSON implementation is in XS and has been part of the MongoDB driver distribution. We hope to eventually split it out so that it can be used independently where warranted.
The BSON.pm implementation is pure Perl and was originally developed outside MongoDB (but has since been adopted by the company). There are still some areas where it is not yet as good as MongoDB::BSON.
Even if BSON is not right for this particular problem, if anyone experiences bugs using either implementation, I encourage you to report them or at least email us about them so we can fix them.
-xdg
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
| [reply] |
|
|
Re: Serialise to binary?
by Corion (Patriarch) on Oct 26, 2015 at 09:12 UTC
|
Depending on your data structure, you might have more luck with Sereal.
| [reply] |
Re: Serialise to binary?
by RichardK (Parson) on Oct 26, 2015 at 00:30 UTC
|
You might have to do it yourself using pack, but it does give you 16 & 32 bit ints in both big-endian & little-endian.
#from the docs
n An unsigned short (16-bit) in "network" (big-endian) order.
N An unsigned long (32-bit) in "network" (big-endian) order.
v An unsigned short (16-bit) in "VAX" (little-endian) order.
V An unsigned long (32-bit) in "VAX" (little-endian) order.
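The formats above round-trip cleanly through pack/unpack regardless of the host machine's native byte order, which is what makes them usable for a portable format. A minimal core-Perl sketch:

```perl
use strict;
use warnings;

# Pack a 16-bit and a 32-bit unsigned int in "network" (big-endian) order;
# swapping 'n N' for 'v V' would give the little-endian ("VAX") encoding.
my $packed = pack('n N', 513, 65537);
printf "%d bytes\n", length $packed;          # 6 bytes: 2 + 4

my ($short, $long) = unpack('n N', $packed);
print "$short $long\n";                       # 513 65537
```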
| [reply] [d/l] |
Re: Serialise to binary?
by Laurent_R (Canon) on Oct 26, 2015 at 00:08 UTC
|
Hmm, I am afraid that if you want a binary format that is cross-platform compatible, you'll have to define it yourself; I do not think there is a standard for such a format.
Well, of course, you may find an existing binary format for data exchange between a given platform pair, or even for a few platforms, but I do not think there is any standard one that is really cross-platform in the wider sense. And there are plenty of quite compelling reasons for that, one of them being that there is not one binary format, but several.
Having said that, some network protocols do define low-level binary formats that you might want to use. But I am not convinced that such low-level formats fit your functional/business needs. You did not say enough about your requirements for a definite answer (which I would probably not be able to give anyway; I haven't worked in this area for about 15 years and I don't remember enough about these things).
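As a sketch of what "define it yourself" might look like, here is a minimal self-defined record format: each string is stored as a 32-bit big-endian length followed by its bytes ("N/a*" in pack terms), which is endian-safe across platforms. The freeze_strings/thaw_strings names are hypothetical, not an existing API:

```perl
use strict;
use warnings;

# Serialize a list of byte strings as length-prefixed records.
sub freeze_strings {
    return join '', map { pack 'N/a*', $_ } @_;
}

# Decode the records back into a list of strings.
sub thaw_strings {
    my ($bin) = @_;
    my @out;
    while (length $bin) {
        # N/a* reads a 4-byte count then that many bytes; a* grabs the rest.
        my ($str, $rest) = unpack 'N/a* a*', $bin;
        push @out, $str;
        $bin = $rest;
    }
    return @out;
}

my @orig = ('hello', '', "\x00\x01binary");
my $bin  = freeze_strings(@orig);
my @back = thaw_strings($bin);
print "round trip ok\n" if "@back" eq "@orig";
```

Real formats need more than this (versioning, type tags, nesting), which is Laurent_R's point: you end up designing a format, not just picking one.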
| [reply] |
Re: Serialise to binary?
by Your Mother (Archbishop) on Oct 25, 2015 at 23:39 UTC
|
My binary data is already compressed, so when I compress JSON or Data::Dumper output, it still ends up substantially bigger than storable.
This is highly unlikely. If you serialize without whitespace and compress, and it still ends up substantially bigger than Storable…? This is not my forte, but post your code and I'm sure someone can show you where it's gone sideways.
| [reply] |
|
An example is when there is a huge number of scalars with random contents. Here compressed Storable has 33% overhead, whereas compressed JSON has 70%+ overhead.
use strict;
use warnings;
use Storable;
use IO::Compress::Gzip qw(gzip);
use JSON::XS;

# 100_000 single-byte scalars with random contents.
my @data;
push @data, chr int rand 256 for 1 .. 100_000;

my $serial = Storable::nfreeze(\@data);
my $json   = encode_json(\@data);

my ($gzserial, $gzjson);
gzip \$json   => \$gzjson;
gzip \$serial => \$gzserial;

print scalar(@data), "\n";       # element count
print length($serial), "\n";     # raw Storable
print length($gzserial), "\n";   # gzipped Storable
print length($json), "\n";       # raw JSON
print length($gzjson), "\n";     # gzipped JSON
| [reply] [d/l] |
|
Oh, nice! I was about to argue that one-character scalars rigged the test in favor of Storable (the quotation marks alone being more than 60% of the JSON), but I upped the "word" size and the difference remains at about 30% in favor of Storable. Sidebar: on my box at least, Storable sees *negative* change from zipping: i.e., the zipped Storable is slightly bigger than the raw nstore.
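That negative change is expected: already-random bytes are at maximum entropy, so deflate cannot shrink them and gzip can only add its own header, trailer, and block overhead. A quick core-Perl check of that effect:

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip);

# High-entropy input: 10_000 uniformly random bytes.
my $random = join '', map { chr int rand 256 } 1 .. 10_000;

my $gzipped;
gzip \$random => \$gzipped or die "gzip failed\n";

# The "compressed" copy comes out slightly LARGER than the input.
printf "raw: %d  gzipped: %d\n", length $random, length $gzipped;
```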
| [reply] [d/l] |
Re: Serialise to binary?
by BrowserUk (Patriarch) on Oct 26, 2015 at 15:18 UTC
|
| [reply] |
A reply falls below the community's threshold of quality. You may see it by logging in. |