jeteve has asked for the wisdom of the Perl Monks concerning the following question:

Hi Fellow monks,

I need to release a module that will make use of binary data (a dump of a Text::Scan dictionary, to be precise).

I'm wondering what the best way is to package data files with a CPAN module. The data files could be anything, from images to heavy precomputed data.

A great thing would be to be able to do that:

I don't know whether that kind of mechanism could be implemented or whether it already exists.

Thanks for your thoughts and help.

Jerome


Re: Distributing binary data along with a CPAN module
by andreas1234567 (Vicar) on Feb 21, 2008 at 14:09 UTC
    I maintain a module whose test suite contains binary data. I put the data in a separate data directory and list each file in the MANIFEST file. No trouble at all.

    However, I had trouble with endianness. Your module should be endian-aware if it is to interpret the binary data. See Numbers endianness and Width in perlport.

    Note that before perl 5.9.2 there was apparently no direct way to specify an explicit byte order (endianness) for numbers other than 16-bit and 32-bit integers (see the n, N, v and V templates for unpack). Of course, there are ways around this, e.g. reversing the input to unpack depending on the host's endianness.
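The workaround mentioned above could look roughly like the following. This is only a sketch: the helper name unpack_be_double is made up for illustration, and it assumes the host is either purely little-endian or purely big-endian (no mixed-endian platforms).

```perl
use strict;
use warnings;

# The n/N/v/V templates fix the byte order for 16- and 32-bit integers.
my $value  = 0x01020304;
my $big    = pack 'N', $value;    # 32-bit, big-endian ("network") order
my $little = pack 'V', $value;    # 32-bit, little-endian ("VAX") order

print unpack('H*', $big), "\n";       # prints 01020304
print unpack('H*', $little), "\n";    # prints 04030201

# Hypothetical helper: decode a double that was stored big-endian,
# without relying on the '<' and '>' modifiers of newer perls.
sub unpack_be_double {
    my ($bytes) = @_;

    # Detect the host byte order by comparing a native short with
    # an explicitly little-endian one.
    my $host_is_little = pack('S', 1) eq pack('v', 1);

    # On a little-endian host, flip the bytes into native order first.
    $bytes = reverse $bytes if $host_is_little;

    return unpack 'd', $bytes;
}
```

On perl 5.10 and later the same result should be obtainable directly with unpack 'd>', so the helper only matters if older perls have to be supported.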

    --
    Andreas
Re: Distributing binary data along with a CPAN module
by stiller (Friar) on Feb 21, 2008 at 15:03 UTC
    Also, you mention big data. You should read Randal's article, The Big Modules in the Mini-CPAN.

    Edit: actually, you didn't say big, you said heavy precomputed, which is another thing completely. Anyway, it's a useful read...

Re: Distributing binary data along with a CPAN module
by adamk (Chaplain) on Feb 22, 2008 at 02:34 UTC
    For binary/precomputed test data, just chuck it in t/data.

    For binary/precomputed run-time module data, you can use a matched pair of tools: Module::Install's install_share command to install the files with your module, and File::ShareDir to locate them after installation.

    See CPAN for more details.
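As a rough sketch of the run-time half of that pairing: the distribution name My-Dictionary, the file name dictionary.bin and the helper dictionary_path are all hypothetical, and the Makefile.PL line appears only as a comment.

```perl
use strict;
use warnings;

# In Makefile.PL (Module::Install), the share/ directory of the
# distribution would be installed with a line such as:
#     install_share 'share';

# Hypothetical helper: return the installed path of a shared data file.
sub dictionary_path {
    my ($dist, $file) = @_;

    # File::ShareDir is not a core module, so load it on demand.
    require File::ShareDir;

    # dist_file() dies if the distribution or the file cannot be found.
    return File::ShareDir::dist_file($dist, $file);
}

# 'My-Dictionary' and 'dictionary.bin' are placeholder names.
my $path = eval { dictionary_path('My-Dictionary', 'dictionary.bin') };

if (defined $path) {
    print "data file installed at $path\n";
}
else {
    print "data file not available: $@";
}
```

Wrapping the lookup in eval lets the module report a missing data file cleanly instead of dying at load time.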

      Or, if you don't like Module::Install (or Module::Build), you can embed binary data in the __DATA__ section of a file, as I've done in lib/Number/Phone/UK/Data.pm in my Number::Phone module. That file is a tiny piece of Perl code which exposes the data that follows it as a hash, using DBM::Deep.

      Just make sure you're careful when editing that file! In fact, it's best if you don't edit it. Auto-generate the header along with the data.
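A much simpler illustration of the __DATA__ idea, and not what Number::Phone does: here the payload is base64-encoded with the core MIME::Base64 module rather than stored as raw bytes behind DBM::Deep, which keeps the file safe to open in an editor. The helper name load_embedded and the payload itself are invented for the example.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: slurp everything after __DATA__ and turn it
# back into raw bytes.
sub load_embedded {
    my $encoded = do { local $/; <DATA> };
    return decode_base64($encoded);
}

my $bytes = load_embedded();

# The sample payload holds three big-endian 32-bit integers.
my @numbers = unpack 'N*', $bytes;
print "@numbers\n";    # prints 1 2 3

__DATA__
AAAAAQAAAAIAAAAD
```

Base64 makes the payload about a third larger, so for a large dictionary the raw-bytes approach (with binmode on the DATA handle) or a separate shared file is probably the better fit.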