cpan module with binary data: when to build?

thpfft has asked for the wisdom of the Perl Monks concerning the following question:

quick question: my new Geo::Postcode - name currently subject of debate - comes with sample data in the form of a SQLite file. I can't just distribute the data file, because of binary incompatibilities (endianness, I suppose), so the data is distributed in a csv file and copied into SQLite during installation.

My question: what's the best way to do that, and at what stage?

At the moment I've got the data-writing code in Makefile.PL, wrapped in an eval so that if it fails the process will continue, MakeMaker will notice the missing module and CPAN.pm will go off and install the requirements.

Doesn't really work, though. Having gone away to install the requirements, CPAN will not run the Makefile.PL again, so the data file never gets written.

And anyway, as David Cantrell has pointed out, the correct place to write the database file would be between make and make test. I have no idea how to do that. I could just hide it in the first test, of course, but do people always run make test? And anyway it would be underhanded and wrong. Bah.

any help - or good examples to copy - would be much appreciated.

thanks,
will

here's the current Makefile.PL, by the way:

#!/usr/bin/perl -w                                         # -*- perl -*-

use strict;
use lib qw( ./lib );
use ExtUtils::MakeMaker;
$|++;

my $csvdata = './useful/postcodes.csv';
my $datafile = './lib/Geo/Postcode/postcodes.db';
my $tablename = 'postcodes';

create_sqlite_file() unless -e $datafile;

WriteMakefile(
    NAME => 'Geo::Postcode',
    VERSION_FROM => 'lib/Geo/Postcode.pm',
    PREREQ_PM => { 'DBD::SQLite' => 0 },
    ($] >= 5.005 ?
        (ABSTRACT_FROM => 'lib/Geo/Postcode.pm', AUTHOR => 'william ross <wross@cpan.org>') : ()
    ),
    clean => { 'FILES' => $datafile },
);

sub create_sqlite_file {
    print "Creating sample data file.\n";
    my $dbh;
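    eval {
        # require, not use: use runs at compile time, before the eval can trap a missing DBI
        require DBI;
        $dbh = DBI->connect("dbi:SQLite:dbname=$datafile","","")
            or die $DBI::errstr;
    };
    if ($@) {
        print "Connection failed. Is DBD::SQLite installed? MakeMaker will tell us:\n";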
        return;
    }
    
    open( INPUT, $csvdata) || die("can't open file $csvdata: $!");
    
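    # the first line of the csv holds the column names; chomp it so the last name has no trailing newline
    chomp( my $header = <INPUT> );
    my @cols = split(',', $header);
    my $columns = join(', ', map { "$_ varchar(255)" } grep { $_ ne 'postcode' } @cols);
    $dbh->do("create table $tablename (postcode varchar(12) primary key, $columns);");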
    
    my $counter = 0;
    my $insert = "INSERT INTO $tablename( " . join(',',@cols) . " ) values ( " . join(',', map { '?' } @cols) . ")";
    my $sth = $dbh->prepare($insert);
    while (<INPUT>) {
        chomp;
        my @data = split(/,/);
        $sth->execute( @data );
        $counter++;
        print "." unless $counter % 40;
    }
    $sth->finish;
    $dbh->disconnect;
    print "done.\n$counter points imported into sample data set.\n\n";
}

Replies are listed 'Best First'.
Re: cpan module with binary data: when to build? by Zaxo (Archbishop) on Sep 02, 2004 at 16:01 UTC
You are running the example db copy operation in the `perl Makefile.PL` stage of building, which is too soon. The solution is to make the copy a target in the `make` stage. PREREQ_PM having hauled in DBD::SQLite, the copy should work then. According to the ExtUtils::MakeMaker pod, you can add a target to the default build target with the lowercase parameter,

depend => { all => 'example_data' }

and define the target with a &MY::postamble, which I think should look something like this,

sub MY::postamble {
    return << "DODB";
example_data:
\tperl -e'\\
\t  your;\\
\t  code;\\
\t  here;'
DODB
}

That's untested, but something like it should work.

After Compline,
Zaxo
Re^2: cpan module with binary data: when to build? (working) by thpfft (Chaplain) on Sep 02, 2004 at 20:45 UTC
update: lots of whining snipped out here.

Thank you. And Smylers. After quite a lot of headbanging, and with your help, I've got something that seems to work.

I can't use this:

depend => { all => 'example_data' }

because it creates a single-colon rule for 'all' and there is already a double-colon rule for 'all' (and for 'test', and 'install'). The answer is to intercept the double-colon rule and prepend the extra dependency. It's ugly, but it works. The same technique is used in the Template Toolkit (ie I nicked it from there), and if it's good enough for TT...

This is what I've added to the usual Makefile.PL:

package MY;

sub postamble {
    return <<"EOF";
data:
\tperl ./useful/makedb.pl
EOF
}

sub test {
    my $class = shift;
    my $makefragment = $class->SUPER::test(@_);
    $makefragment =~ s/^(test ::)/$1 data/m;
    return $makefragment;
}

sub install {
    my $class = shift;
    my $makefragment = $class->SUPER::install(@_);
    $makefragment =~ s/^(install ::)/$1 data/m;
    return $makefragment;
}

The 'data' target just invokes an external script, as you can see. Cleaner that way, and easier for people to edit.

Thanks for your help.
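
For anyone wanting to see the substitution in isolation, here is a minimal sketch of the "prepend a dependency to a double-colon rule" trick, run against a made-up makefile fragment. The helper name `add_dependency` and the sample fragment are hypothetical, not taken from MakeMaker or the Template Toolkit, and real MakeMaker output varies between versions. It also allows any whitespace before the `::`, which is slightly looser than the literal `test ::` pattern above.

```perl
use strict;
use warnings;

# Prepend an extra prerequisite to a double-colon rule in a makefile fragment.
# $fragment is makefile text, $target the rule name, $dep the prerequisite to add.
# (Hypothetical helper, for illustration only.)
sub add_dependency {
    my ($fragment, $target, $dep) = @_;
    # /m lets ^ match at the start of any line, not only at the start of the string
    $fragment =~ s/^(\Q$target\E\s*::)/$1 $dep/m;
    return $fragment;
}

# Made-up fragment; the single-quoted heredoc keeps $(...) from being interpolated.
my $sample = <<'EOF';
TEST_VERBOSE = 0

test :: pure_all
	$(NOECHO) $(NOOP)
EOF

print add_dependency($sample, 'test', 'data');
```

If the target is not present in the fragment, the text comes back unchanged, so it is probably worth checking the generated Makefile by eye after a MakeMaker upgrade.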
Re: cpan module with binary data: when to build? by Smylers (Pilgrim) on Sep 02, 2004 at 15:59 UTC
Create a brand new makefile target, `make database` say. Have both `make test` and `make install` depend on `make database`. That way a mere `make` does not try to create the DB, but doing `make test` will (and if somebody skips testing then the `make install` will do it instead).

Smylers
Re: cpan module with binary data: when to build? by DrHyde (Prior) on Sep 02, 2004 at 15:16 UTC
I should clarify what I meant by "between make and make test" as you lovely monks don't have the context from the email discussion that thpfft and I had. To build the database requires non-standard modules (DBD::SQLite and therefore also DBI). Ideally, these would be mentioned in the Makefile.PL as dependencies so that when you `make` the module they are automagically installed. Then once they are installed, they can be used to build the database. The db needs to be built before the tests are run, as without it there will be test failures and the module will fail to install.
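
Since the database step depends on modules that may not be there yet, whatever script builds it could check for them at run time and bail out cleanly. A minimal sketch, assuming a hypothetical helper called `module_available`; it uses `require` inside an `eval`, which is trapped at run time (a `use` inside an `eval` block is not, because `use` happens at compile time).

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
# (Hypothetical helper, for illustration only.)
sub module_available {
    my ($module) = @_;
    # Turn Foo::Bar into Foo/Bar.pm so require gets a file path rather than a bareword
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on the two modules the database build needs.
for my $module (qw(DBI DBD::SQLite)) {
    printf "%-12s %s\n", $module, module_available($module) ? 'found' : 'missing';
}
```

A build script might call this first and skip the import with a message if either module is missing, leaving PREREQ_PM to get them installed.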