comment on

Evening!

I'm trying to solve the problem that looks rather standard, but yet I couldn't find trivial solution. So I'm asking your help. Here's the simple DBIx::Class schema:

package MyDatabase;
use base qw/DBIx::Class::Schema/;
__PACKAGE__->load_namespaces;
1;
[download]

package MyDatabase::Result::Artist;
use strict;
use warnings;
use base qw/DBIx::Class::Core/;
__PACKAGE__->table('artist');
__PACKAGE__->add_columns(
    artistid => { data_type => 'integer', is_auto_increment => 1, },
    name     => { data_type => 'text',    is_nullable       => 0, }
);
__PACKAGE__->set_primary_key('artistid');
__PACKAGE__->add_unique_constraint(['name']);
__PACKAGE__->has_many( 'cds' => 'MyDatabase::Result::Cd' );
1;
[download]

package MyDatabase::Result::Cd;
use strict;
use warnings;
use base qw/DBIx::Class::Core/;
__PACKAGE__->table('cd');
__PACKAGE__->add_columns(
    cdid   => { data_type => 'integer', is_auto_increment => 1, },
    artist => { data_type => 'integer', is_nullable       => 0, },
    name   => { data_type => 'text',    is_nullable       => 0, },
);
__PACKAGE__->set_primary_key('cdid');
__PACKAGE__->belongs_to( 'artist' => 'MyDatabase::Result::Artist' );
1;
[download]

In my case the "artist" table is relatively small, so I can keep it in memory without any problems, records in this table are nether deleted or changed, but new records may be added. I need to insert a lot of records into "cd" table. I'm trying to find method of doing it both elegant and effective, but so far I failed. The problem is that I can't make DBIx::Class to cache "artist" table in memory, so every time I insert new record into "cd" it performs select from "artists". Here's the script that demonstrates the problem:

use strict;
use warnings;
use Benchmark;
use MyDatabase;

my $NUM = 10000;
my $schema = MyDatabase->connect( "dbi:SQLite:dbname=cd.sqlite", '', '
+', 
    { AutoCommit => 0 } );

my @artists = map { "Artist$_" } 1 .. 20;

# recreate database
sub resetdb {
    $schema->deploy( { add_drop_table => 1 } );
    $schema->resultset('Artist')->populate( [ map { { name => $_ } } @
+artists ] );
}

#######
# This looks good and correct, but slow
sub test1 {
    resetdb();
    my $rs_cd = $schema->resultset('Cd');
    for ( 1 .. $NUM ) {
        $rs_cd->create(
            {
                artist => { name => $artists[ $_ % @artists ] },
                name   => "Cd$_",
            }
        );
    }
}

#######
# This one I found in Catalyst Advent Calendar: 
# http://www.catalystframework.org/calendar/2009/17
# Looks good, but the same problem -- it's slow
sub populate {
    resetdb();
    my @data = map { { artist => { name => $artists[ $_ % @artists ] }
+, name => "Cd$_" } } 1 .. $NUM;
    $schema->resultset('Cd')->populate( \@data );
}

#######
# This one works fast, but looks not so good, especially taking into a
+ccout
# that I have more than one related table. Also it doesn't insert new 
+artists.
sub test2 {
    resetdb();
    my $rs_cd = $schema->resultset('Cd');

    # save artists table into hash
    my %artists = map { ( $_->name, $_->artistid ) } $schema->resultse
+t('Artist')->all;
    for ( 1 .. $NUM ) {
        $rs_cd->create(
            {
                artist => $artists{ $artists[ $_ % @artists ] },
                name   => "Cd$_",
            }
        );
    }
}

#######
# This one works fast and correct, but looks awfull.
sub test3 {
    resetdb();
    my $rs_cd = $schema->resultset('Cd');

    # save artists table into hash
    my %artists = map { ( $_->name, $_->artistid ) } $schema->resultse
+t('Artist')->all;
    for ( 1 .. $NUM ) {
        my $aname = $artists[ $_ % @artists ];

        # create record for artist if it doesn't exists
        unless ( $artists{$aname} ) {
            $artists{$aname} = $schema->resultset('Artist')->create( {
+ name => $aname } )->artistid;
        }
        $rs_cd->create(
            {
                artist => $artists{$aname},
                name   => "Cd$_",
            }
        );
    }
}

timethese(
    1,
    {
        'test1'    => \&test1,
        'populate' => \&populate,
        'test2'    => \&test2,
        'test3'    => \&test3,
    }
);
__END__
Benchmark: timing 1 iterations of populate, test1, test2, test3...
  populate: 21 wallclock secs (20.38 usr +  0.05 sys = 20.43 CPU) @  0
+.05/s (n=1)
            (warning: too few iterations for a reliable count)
     test1: 20 wallclock secs (20.44 usr +  0.02 sys = 20.46 CPU) @  0
+.05/s (n=1)
            (warning: too few iterations for a reliable count)
     test2:  6 wallclock secs ( 5.39 usr +  0.01 sys =  5.40 CPU) @  0
+.19/s (n=1)
            (warning: too few iterations for a reliable count)
     test3:  5 wallclock secs ( 5.43 usr +  0.01 sys =  5.44 CPU) @  0
+.18/s (n=1)
            (warning: too few iterations for a reliable count)
[download]

Did I missed something simple? Do you have any ideas how to solve this problem?

In reply to DBIx::Class. Cache table in memory by zwon

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.