
A development version of the CPAN-SQLite distribution is making its way around CPAN. It is a set of modules for setting up, maintaining, and searching a SQLite database of the information in the CPAN indices on authors, modules, and distributions. Andreas has added experimental support for it in the latest development version (1.88_65) of CPAN.pm, and I'm especially interested in hearing about others' experiences with it.

CPAN.pm gets its information on CPAN authors, modules, and distributions from the CPAN indices, and currently loads all of it into memory. As more and more packages are added to CPAN, this memory footprint can become large. CPAN::SQLite lets CPAN.pm instead get the information it needs for a given client request through a query to a SQLite database. That information is then loaded into memory so that it stays easily accessible within the same session, which means that only the information a user has already requested is held in memory. This can be a significant saving in memory usage: I've seen reductions from 60 MB to 20 MB on some systems I've tried this on after a few random queries. However, some queries need essentially all of the available information loaded into memory; the cpan> r call within the CPAN.pm shell, which lists all recommended updates, is one example. If such a query is made, the memory footprint with and without CPAN-SQLite is comparable.
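The load-on-demand idea described above can be sketched in a few lines. This is only an illustration written in Python with its standard sqlite3 module, not CPAN::SQLite's actual code: the `mods` table, its columns, the `LazyIndex` class, and the `Example::*` records are all made up for the sketch.

```python
import sqlite3


class LazyIndex:
    """Fetch module records from SQLite on demand and cache them per session."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}  # holds only the records requested so far

    def lookup(self, name):
        # Query the database only on a cache miss; misses are not cached.
        if name not in self.cache:
            row = self.conn.execute(
                "SELECT name, version, dist FROM mods WHERE name = ?", (name,)
            ).fetchone()
            if row is None:
                return None
            self.cache[name] = row
        return self.cache[name]

    def load_all(self):
        # A query that needs every record (like listing all updates)
        # ends up pulling the whole table into memory anyway.
        for row in self.conn.execute("SELECT name, version, dist FROM mods"):
            self.cache[row[0]] = row
        return self.cache


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and fake rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mods (name TEXT PRIMARY KEY, version TEXT, dist TEXT)"
    )
    conn.executemany(
        "INSERT INTO mods VALUES (?, ?, ?)",
        [
            ("Example::Alpha", "1.00", "Example-Alpha-1.00"),
            ("Example::Beta", "0.02", "Example-Beta-0.02"),
            ("Example::Gamma", "2.10", "Example-Gamma-2.10"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    index = LazyIndex(build_demo_db())
    print(index.lookup("Example::Alpha"))
    print(len(index.cache), "record(s) cached after one lookup")
    print(len(index.load_all()), "record(s) cached after loading everything")
```

After a single lookup the cache holds one record; once `load_all` runs, the cache is as large as the table, which mirrors why a query that needs everything sees no memory saving.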

I'd be interested in hearing, first of all, whether there are any problems building and testing the package and, secondly, if you use it with CPAN.pm, whether there are any problems with various types of queries. To enable CPAN-SQLite from within the CPAN.pm shell, one can do

  cpan> o conf use_sqlite 1

and then, if you like,

  cpan> o conf commit

to keep this setting for future use. The first time this is used, the database should be created under the cpan_home entry of CPAN::Config (the same location where Metadata is found); subsequent invocations should just update the database.

I'd also be interested in hearing any ideas for possible extensions, both for extending the search capabilities of CPAN.pm and for uses outside of CPAN.pm. Thanks!

In reply to RFC: CPAN.pm and CPAN::SQLite by randyk
