Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm trying to implement/adopt a "cache" for frequently accessed data which is currently loaded from and stored to files on each access/modification.

Situation

The data loaded from files is represented in hashes (one hash per file). The primary purpose of the "cache" would be to hide the reading/writing of files from the rest of the program, so that data retrieval and modifications stay reasonably fast even when there are high disk loads. The program is a forking server, so the "cache" needs to be shared across multiple processes.

Since I don't see much overlap between traditional Perl caching modules (CHI, Cache and similar) and my needs (essentially a shared in-memory database that reads and writes entries to files under specific conditions), I don't think using those would be wise.

When trying to implement this "cache" I've considered using some sort of shared-memory module which could store a hash of hashes. On that front, I've looked into IPC::MMA and IPC::Shareable, but neither seems to fit the bill. IPC::MMA can only store scalars in its hash structures, so I can't nest the hashes. IPC::Shareable has the problem of possible conflicts with its four-character glue (I need to share lots of relatively simple hashes) and might run out of usable shared memory segments.

I also looked at in-memory databases, but I'm not sure how that would affect memory usage (I imagine anything retrieved from a table will be copied), and all the databases I've looked at would need a ramdisk, since they don't support in-memory connections from multiple processes.

Question

Primarily, I'd like to ask whether you can recommend a Perl module that supports nesting hashes in shared memory and doesn't suffer from the limitations of IPC::Shareable. However, I'm also open to suggestions of alternative approaches to solving my main issue (the "cache").


Replies are listed 'Best First'.
Re: Sharing data "cache" between forked processes (MCE!)
by 1nickt (Canon) on Nov 23, 2018 at 14:37 UTC

    Hi, the correct solution depends on your specific needs, but marioroy's Perl Many-Core Engine offers several options. Please see MCE::Shared, MCE::Shared::Hash, MCE::Shared::Minidb, MCE::Shared::Cache.

    From what I understand from your post, you basically want a shared DB where individual keys can be handled as with a cache, but sub-keys can also be accessed. Presumably you also need to be able to search for a key or keys by the value(s) of one or more sub-keys. You might like:

    use strict;
    use warnings;
    use feature 'say';
    use Data::Dumper;
    use MCE::Shared;

    my $db = MCE::Shared->minidb();

    my %hash = ( problem => 'foo', technique => 'blorgle', answer => 41 );
    my %junk = ( problem => 'bla', technique => 'blargle' );

    $db->hset( my_key => %hash );
    $db->hset( junkey => %junk );    # sorry for the bad pun

    my $pid = fork;
    die 'Fork failed' if not defined $pid;

    if ( $pid == 0 ) {    # child
        $db->happend( my_key => ( problem => 'bar' ) );
        $db->hincr( my_key => 'answer' );
        $db->hset( my_key => ( technique => 'frobnicate' ) );
        exit;
    }

    # parent
    wait;
    my @rows = $db->select_href( ':hashes', ':WHERE answer > 0' );
    say Dumper \@rows;

    __END__
    Output:
    $ perl monks/1226220.pl
    $VAR1 = [
              [
                'my_key',
                {
                  'answer' => 42,
                  'problem' => 'foobar',
                  'technique' => 'frobnicate'
                }
              ]
            ];

    (A note from the doc that helps explain the unfamiliar query syntax: "Several methods take a query string for an argument. The format of the string is described below. In the context of sharing, the query mechanism is beneficial for the shared-manager process. It is able to perform the query where the data resides versus the client-process grep locally involving lots of IPC.")

    Hope this helps!


    The way forward always starts with a minimal test.
Re: Sharing data "cache" between forked processes
by hippo (Archbishop) on Nov 23, 2018 at 13:55 UTC
    all databases I've looked at would need a ramdisk, since they don't support in-memory connections from multiple processes.

    Doubtless I am misunderstanding here but does the memory engine of MariaDB not satisfy your requirements? I've used it for lower-latency stores in several projects (including FCGI-based access) without problems. Could you explain where this falls short for you?
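    For context, a MEMORY-engine table lives entirely in RAM but is reachable from any number of forked processes through ordinary connections. A minimal sketch of that from Perl with DBI (database name, credentials and table layout are made up for illustration):

    ```perl
    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection details -- substitute your own.
    my $dbh = DBI->connect(
        'DBI:mysql:database=cachedb;host=localhost',
        'cacheuser', 'secret',
        { RaiseError => 1, AutoCommit => 1 },
    );

    # MEMORY tables are held in RAM and shared by all connections.
    $dbh->do(q{
        CREATE TABLE IF NOT EXISTS cache (
            k VARCHAR(64) NOT NULL PRIMARY KEY,
            v TEXT
        ) ENGINE=MEMORY
    });

    # Upsert an entry, then read it back from any process.
    $dbh->do( 'REPLACE INTO cache (k, v) VALUES (?, ?)', undef, 'foo', 'bar' );
    my ($val) = $dbh->selectrow_array(
        'SELECT v FROM cache WHERE k = ?', undef, 'foo' );
    ```

    Note that MEMORY tables are lost on server restart, so the periodic write-to-file step would still be needed for persistence.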

Re: Sharing data "cache" between forked processes
by cavac (Prior) on Nov 23, 2018 at 12:52 UTC

    Really depends on your needs. But just to shamelessly plug my own stuff: Interprocess messaging with Net::Clacks

    Net::Clacks implements real-time messaging as well as a memory-only cache. Basically, if you read a file, you could just store() it in Clacks as Base64. Structures could be encoded with JSON::XS + Base64. At least, that's how I'm doing it.

    If you want to handle the file loading/saving/deleting on the server side for some reason, it would be sort of trivial to implement. Just adding some flags to the OVERHEAD command handling in Net::Clacks::Server.pm should do the trick.
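    The JSON + Base64 round trip described above can be sketched as follows (JSON::PP is shown as the core stand-in for JSON::XS, which is a faster drop-in; the store() call is a hypothetical placeholder for the actual Clacks client call):

    ```perl
    use strict;
    use warnings;
    use JSON::PP qw(encode_json decode_json);    # JSON::XS exports the same functions
    use MIME::Base64 qw(encode_base64 decode_base64);

    # A nested structure, such as one of the per-file hashes.
    my %data = ( colour => 'blue', sizes => [ 1, 2, 3 ] );

    # Encode: hashref -> JSON -> Base64 (second arg '' suppresses newlines).
    my $payload = encode_base64( encode_json( \%data ), '' );

    # ...$clacks->store( 'my_key', $payload );    # hypothetical client call

    # Decode on the receiving side.
    my $roundtrip = decode_json( decode_base64($payload) );
    ```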

    perl -e 'use MIME::Base64; print decode_base64("4pmsIE5ldmVyIGdvbm5hIGdpdmUgeW91IHVwCiAgTmV2ZXIgZ29ubmEgbGV0IHlvdSBkb3duLi4uIOKZqwo=");'
Re: Sharing data "cache" between forked processes
by kschwab (Vicar) on Nov 23, 2018 at 13:53 UTC
Re: Sharing data "cache" between forked processes
by localshop (Monk) on Nov 26, 2018 at 05:43 UTC
    I've had good results using CHI, but I'd also first look at hippo's suggestion of solving this in the database: almost all databases allow either pinning tables to memory or using an in-memory engine. You can even put the backend DB on a RAM drive. If you're not interested in solving this at the persistence layer, though, I'd suggest looking at CHI as well as the other suggestions.

    CHI has been around a long time and, although it hasn't seen much updating recently, it's proven in many production environments. There's also some interest in porting it to Perl 6, which I assume is a good thing.
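    Since the OP needs the cache shared across forked processes, CHI's FastMmap driver (backed by Cache::FastMmap's mmap'ed file) is probably the relevant one. A minimal sketch, with path and expiry made up for illustration:

    ```perl
    use strict;
    use warnings;
    use CHI;

    # FastMmap keeps the data in an mmap'ed file, so forked
    # children all see the same cache. Path and expiry are
    # illustrative -- substitute your own.
    my $cache = CHI->new(
        driver     => 'FastMmap',
        root_dir   => '/tmp/mycache',
        expires_in => '10 min',
    );

    # Nested hashes are serialized transparently (Storable by default).
    $cache->set( 'file_foo' => { colour => 'blue', count => 42 } );
    my $data = $cache->get('file_foo');    # nested hash comes back intact
    ```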