http://qs1969.pair.com?node_id=1149760

bagyi has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that is executed every time a process Ai finishes. For the next process, Ai+1, I need some information from the previous executions.

Since I'm already logging process execution, I currently use the log file as history/state as well. I've looked into the Storable module, and it looks promising: it is a binary format, and data can easily be written and read back without parsing or having to design a textual format.

What is the usual approach Perl monks would use for this kind of case? I think the Smalltalk approach of snapshotting and resuming is pretty cool. Can we do that in Perl too?

Replies are listed 'Best First'.
Re: Storing state of execution
by stevieb (Canon) on Dec 09, 2015 at 14:07 UTC

    I've used Storable with great success. Another method I've taken to lately is using JSON, which stores in plain text, but is cross-language (I can write in Perl/Python/insert-language-here, then open it back up with any other one). You could also use Data::Dumper to store and retrieve state (Perl only).
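As a rough illustration of the JSON route, here is a minimal sketch using only core modules (JSON::PP ships with Perl since 5.14). The state file location, the helper names `save_state` and `load_state`, and the keys `runs` and `last_exit_code` are all hypothetical; a real script would use a fixed, known path instead of a temporary file.

```perl
use strict;
use warnings;
use JSON::PP;
use File::Temp qw(tempfile);

# Hypothetical state file; the temp file only keeps this sketch self-contained.
my (undef, $state_file) = tempfile(SUFFIX => '.json', UNLINK => 1);

# Write the whole state hash as one JSON document.
sub save_state {
    my ($file, $state) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} JSON::PP->new->utf8->canonical->encode($state);
    close $fh or die "Cannot close $file: $!";
}

# Read the state back; an empty or missing file means "first run".
sub load_state {
    my ($file) = @_;
    return {} unless -s $file;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    local $/;                      # slurp the whole file
    my $json = <$fh>;
    close $fh;
    return JSON::PP->new->utf8->decode($json);
}

# Run for process Ai: read what the previous run left behind, update, write back.
my $state = load_state($state_file);
$state->{runs}++;
$state->{last_exit_code} = 0;
save_state($state_file, $state);

my $again = load_state($state_file);
print "runs so far: $again->{runs}\n";
```

The same two helpers could be backed by Storable's `store` and `retrieve` instead; only the bodies would change.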

      I would add that which you want depends a lot on the data you need to store.

      If the data is simple and modestly sized, then JSON (or YAML) would probably be best.

      If your data includes reference loops or binary data, or if the data structure is large (speed becomes an issue) then Storable would probably be best.
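To make the reference-loop point concrete, here is a small sketch under the assumption that core Storable and JSON::PP are available. The structure `$node` is made up for illustration. Storable round-trips the loop, while a plain JSON encoder is expected to give up once it hits its nesting limit.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use JSON::PP;

# A structure that points back at itself (hypothetical example data).
my $node = { name => 'A' };
$node->{self} = $node;

# Storable serializes the loop and rebuilds it on thaw.
my $copy = thaw(freeze($node));
my $loop_preserved = ($copy->{self} == $copy) ? 1 : 0;
print "loop preserved by Storable\n" if $loop_preserved;

# JSON has no notion of references, so encoding the same structure
# should die instead of producing output.
my $json_ok = eval { JSON::PP->new->encode($node); 1 } ? 1 : 0;
print "JSON::PP refused the loop\n" unless $json_ok;
```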

      Dumper is more of a middle ground: use it if you need the output to be coder-readable and the data structure is complex, but only if you can absolutely trust the source of the data when you read it back. I'd only recommend it for debug output, since reading it back in involves running arbitrary Perl code.

        "or if the data structure is large (speed becomes an issue) then Storable would probably be best."

        That doesn't match my memory. A quick test showed JSON::XS taking just over 1/3 of the time of Storable (and producing almost exactly the same number of bytes of output).

        Using JSON has other advantages. And I consider forcing one to stick to simple data to be one of them.

        - tye        

      One big problem with Storable is that its exact file format depends on the Perl version and on the machine Perl was compiled for. Changing the processor architecture and/or the Perl version is asking for trouble.
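For the architecture part of that concern, Storable also provides `nstore`, which writes in network byte order. The sketch below assumes a made-up `%state` hash and a temporary file. It only addresses byte order; compatibility between Storable versions remains a separate issue.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Temporary file so the sketch is self-contained; the path is hypothetical.
my (undef, $file) = tempfile(SUFFIX => '.sto', UNLINK => 1);

# Made-up state for illustration.
my %state = (last_pid => 4242, counters => { ok => 3, failed => 1 });

# nstore writes in network byte order, so the file can be read on a
# machine with a different native byte order.
nstore(\%state, $file);

# retrieve reads files written by either store or nstore.
my $restored = retrieve($file);
print "failed: $restored->{counters}{failed}\n";
```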

      Data::Dumper generates executable Perl code that has to be read back into the program using string eval. That works, sure, but it is a security nightmare: imagine someone inserting system "rm -rf /" into the saved dump.

      Data::Dumper does not dump everything; sometimes it just generates dummy code:

      >perl -MData::Dumper -E 'my $double=sub { return 2*shift }; say Dumper +($double)'
      $VAR1 = sub { "DUMMY" };

      JSON, XML, and YAML don't have those problems. They simply don't allow code references, and they are all independent of the Perl version and the processor architecture.

      XML can't store binary data, because some characters (0x00) are not allowed in XML, not even in escaped form. You have to resort to using a hex dump, base64 or quoted-printable encoding.

      XML stores some data multiple times (opening and closing tags contain the element name), wasting more disk space than other formats.

      JSON has data types (string, number, array, key-value pairs, booleans, and null alias undef). It lacks some higher data types, most commonly a date and time type. Usually, one uses strings or key-value pairs ("objects") for that, but you could also use a number (counting days or seconds since an epoch value). Reading back JSON with dates in strings or objects requires some knowledge about the data. You need to know if a string is a date in disguise or just a string.
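A small sketch of the "date in disguise" problem, using core modules only. The key names `finished_at` and `exit_code` and the choice of an ISO 8601 UTC string are assumptions for illustration, not something JSON prescribes.

```perl
use strict;
use warnings;
use JSON::PP;
use POSIX qw(strftime);
use Time::Local qw(timegm);

my $epoch = 1449670020;   # 2015-12-09 14:07:00 UTC

# Store the timestamp as an ISO 8601 string in UTC.
my $json = JSON::PP->new->canonical->encode({
    finished_at => strftime('%Y-%m-%dT%H:%M:%SZ', gmtime $epoch),
    exit_code   => 0,
});
print "$json\n";

# Reading back: the decoder returns a plain string. Only our knowledge of
# the schema tells us that finished_at is a date and has to be parsed.
my $data = JSON::PP->new->decode($json);
my ($y, $mo, $d, $h, $mi, $s) =
    $data->{finished_at} =~ /\A(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)Z\z/
    or die "finished_at is not in the expected format";
my $back = timegm($s, $mi, $h, $d, $mo - 1, $y);
print "round trip ok\n" if $back == $epoch;
```

Storing the raw epoch number instead would avoid the parsing step, at the cost of a less readable file.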

      JSON does not define comments. Some JSON parsers allow them anyway. JSON::XS accepts shell-style # comments in its relaxed mode, but that does not fit into a JavaScript context (from which JSON is derived). JavaScript has /* */ and // comments, which would make the most sense to use in JSON.

      YAML: I can't get it into my head. There are at least two or three ways to represent the same information, and some just don't make sense to me. I try to avoid YAML.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        "Data::Dumper does not dump everything, sometimes, it just generates dummy code"
        Unless you specify
        $Data::Dumper::Deparse = 1;
        ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
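A minimal sketch of what the Deparse setting does, assuming a simple code reference that closes over no outer variables. Setting `$Data::Dumper::Terse` is an extra choice made here so that the dump can be fed straight back to eval.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Deparse = 1;   # dump code refs as source via B::Deparse
$Data::Dumper::Terse   = 1;   # omit the leading "$VAR1 ="

my $double = sub { return 2 * shift };
my $source = Dumper($double);
print $source;

# Reading it back needs string eval, so only do this with trusted input.
my $copy = eval $source or die "eval failed: $@";
my $result = $copy->(21);
print "$result\n";
```

Variables captured by a closure are not saved along with the deparsed source, so this only helps for self-contained subs.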
        afoken usually gives very exhaustive explanations. ++ as always. But isn't Sereal also worth mentioning? I have used it with profit, but I have not reached its limits because my usage of it was plain.

        Do you have experience with it too?

        L*

        PS: What they say about their module is definitely intriguing! See Sereal Comparison Graphs

        L*
        There are no rules, there are no thumbs..
        Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

        ++ That's a spectacular explanation.