in reply to Re: Storing large data structures on disk
in thread Storing large data structures on disk

I have to admit this code is too complex for me - too many shortcuts I'm unfamiliar with, but I'm doing my best to understand it :) (I guess I should move to the newbies section...)

Anyway, I can't get it to run - I get:

    Bareword found where operator expected at test2.pl line 18, near "printf O "%s", pack 'V/A"
      (Might be a runaway multi-line // string starting on line 8)
      (Do you need to predeclare printf?)
    Bareword found where operator expected at test2.pl line 18, near "', pack 'V"
      (Missing operator before V?)
    Global symbol "@AoA" requires explicit package name at test2.pl line 8.
    Global symbol "@AoA" requires explicit package name at test2.pl line 8.
    Global symbol "@AoA" requires explicit package name at test2.pl line 8.
    Global symbol "@AoA" requires explicit package name at test2.pl line 8.
    Global symbol "$start" requires explicit package name at test2.pl line 8.
    Global symbol "@AoA" requires explicit package name at test2.pl line 8.
    syntax error at test2.pl line 18, near "printf O "%s", pack 'V/A"
    Bad name after raw' at test2.pl line 26.

Re^3: Storing large data structures on disk
by BrowserUk (Patriarch) on May 31, 2010 at 18:20 UTC
    Anyway, I can't get it to run

    As others have explained, switch our $O //= 2 to our $O ||= 2 for pre-5.10 perls.
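    The two operators are not quite interchangeable, which matters if you ever pass 0 as the option value. A minimal sketch of the difference follows; the loop values are made up for illustration and merely stand in for whatever $O holds in the original script:

```perl
use strict;
use warnings;

my ( @with_or, @with_dor );

# Hypothetical stand-ins for the option: not given, given as 0, given as 6.
for my $given ( undef, 0, 6 ) {
    my $or  = $given;  $or  ||= 2;   # works on any perl 5; replaces every false value
    my $dor = $given;  $dor //= 2;   # needs perl 5.10 or later; replaces only undef

    push @with_or,  $or;
    push @with_dor, $dor;

    printf "%-5s  ||= gives %d   //= gives %d\n",
        ( defined $given ? $given : 'undef' ), $or, $dor;
}
```

    So on a pre-5.10 perl the ||= version behaves the same unless the value is 0 (or an empty string), in which case it falls back to the default of 2.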

    I have to admit this code is too complex for me - too many shortcuts

    If you have specific questions about particular lines of code, just ask. That is what this place is for.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      First of all, thanks again. I really appreciate the help from all you guys. I also appreciate how much more there is to know about this wonderful language.

      A few questions regarding BrowserUk's code:

      1. When I run your code passing -O=6, for example, it also prints the structure to the screen. This does not happen when omitting the -O=... . Why is that?

      2. What is the meaning of pp here? I read on CPAN that it is used to create standalone executables, but I don't understand the connection (and moreover, why we pass the structure to it...).

      3. Can you explain the heart of the packing:

       printf O "%s", pack 'V/A*', pack 'V*', @{ $AoA[ $_ ] };;

      where we print each array to the output file. What does the / between the V and A stand for? I read that it means a count of the packed items, but where does its value come from? And why do we need the second pack?

      And one last question for now: when the data structure becomes too large to store entirely in memory, is tying with MLDBM the preferred paradigm? What are the alternatives?

      Thank you!

        1. When I run your code passing -O=6, for example, it also prints the structure to the screen.

          It should only dump the structure to the screen if -O is 2 or less. See the lines that end in if $O <= 2;. If that is not what you see, there is something wrong with your copy of the code.

          I added that so that I could quickly check that what got unpacked was the same as what was packed. For small examples only.

        2. What is the meaning of pp here?

          If you look at the third line of code you'll see: use Data::Dump qw[ pp ];; pp in this case stands for "pretty print" and is Data::Dump's equivalent of Data::Dumper's Dumper() function.

        3. Can you explain the heart of the packing:  printf O "%s", pack 'V/A*', pack 'V*', @{ $AoA[ $_ ] };;

          Okay. First off, update your copy of the code from the original node, where I've switched it from printf to print.

          The guts of the thing is two calls to pack.

          • pack 'V*', @{ $AoA[ $_ ] };

            The surrounding loop goes through @AoA one element at a time (with $_ set to 0 .. $#AoA), fetching the reference to each sub-array.

            The @{ ... } bit expands the array reference to the contents of that sub-array.

            The pack format "V*" says: pack all the values in the list (produced above) as 32-bit little-endian unsigned integers into a binary string, and return that string.

          • pack 'V/A*', ...

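            The second pack template, "V/A*", says: return the input binary string ("A*") prefixed ('/') with its length in bytes, stored as a 32-bit unsigned integer ('V'). pack works that length out itself from the string it is given, which is why the first pack has to produce the string before the second can prefix it.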

            And the print writes that out to the file.

          As your sub-arrays are variable sized, we need the prefix count so that we know how much of the file to read back into each sub-array when retrieving it.
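
          To make the round trip concrete, here is a minimal sketch of my own. It is not the code from the original node: it packs into an in-memory string rather than a file, and the sample data and the names $buffer, $pos and @copy are made up for illustration. The packing line is the same pair of pack calls discussed above; the reading side pulls the 4-byte count off first and then takes exactly that many bytes.

```perl
use strict;
use warnings;

# Sample data (hypothetical): sub-arrays of different lengths, one empty.
my @AoA = ( [ 1, 2, 3 ], [ 40, 50 ], [] );

# Writing: one length-prefixed record per sub-array.
my $buffer = '';
for my $row ( @AoA ) {
    my $payload = pack 'V*', @$row;       # 4 bytes per value
    $buffer .= pack 'V/A*', $payload;     # 4-byte byte count, then the payload
}

# Reading: the count tells us how many bytes belong to the next sub-array.
my @copy;
my $pos = 0;
while ( $pos < length $buffer ) {
    my $len = unpack 'V', substr( $buffer, $pos, 4 );
    $pos += 4;
    push @copy, [ unpack 'V*', substr( $buffer, $pos, $len ) ];
    $pos += $len;
}

print scalar( @copy ), " sub-arrays recovered from ", length( $buffer ), " bytes\n";
```

          I read the count with a plain 'V' and then slice the payload out by hand, rather than unpacking with 'V/A*', because on unpack 'A' strips trailing spaces and nulls, and packed integers very often end in null bytes.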

          Note: You might prefer to use 'N' rather than 'V' if that is more natural on your platform.
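
          As a quick illustration of what that choice changes (a sketch with made-up variable names, not part of the original code): 'V' and 'N' both produce 4 bytes, just in opposite orders, so whichever letter you write the file with is the one you must read it back with.

```perl
use strict;
use warnings;

# The same value, 1, packed both ways and shown as hex digits.
my $little = unpack 'H*', pack 'V', 1;   # least significant byte first
my $big    = unpack 'H*', pack 'N', 1;   # most significant byte first

# Reading with the wrong letter silently gives a different number.
my $misread = unpack 'N', pack 'V', 1;

print "V: $little\nN: $big\nV data read as N: $misread\n";
```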


Re^3: Storing large data structures on disk
by ikegami (Patriarch) on May 31, 2010 at 17:56 UTC
Re^3: Storing large data structures on disk
by Anonymous Monk on May 31, 2010 at 17:34 UTC
    Did you use the download code link? You probably need perl 5.10 because of the defined-or operator (//=).