in reply to Re: Alternate for "open"
in thread Alternate for "open"

Yes, here is the clue: it's the hashes. I am building a hash to store all my result values. The output is two Excel workbooks (3G data and 2G data), each containing 23 sheets. Each sheet contains about 11,222 cells (31 columns x 362 rows) on average (not exactly). Does that need 64 GB of RAM? If it needs that much, how can we reduce it?

Re^3: Alternate for "open"
by 1nickt (Canon) on Nov 17, 2015 at 07:47 UTC

    You've now been given several suggestions:

    • use strict;
    • use warnings;
    • Test by simplifying the script so that it only opens the files
    • Make sure to use while, not foreach, to read your filehandles
    • Post verbatim snippets of the code here so it can be reviewed
    Which of these have you done?
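The while-versus-foreach point in the list above can be sketched as follows. This is a minimal illustration, not the poster's code: the in-memory string `$data` is a hypothetical stand-in for one input file, and the record format is invented.

```perl
use strict;
use warnings;

# Hypothetical in-memory data standing in for one input file.
my $data = join '', map { "record $_;\n" } 1 .. 5;

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $count = 0;

# while reads one line per iteration, so only one line is held at a time.
while ( my $line = <$fh> ) {
    chomp $line;
    $count++;
}
close $fh;

# By contrast, "foreach my $line (<$fh>)" evaluates <$fh> in list context
# and builds a list of every line before the loop body runs.
print "read $count lines\n";
```

With a large file, the foreach form holds the whole file in memory at once, which is why the distinction matters here.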

    In particular, does your code contain use strict; and use warnings;?

    It's not a completely unreasonable idea to try to measure the memory footprint of your hash, but most monks here would not do that to find the problem. Better to simplify your code so you can identify the problem.

    Just remove everything until it runs properly, then start adding stuff back in. If it is a really large and ugly codebase, take the opportunity to refactor and move code out into modules. This is better practice for many reasons and will help with this kind of debugging by making it easy to switch parts of the code on and off.

    You could also:

    • If you suspect the hash is getting too big, comment out the code that populates it, run the program, and see if there's a difference.
    • Try running the program on only one file and see if there's a difference.
    • Try running the program on lots of very small files and see if there's a difference.
    • Consider loading your file data into a real database such as SQLite and working from there.
    • Look for memory leaks with Test::LeakTrace

    There, now you have a bunch more suggestions. It will be nice to hear back from you when you've tried some of them and you are still stuck.

    The way forward always starts with a minimal test.

      When I posted my first program here, the Monks suggested using strict. Since then I have never left those two lines out of my code. And I am very sure that I am using while to read the file (already posted).

      As per your suggestion, I am now trying to run the code piece by piece. My first piece reads the CDR and builds the values in the hash. So I want to be sure that my hash is not taking much memory.

      Thanks for suggestions

        The only sample you posted:

        $cmp=substr(reverse($line),0,1);
        if ($cmp eq ";") {

        doesn't look like it has strict and warnings switched on. Or, if it does, you're reusing $cmp, which isn't necessary, and recycling variables means you're probably not scoping correctly. That's why you got it as a suggestion again: we have no evidence to the contrary.
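For comparison, here is a minimal sketch of the same end-of-line check written without a reused temporary variable. The sample lines in `@lines` are hypothetical, since the real input format is not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the real input format is not shown in the thread.
my @lines = ( "a;b;c;", "a;b;c", "x;" );

my $terminated = 0;
for my $line (@lines) {
    # Anchor the match at the end of the string instead of reversing it.
    # No temporary variable is needed, so nothing outlives the loop body.
    if ( $line =~ /;\z/ ) {
        $terminated++;
    }
}

print "$terminated of ", scalar(@lines), " lines end with a semicolon\n";
```

Declaring `$line` with `my` in the loop header keeps it scoped to the loop, which is the kind of scoping that strict encourages.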

        I don't believe you have use strict; use warnings; switched on when you also have:

        my $total_size = total_size(\%val);
        print "\%val :",$total_size,$/;
        my $total_size = total_size(\%seen_cdr);

        Your code is generating a warning: "my" variable $total_size masks earlier declaration in same scope.

        Either you don't have these switched on, or you have code that generates warnings every time it runs, and you're asking for someone to fix it for you.
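The masking diagnostic can be reproduced in isolation. The sketch below compiles two small snippets with a string eval so the compile-time diagnostic can be captured; the names `$val_size` and `$seen_size` are hypothetical replacements, not taken from the poster's code.

```perl
use strict;
use warnings;

# Compile a snippet and return any warnings it produced.
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    # String eval, so compilation happens while the handler is installed.
    eval $code or die "snippet failed: $@";
    return @caught;
}

# Redeclaring the same lexical in one scope, as in the posted snippet.
my @masked = warnings_from(q{
    my $total_size = 1;
    my $total_size = 2;
    1;
});

# One distinct, hypothetical name per measurement avoids the redeclaration.
my @clean = warnings_from(q{
    my $val_size  = 1;
    my $seen_size = 2;
    1;
});

print "redeclared: ", scalar(@masked), " warning(s)\n";
print @masked;
print "distinct names: ", scalar(@clean), " warning(s)\n";
```

The first snippet should report the masking warning and the second should report none, so the message is a reliable sign that a variable was declared twice in one scope.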

Re^3: Alternate for "open"
by Preceptor (Deacon) on Nov 17, 2015 at 11:38 UTC

    Depends how inefficiently you're storing the data - you can _certainly_ incur overheads - for example, a parsed XML document typically takes around 10x the memory of the file at rest.

    But at least now we've moved on from blaming open - check what you're inserting into the hash. How many key/value pairs? Are you creating nested data structures (hash of arrays, etc.)? All these things add up - you _can_ expect the memory required to be larger than the raw input.
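One way to answer the "how many key/value pairs" question is to count the leaf values in the structure. The sketch below assumes a hash of hashes of arrays, sized from the figures in the question (2 workbooks, 23 sheets, roughly 362 rows x 31 columns). The layout and the helper `count_leaves` are hypothetical, since the real structure was not posted. Devel::Size is a CPAN module that provides a total_size function; the byte count is printed only if it is installed.

```perl
use strict;
use warnings;

# Sizes taken from the question; the nesting itself is an assumption.
my ( $books, $sheets, $rows, $cols ) = ( 2, 23, 362, 31 );

my %val;
for my $book ( 1 .. $books ) {
    for my $sheet ( 1 .. $sheets ) {
        for my $row ( 1 .. $rows ) {
            # One array reference per row, one placeholder value per column.
            $val{$book}{$sheet}[ $row - 1 ] = [ (0) x $cols ];
        }
    }
}

# Recursively count the non-reference (leaf) values in nested hashes/arrays.
sub count_leaves {
    my ($node) = @_;
    return 1 unless ref $node;
    my @children = ref $node eq 'HASH' ? values %$node : @$node;
    my $n = 0;
    $n += count_leaves($_) for @children;
    return $n;
}

my $cells = count_leaves( \%val );
print "cells stored: $cells\n";    # 2 * 23 * 362 * 31 = 516212

# Report the memory footprint only if Devel::Size happens to be available.
if ( eval { require Devel::Size; 1 } ) {
    printf "total_size: %d bytes\n", Devel::Size::total_size( \%val );
}
```

About half a million small values would normally be expected to occupy megabytes rather than gigabytes, so if the real program needs far more, the extra memory is probably coming from somewhere other than the result values described.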