

in reply to Foreach/While - Why does this run AFTER that?

What I don't understand is WHY, since it's in a foreach loop first

The way I read your code, the (second) while loop is inside of the foreach loop. Snipping out the rest, the basic loop structure of your code looks like this:

while ( $offset < length( $pieces ) ) {
    # do some stuff
}
# do some more stuff
foreach my $p ( @pieces ) {
    # do some more stuff
    print "[dothis:] Piece $counted: $p \n";
    # do some more stuff
    while ($n = read (F, $data, $piece_length) != 0) {
        # yet more stuff
        print "[dothat] Hash2:" . sha1_hex($data) . "\n\n";
        # yet more stuff
    }
}

So for each value of $p in the foreach loop, the inner while loop runs until it can't read any more data from F. On the first pass through the foreach loop, that reads all the data F has available. On every subsequent pass, all the data from F has already been read, so the inner while loop's condition is false immediately and its body, including the second print, never runs.
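
To see the effect in isolation, here is a small self-contained sketch (the file name and chunk size are just placeholders): once a handle has reached end-of-file, every later read returns 0 bytes until the handle is rewound.

use strict;
use warnings;

open my $fh, '<', 'example.dat' or die "open: $!";    # placeholder file name

for my $pass ( 1 .. 2 ) {
    my $total = 0;
    while ( my $n = read( $fh, my $buf, 4096 ) ) {
        $total += $n;
    }
    print "pass $pass read $total bytes\n";            # the second pass reads 0 bytes
    # seek $fh, 0, 0;                                  # uncomment to rewind between passes
}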

Re^2: Foreach/While - Why does this run AFTER that?
by Athanasius (Archbishop) on Oct 06, 2014 at 07:22 UTC

    ... in which case the immediate solution to the problem is to add

    seek F, 0, 0;

    immediately before the inner while loop (see seek).
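
    For example, based on the loop structure quoted above, the rewind would go something like this (a sketch only, keeping the original variable names):

    foreach my $p ( @pieces ) {
        print "[dothis:] Piece $counted: $p \n";
        seek F, 0, 0;                                              # rewind F before re-reading it
        while ( ($n = read( F, $data, $piece_length )) != 0 ) {    # parentheses so $n holds the byte count
            print "[dothat] Hash2:" . sha1_hex($data) . "\n\n";
        }
    }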

    To the OP:

    But reading the same input file multiple times is a poor design. Reading from disk is order(s) of magnitude more expensive than reading from RAM. Much better to read the file once, storing its data in a suitable data structure, and then iterate over that data structure as needed.
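
    For instance, one way to do that here is to make a single pass over the file, keeping only the per-piece hashes rather than the file data itself. A sketch only: it reuses the F, $piece_length and @pieces names from the original code, assumes sha1_hex comes from Digest::SHA, and assumes @pieces holds the expected digests in hex form.

    use Digest::SHA qw(sha1_hex);

    # Single pass over the file: hash each fixed-size piece as it is read.
    my @file_hashes;
    while ( read( F, my $data, $piece_length ) ) {
        push @file_hashes, sha1_hex($data);
    }

    # Compare against the expected piece hashes as often as needed,
    # without touching the disk again.
    for my $i ( 0 .. $#pieces ) {
        my $got = $file_hashes[$i] // '';
        print "Piece $i: ", $got eq $pieces[$i] ? 'ok' : 'MISMATCH', "\n";
    }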

    BTW, some consistent indentation would go a lo-o-ong way towards making the code more readable (and the problem more tractable).

    Hope that helps,

    Athanasius <°(((>< contra mundum

      Ahh, thanks.

      I agree that it is poor design, but at the time it was all I could think of. For the example I posted, I was testing it on a 38 MB .mp3 file, in which case the memory use for perl.exe goes to around 42 MB while it's running.

      Memory usage is a big thing to me (I tend to use Firefox or Chrome, and I'm limited to around 2 GB of usable memory at any given time; Firefox alone takes around 300 MB, and the system takes a considerable amount, so I'm not left with a lot), so I am open to suggestions on how to make the code more efficient.

      Per your suggestion, and please correct me if I am wrong, you are saying that it would take less memory to read the file into an array and then split it from there? I figured that reading it all in at once would still spike the memory usage to whatever size the file was. That is not a bad thing if the file is under 100 MB, but with a 1 GB file, for example, it's not something I would want slowing my computer down to a crawl. I probably misunderstood, and am just overlooking a way to do this effectively. If you could point me to any write-ups or tutorials on ways to do this without a huge spike in memory usage, I would be super grateful :)

      Rather than try to rework the foreach loop and waste time with it, I ended up erasing it and doing the following:
      seek F, 0, 0;
      while ( ($n = read( F, $data, $piece_length )) != 0 ) {    # parentheses so $n holds the byte count
          $excount++;
          $currentpiece = shift(@pieces);
          $counted++;
          $currentpiece =~ s/(.)/sprintf("%02x",ord($1))/egs;    # hex-encode the raw expected piece hash
          # ... (rest of the loop body)
      }

      This seems to work out well enough, since I only want one piece at a time for a .torrent file, and after that it's not needed anymore. And, added bonus, all the hashes match up! I wanted to get this section working before moving on to a finder section: using File::Find, I want to go through a list of subdirectories, find a torrent's data directory, and then hash-check all of its data; if it is complete, I will move the data directory to a specified area for better organization. https://github.com/thoj/torrentmv-perl/blob/master/torrentmv.pl does a similar job of verifying data, but it doesn't go through multiple directories, has to have a path specified on the command line (which could change), and would not run on my Windows 7 64-bit system without some modifying, which is what led me to whip this script up.
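
      For that finder step, a minimal File::Find sketch might look something like this (the starting directory and the directory name being searched for are made-up placeholders):

      use strict;
      use warnings;
      use File::Find;

      my $top    = 'C:/Downloads';        # placeholder: top of the tree to search
      my $wanted = 'Some.Torrent.Name';   # placeholder: data directory named in the .torrent

      # Walk every subdirectory under $top and remember the ones whose
      # name matches the torrent's data directory.
      my @matches;
      find(
          sub {
              push @matches, $File::Find::name if -d $_ && $_ eq $wanted;
          },
          $top,
      );

      print "Found: $_\n" for @matches;

      Each match could then be handed to the hash-checking code above and moved once it verifies.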

        Hello CalebH,

        I’m glad seek was helpful! As to the rest: when I said “more expensive” I was thinking of CPU usage only, not memory usage. Sorry I wasn’t clearer. If memory is tight, it’s obviously more economical of memory to read the file progressively than to keep it all in memory at one time. And if that means reading the same file more than once, that may just be the price you have to pay to keep memory usage within acceptable bounds. But it seems your new version avoids that problem anyway.

        Cheers,

        Athanasius <°(((>< contra mundum