
My wording is a little vague because it's mainly a follow-up to the question I linked (I didn't want to hijack that thread):
How can I process large files with while(<>)?
Briefly: he opened a file, read the filehandle into an array, and processed the array with a foreach loop. The problem seemed to be that the file was too big to read entirely into an array, so the solution was to go through the file line by line with while (<HANDLE>).
I haven't been using Perl all that long. I learned to process files with the former method, using foreach, and only recently learned about while (<HANDLE>). That's why I'm asking when to use which.

Re^3: foreach vs while<>
by ikegami (Patriarch) on Nov 12, 2008 at 06:56 UTC

    If you want a line at a time, while (<$fh>) is perfect.
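
    As a minimal sketch of the line-at-a-time approach (the in-memory filehandle and the sample data are stand-ins for a real large file, not anything from the original question):

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data; an in-memory filehandle stands in for a big file.
    my $data = "alpha\nbeta\ngamma\n";
    open my $fh, '<', \$data or die "Cannot open: $!";

    my $count   = 0;
    my $longest = '';
    while (my $line = <$fh>) {    # only one line is held in memory at a time
        chomp $line;
        $count++;
        $longest = $line if length $line > length $longest;
    }
    close $fh;

    print "$count lines, longest: $longest\n";    # prints "3 lines, longest: alpha"
    ```

    Memory use stays roughly constant no matter how long the file is, which is why this form suits large inputs.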

    An array would be useful if you need to go through the file multiple times. Or if you need to traverse the file in an odd order. Or if you want to sort the contents. Sometimes, I'll load the entire file into memory so I can use grep or map, although I tend not to use an array in between.
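
    A rough illustration of two of those cases, sorting and filtering with grep without a named array in between (the sample data is made up, and an in-memory filehandle again stands in for a real file):

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data.
    my $data = "pear\napple\n# comment\nfig\n";

    # Case 1: load everything so the lines can be sorted.
    open my $fh, '<', \$data or die "Cannot open: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    my @sorted = sort @lines;

    # Case 2: filter straight off the handle; grep puts <$fh> in list
    # context, so the whole file is still read, just without a named array.
    open $fh, '<', \$data or die "Cannot open: $!";
    my @kept = grep { !/^#/ } <$fh>;
    close $fh;
    chomp @kept;

    print "sorted: @sorted\n";
    print "kept:   @kept\n";
    ```

    Both forms read the whole file into memory, so they only make sense when the file fits comfortably.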

    And then there are times when you want to load the entire file into a string, so you'd use local $/; $text = <$fh>;.
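
    One common way to keep the local $/ from leaking into the rest of the program is to wrap it in a do block. This is a sketch with made-up data, not the only way to write it:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data read through an in-memory filehandle.
    my $data = "first line\nsecond line\n";
    open my $fh, '<', \$data or die "Cannot open: $!";

    # local $/ undefines the input record separator for this block only,
    # so a single <$fh> returns the rest of the file as one string.
    my $text = do { local $/; <$fh> };
    close $fh;

    my $newlines = () = $text =~ /\n/g;    # count the newlines in the slurped string
    print "slurped ", length($text), " characters, $newlines newlines\n";
    ```

    Outside the do block, $/ is back to its previous value, so later reads on other handles still go line by line.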

    And then there are files which don't have a concept of lines, so you'd use read or sysread.
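
    For example, a file of fixed-size binary records could be walked with read. The record layout here (4-byte big-endian integers) is purely an assumption for the sake of the sketch:

    ```perl
    use strict;
    use warnings;

    # Hypothetical binary data: three 4-byte big-endian unsigned integers.
    my $data = pack 'N3', 1, 2, 3;
    open my $fh, '<', \$data or die "Cannot open: $!";
    binmode $fh;

    my @values;
    my $buf;
    while (1) {
        my $got = read $fh, $buf, 4;           # ask for one 4-byte record
        die "read failed: $!" unless defined $got;
        last if $got == 0;                     # end of file
        die "short record ($got bytes)" if $got < 4;
        push @values, unpack 'N', $buf;
    }
    close $fh;

    print "values: @values\n";
    ```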

    One thing is for certain: there is no reason to do @a = <$fh>; for (@a) unless you do something else with @a. That's the same thing as doing for (<$fh>), and there's no reason for that.
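
    A small sketch of the difference: for (<$fh>) evaluates <$fh> in list context, so the whole file is read before the first iteration, while the while form reads on demand. Checking eof inside each loop makes this visible (the sample data and variable names are made up for the illustration):

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data.
    my $data = "a\nb\nc\n";

    # for: the handle is already exhausted when the first iteration runs.
    open my $fh, '<', \$data or die "Cannot open: $!";
    my $eof_in_for;
    for my $line (<$fh>) {
        $eof_in_for = eof($fh) ? 1 : 0;
        last;
    }
    close $fh;

    # while: only the first line has been read at this point.
    open $fh, '<', \$data or die "Cannot open: $!";
    my $eof_in_while;
    while (my $line = <$fh>) {
        $eof_in_while = eof($fh) ? 1 : 0;
        last;
    }
    close $fh;

    print "eof during for: $eof_in_for, eof during while: $eof_in_while\n";
    ```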

      One thing is for certain: there is no reason to do @a = <$fh>; for (@a) unless you do something else with @a.
      That's not true. There's an advantage that
      @a = <$fh>; for (@a) { .. }
      gives you and that
      while (<$fh>) { .. }
      doesn't, which may be useful sometimes. Doing @a = <$fh> means you're done reading the file before processing it. If you have a lock on the file, you can then release it. But if you are doing
      while (<$fh>) { .. }
      you're only done reading the file after you've processed almost everything (and one typically releases a lock after processing everything). Now, most of the time, this doesn't matter because no one else needs the file (so you don't have locks, and the file doesn't get modified), or the processing goes fast (so there isn't much difference).

      But it goes too far to say "there's no reason to do @a = <$fh>; for (@a)". Sometimes, there is.

        If you have a lock on the file, you can then release it.

        I had considered that possibility. That would be

        @a = <$fh>; close $fh; for (@a) { .. }
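
        Spelled out with an actual lock, that might look something like the sketch below. The temporary file, the shared lock, and the uppercase "processing" are all assumptions made for the example:

        ```perl
        use strict;
        use warnings;
        use Fcntl qw(:flock);
        use File::Temp qw(tempfile);

        # Hypothetical input: a temporary file standing in for a shared data file.
        my ($tmp, $path) = tempfile(UNLINK => 1);
        print {$tmp} "one\ntwo\nthree\n";
        close $tmp or die "Cannot close temp file: $!";

        open my $fh, '<', $path or die "Cannot open $path: $!";
        flock $fh, LOCK_SH or die "Cannot lock $path: $!";
        my @a = <$fh>;    # read everything while holding the lock
        close $fh;        # closing the handle releases the lock

        # Processing happens after the lock is gone, so other processes
        # are not kept waiting while it runs.
        my @processed;
        for (@a) {
            chomp;
            push @processed, uc $_;
        }

        print "@processed\n";
        ```

        The lock is held only for the duration of the read, however long the loop afterwards takes.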

        There are reasons to load the file into an array. That's one of them, and I listed four others. But that's not the code the OP posted, and that's not the code I proclaimed should be avoided. You did not contradict me, or do you think I've already contradicted myself four times?
