lightoverhead has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I am puzzled about how to reset the <> operator in a while loop. For example, I have two levels of loops like this:
while (<FILE1>) {
    chomp;
    my @array1 = split;
    my $a1 = $array1[1];
    my $a2 = $array1[2];
    ......
    while (<FILE2>) {    # wrong, cannot iterate all the lines
        my @array2 = split;
        my $b1 = $array2[0];
        my $b2 = $array2[1];
        if ($a1 eq $b1) { ....... }
        else { ...... }
    }
}
The problem with the above code is that it cannot iterate over all the lines of file2. By the time the outer while loop reaches its second line, the <> operator of the inner while loop is already at the end of file2. Of course I could use a "for" loop instead of the second while loop. But is there a way to reset the position of the <> operator so that it re-iterates file2 for every line of file1 (the first while loop)? Thank you.

Replies are listed 'Best First'.
Re: how to reset the input operator in a while loop?
by GrandFather (Saint) on Sep 30, 2008 at 05:49 UTC

    Probably you are trying to do the wrong thing. You should almost never need to reread a file in a nested loop like that. Most likely what you need to do is read file2 into a hash then use hash lookups to check for matches in file1. Something like:

    use strict;
    use warnings;

    my $data1 = <<DATA;
    1 a
    2 b
    3 c
    DATA

    my $data2 = <<DATA;
    4 d
    1 x
    DATA

    my %data2Hash;

    open my $infile2, '<', \$data2;
    while (<$infile2>) {
        chomp;
        my ($key, $tail) = split ' ', $_, 2;
        $data2Hash{$key} = $tail;
    }
    close $infile2;

    open my $infile1, '<', \$data1;
    while (<$infile1>) {
        chomp;
        my ($key, $tail) = split ' ', $_, 2;
        if (exists $data2Hash{$key}) {
            print "Matched $key: $data2Hash{$key}, $tail\n";
        }
    }

    Prints:

    Matched 1: x, a

    Perl reduces RSI - it saves typing
      Thank you for your answer. I know this is not the right way to do it; sorry for the confusion. In fact, what I was trying to do is to compare each line of file1 with each line of file2, not just match them against each other. I had two ways to do it: first, open and close file2 for each line of file1, so that I can iterate over all the lines of file2 every time; second, build an array to store the items of file1 or file2 and then iterate over that. Using a hash should be fine too. But either way (opening/closing the file, or using an array/hash) has its own shortcoming: opening/closing the file will be slower (right?), and an array/hash will consume memory. These two files are very, very large. I recall reading somewhere that the position of the <> operator can be reset, so that every time one round of iteration is done it can be set back for another round. I am not sure whether such a method exists, but if it does, I could re-iterate the file without opening/closing files or building arrays/hashes. Thanks.

        The time to open and close the files is completely insignificant compared to the time to read them, especially if the files are large. You really don't want to do it that way!

        Tell us about the bigger picture. It is almost certain that there is a better way of achieving what you want to do than reparsing one of the files once for each line of another file. It may be that you need to sort the files first, or use a database, or extract key information, but whatever the technique, it will be much faster than the scheme you are currently considering.
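
        For instance, if both files were pre-sorted on the key column, a merge pass along the following lines reads each file only once. This is only a rough sketch, not tested against your data: the file names are placeholders, and it assumes each key appears at most once per file.

        use strict;
        use warnings;

        # Sketch only: merge two files that are already sorted on their first column,
        # e.g. with the system sort utility. File names are placeholders.
        open my $fh1, '<', 'file1.sorted' or die "file1.sorted: $!";
        open my $fh2, '<', 'file2.sorted' or die "file2.sorted: $!";

        my $line1 = <$fh1>;
        my $line2 = <$fh2>;
        while (defined $line1 && defined $line2) {
            my ($key1) = split ' ', $line1;
            my ($key2) = split ' ', $line2;
            if ($key1 lt $key2) {
                $line1 = <$fh1>;            # key present only in file1
            }
            elsif ($key1 gt $key2) {
                $line2 = <$fh2>;            # key present only in file2
            }
            else {
                print "matched key $key1\n"; # keys agree; compare the rest here
                $line1 = <$fh1>;
                $line2 = <$fh2>;
            }
        }

        Handling duplicate keys or richer output needs a little more bookkeeping, but the principle stays the same: one sequential pass over each file, no matter how large they are.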


        Perl reduces RSI - it saves typing
        what I was trying to do is to compare each line of file1 with each line of file2, not just match them against each other.

        That is not clear. How is "compare" different from "match"? What do each of those terms really mean, for your purposes? Show a couple examples of data from each file, and what sort of output you want with regard to those examples.

        Might there be duplicate lines within a given file? Do you need to keep track of the particular positions in one or both files when there is a "match" (or some particular result of "comparison"), or will it be enough just to list the data that matches/compares? Do you need to preserve or enforce a particular ordering in your output?

        If the files are "very huge", then it will be very important to be very clear about what you are really trying to accomplish; having the wrong task in mind, and/or using the wrong approach, can waste a "very huge" amount of time.

Re: how to reset the input operator in a while loop?
by JavaFan (Canon) on Sep 30, 2008 at 07:09 UTC
    You mean, you want to seek back to the beginning of the file? Assuming the file is seekable (if it's not, you cannot do what you want), you'd use the function seek to seek back to the beginning. perldoc -f seek will give you the details.
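
    For instance, a minimal sketch along the lines of the code in the question (file names and column indices are placeholders):

    use strict;
    use warnings;

    open my $fh1, '<', 'file1.txt' or die "file1.txt: $!";
    open my $fh2, '<', 'file2.txt' or die "file2.txt: $!";

    while (my $line1 = <$fh1>) {
        chomp $line1;
        my $a1 = (split ' ', $line1)[1];

        seek $fh2, 0, 0;                 # rewind file2 before each inner pass
        while (my $line2 = <$fh2>) {
            chomp $line2;
            my $b1 = (split ' ', $line2)[0];
            if (defined $a1 && defined $b1 && $a1 eq $b1) {
                # the lines share the key
            }
            else {
                # the lines differ
            }
        }
    }

    Bear in mind that this still reads file2 once for every line of file1, so for large files the hash or sorted-merge approaches suggested above will be far faster.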
      Assuming the file is seekable (if it's not, you cannot do what you want) …
      Why wouldn't GrandFather's approach of closing and re-opening the file work even if the filehandle is non-seekable?

      UPDATE: Thanks to JavaFan for a very polite answer to a very silly question.

        Because non-seekable usually means the data can be read only once. Examples of non-seekable handles are (named) pipes, STDIN (usually), and network sockets.
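
        If in doubt, you can test for it: seek returns false on a handle that cannot be rewound. A small sketch (the piped command is just an arbitrary example on a Unix-ish system):

        use strict;
        use warnings;

        # seek() fails on a handle that cannot be rewound, such as a pipe.
        open my $pipe, '-|', 'ls' or die "cannot run ls: $!";
        my $first_line = <$pipe>;
        if (seek $pipe, 0, 0) {
            print "handle is seekable\n";
        }
        else {
            print "handle is not seekable: $!\n";   # typically "Illegal seek"
        }
        close $pipe;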