sweeks has asked for the wisdom of the Perl Monks concerning the following question:

Hey there. I am just coming to terms with learning the Perl programming language and am trying to write a program for my physics degree. In my program I wish to read two data files and amalgamate them into one, depending upon matching data. I believe it is only possible to open one file and read it into an array. Does anybody know how I can have both files open and then use the 'IF x == y' command to write this matching data into a new file? Help for this new boy would be much appreciated.

janitored by ybiC: Retitle from "New Starter" for descriptiveness and searchability

Re: Reading from two files at once
by Vautrin (Hermit) on Feb 13, 2004 at 14:50 UTC

    You can have as many files or file handles open as you can think up names of variables for. I would suggest not slurping entire files into memory if there is a possibility they can be very large. I would suggest using a while loop, for example:

    use strict;
    use warnings;

    # Take the two file names as command-line arguments;
    # @ARGV contains all arguments passed in.
    my $first_file  = shift @ARGV;
    my $second_file = shift @ARGV;
    die "Usage: $0 first_file second_file\n"
        unless defined $first_file and defined $second_file;

    # Double-check that each file exists, can be read, and is text.
    for my $file ($first_file, $second_file) {
        die "Can't use $file: it is missing, unreadable, or not text\n"
            unless -e $file and -r $file and -T $file;
    }

    open(my $first,  '<', $first_file)  or die "Can't open $first_file because $!";
    open(my $second, '<', $second_file) or die "Can't open $second_file because $!";

    my $linenumber = 0;
    while (1) {
        # We can use an infinite loop because we have a last in here.
        # I could have compressed everything into the conditional (),
        # but that makes it harder to understand what is going on.
        my $line1 = <$first>;
        my $line2 = <$second>;
        # Test with defined so that a final line of "0" does not end the loop early.
        last unless defined $line1 and defined $line2;
        $linenumber++;
        if ($line1 eq $line2) {
            # Note we used eq: == is for numerical values, eq is for strings.
            print "Line $linenumber matches in both files!\n";
        }
    }
    close $first;
    close $second;
    print "We processed $linenumber lines\n";

    Note that in the above example I didn't use if ($line1 == $line2), but used if ($line1 eq $line2). The reason is that == is numerical equality and eq is string equality. Also note that you're no longer slurping up entire files, that is, reading them into arrays. Avoiding that whenever you can is important, because large enough files will cause your box to gnash and grind its swap space.
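
    The difference between == and eq can be seen in a small sketch. The two sample values are hypothetical, chosen only because they are equal as numbers but different as strings:

```perl
use strict;
use warnings;

# Hypothetical sample values, as they might look after chomp.
my $x = "1.0";
my $y = "1";

# == compares the values as numbers: 1.0 and 1 are equal.
my $numeric_match = ($x == $y) ? "yes" : "no";

# eq compares character by character: "1.0" and "1" differ.
my $string_match = ($x eq $y) ? "yes" : "no";

print "numeric: $numeric_match, string: $string_match\n";
```

    For physics data held as numbers, == is probably the comparison wanted; for comparing whole lines of text, eq is the safer choice.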

    Also note that if your program is called like:

    % myscript.pl foo bar

    @ARGV will contain ("foo", "bar"). Also note that -e followed by a filename checks whether the file exists, -r checks whether it is readable by the effective UID/GID (-R checks readability by the real UID/GID), and -T checks whether the file looks like a text file (Perl guesses from the file's contents).

    Update: Thanks Limbic~Region for pointing out that you can only have as many file handles open as your system allows. According to Limbic~Region, this can be 256 on some combinations of Perl / Solaris.


Re: Reading from two files at once
by ChrisR (Hermit) on Feb 13, 2004 at 14:49 UTC
    It's no problem to have multiple files open at the same time. Just use a different filehandle:
    open(FILE1, "/path/to/file")         or die "Can't open /path/to/file: $!";
    open(FILE2, "/path/to/another/file") or die "Can't open /path/to/another/file: $!";
    I'm not sure about iterating through two files at once though. Perhaps a more experienced monk knows of a way to do that. Without seeing some sample data or the criteria on which you will match records, it's hard to offer much help. However, I would probably use a hash instead of an array. Hashes are much better for that type of comparison.
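
    One way the hash approach might look, as a rough sketch only. The record layout (whitespace-separated fields, with the first field as the key to match on) and the sample data are assumptions, since the original post shows neither; in-memory strings stand in for the two real files:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two files.
my $data_a = "1 0.50\n2 0.75\n3 1.25\n";
my $data_b = "2 10.0\n3 20.0\n4 30.0\n";

open(my $file_a, '<', \$data_a) or die "Can't open first data set: $!";
open(my $file_b, '<', \$data_b) or die "Can't open second data set: $!";

# Remember the rest of each record from the first file, keyed by its first field.
my %first;
while (my $line = <$file_a>) {
    chomp $line;
    my ($key, $rest) = split ' ', $line, 2;
    next unless defined $key;    # skip blank lines
    $first{$key} = $rest;
}

# Stream the second file and keep the records whose key was seen in the first.
my @merged;
while (my $line = <$file_b>) {
    chomp $line;
    my ($key, $rest) = split ' ', $line, 2;
    next unless defined $key and exists $first{$key};
    push @merged, "$key $first{$key} $rest";
}

print "$_\n" for @merged;
```

    Only the first file is held in memory; the second is read one line at a time. Hash keys are compared as strings, so if the keys are numbers written in different ways ("2" and "2.0"), they would need normalising first, for instance with $key + 0.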

    Show us some sample data and what you have tried so far, and I'm sure you will get more help.

    By the way, welcome to The Monastery.
Re: Reading from two files at once
by EvdB (Deacon) on Feb 13, 2004 at 15:02 UTC
    Not related to your question, but rather to your field: I once wrote a neutralino detector simulator in perl for my physics degree...

    This is some advance ammo for use against skeptics on your course. When someone complains that perl is too slow to use, look at Inline::C or similar. Also, don't try to work out which bits of the code to optimise up front; instead, spend your time thinking about algorithms and then find your slow spots using something like Devel::Profile.

    Perl is really excellent for modelling and simulations because it is fast to develop in and flexible. This is important as the chances are that any simulation may well take hours to run, so you will run them overnight. Perl might be slower, but does it matter that the simulation finished at 3am rather than 1am?

    And finally, for when you venture off into MathsLand, don't forget to look into PDL. Good Luck, and don't forget to ask here if you get stuck.

    --tidiness is the memory loss of environmental mnemonics

      "I once wrote a neutralino detector simulator in perl for my physics degree..."

      Is that the same as a neutrino? Or is that like a neutralectron :)

      Thanks for reminding me about PDL. flyingmoose needs to play with PDL too. There are just too many hardcore modules on my list that I need to play with in my space time. Math rocks.

      off-topic: I promise not to write any Perl Maple bindings anytime soon. Anybody else have to use this anal-retentive "language" in college? We had to use this beast all the way through Calculus III. No, flux integrals through fifth dimensional hypersolids cannot be graphed :) I've had some exposure to Matlab and very little to Mathematica, but none of them seemed as remotely annoying as Maple.

        Yep, neutralinos - look, there are even pictures out there.

        --tidiness is the memory loss of environmental mnemonics

Re: Reading from two files at once
by flyingmoose (Priest) on Feb 13, 2004 at 15:00 UTC

    Tackling programming for the first time for a final project might be a little daunting, I agree.

    I have no problem answering newbie questions, but this is one that is covered in *all* of the introductory texts, and usually really early on. It's the kind of answer you can find in 15 seconds by looking up a term in the index of a book.

    May I recommend O'Reilly's "Learning Perl" and "Programming Perl", the llama and camel books, respectively. If you are to write a decent program, you need to have some initiative to learn the obvious things by yourself. Really, they are both excellent texts.

    If you get a book, read it, and work some of the examples, you'll write a much better program. Then, when you get questions that can't be found by looking up a term in the index of the book, ask for help here, and we'll be glad to help further. You can finish your program by getting outside help, maybe, but you won't fully understand it unless you take some initiative.

Re: Reading from two files at once
by Abigail-II (Bishop) on Feb 13, 2004 at 17:07 UTC
    I believe it is only possible to open one file and read it into an array.
    Stop right there. Why is it that so many bad programmers always read in files into arrays? Who is teaching them that? Which books promote this? Who is giving the courses that do this?

    The need to read in a file into an array is uncommon. Often, you go through a file line-by-line, with no need to keep more than a single line in memory. Sometimes, you want the entire file in memory - but even then, in most cases, you do not want an array. You'd slurp it into a single string.
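
    The two approaches can be sketched side by side. The sample contents are hypothetical, and an in-memory string stands in for a file on disk so that the sketch is self-contained:

```perl
use strict;
use warnings;

# Hypothetical contents; an in-memory string stands in for a file on disk.
my $data = "alpha\nbeta\ngamma\n";

# Line by line: only one line is held in memory at a time.
open(my $in, '<', \$data) or die "Can't open data: $!";
my $count = 0;
while (my $line = <$in>) {
    $count++;
}
close $in;

# Whole file as one string: undefine $/ (the input record separator)
# inside a block so the change does not leak into other code.
open($in, '<', \$data) or die "Can't open data: $!";
my $content = do { local $/; <$in> };
close $in;

print "Read $count lines, then ", length($content), " characters in one string\n";
```

    With a real file, the only change would be passing a file name to open in place of \$data.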

    Does anybody know how I can have both files open and then use the 'IF x == y' command to write this matching data into a new file.
    You can open multiple files by using the open command multiple times. But I haven't the faintest idea how you would use IF x == y (which doesn't look like Perl syntax) to write to a file. I usually use print to write to a file.
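
    In Perl terms, the combination might look roughly like the sketch below: a lowercase if around a print to an output filehandle. The pairs of values are hypothetical, and an in-memory string stands in for the new file:

```perl
use strict;
use warnings;

# Hypothetical values to compare; in a real program they would come from the input files.
my @pairs = ([1, 1], [2, 3], [4.0, 4]);

# An in-memory string stands in for the output file; a real program would
# pass a file name instead of \$output.
my $output = '';
open(my $out, '>', \$output) or die "Can't open output: $!";

for my $pair (@pairs) {
    my ($x, $y) = @$pair;
    # Perl's if is lowercase, and print takes the filehandle before the text to write.
    if ($x == $y) {
        print {$out} "$x matches $y\n";
    }
}
close $out or die "Can't close output: $!";

# Show what was written.
print $output;
```

    The braces around $out are optional here; they just make it obvious that $out is the filehandle and not part of the text being printed.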

    Abigail