How to work on a large Data file line by line

by Anonymous Monk
on Jul 07, 2008 at 11:11 UTC ( [id://695951]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to work on a data file line by line. Is it possible to work on the input file without actually opening it? I need this because the input file is on the order of gigabytes in size. Please help. Regards, Adi

Replies are listed 'Best First'.
Re: How to work on a large Data file line by line
by MidLifeXis (Monsignor) on Jul 08, 2008 at 15:03 UTC

    While you can open it, unless your perl is compiled to handle large files, you may have problems reading it. Search for "large files" here and see what surfaces.

    Update: This moldy oldie seems to be spot on.
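
A quick way to check whether a given perl was built with large-file support is to look at its build configuration. This is only a sketch: the helper name `large_file_support` is made up for illustration, and it assumes the core `Config` module's `uselargefiles` and `lseeksize` entries describe the perl you are running.

```perl
use strict;
use warnings;
use Config;    # core module exposing this perl's build-time configuration

# Hypothetical helper: returns (1 or 0, size of a file offset in bytes).
# 'uselargefiles' is 'define' when perl was built with large-file support;
# 'lseeksize' of 8 means file offsets are 64-bit.
sub large_file_support {
    my $enabled = defined $Config{uselargefiles}
               && $Config{uselargefiles} eq 'define';
    my $offset_bytes = $Config{lseeksize} || 0;
    return ($enabled ? 1 : 0, $offset_bytes);
}

my ($enabled, $offset_bytes) = large_file_support();
printf "large file support: %s, file offset size: %d bytes\n",
    ($enabled ? 'yes' : 'no'), $offset_bytes;
```

The same information is available from the command line with `perl -V:uselargefiles`.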

    --MidLifeXis

Re: How to work on a large Data file line by line
by Anonymous Monk on Jul 08, 2008 at 12:47 UTC
    You can open the file without having all (however many) GB read into memory. Opening the file just gives you access to it; what you do from that point is up to you. You can read a file line by line this way:

        open my $fh, '<', 'my_huge_file.dat'
            or die "Cannot open my_huge_file.dat: $!";
        while (<$fh>) {
            # reads one line of the file into $_
            # do something
        }
        close $fh;
      I agree with AM. I've found that opening the file and reading it line by line (operating on one line, then reading in the next) works well in this case. However, this is typically only useful if all the data you are working with is on that line. If the data is spread across several lines, that's still not a big problem. However, I'd go with MidLifeXis' suggestion if you need to act on the whole file at once. FWIW.

      tubaandy
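
For the case where one record spans several lines, here is a minimal sketch. It assumes, purely for illustration, that records are separated by blank lines, so Perl's paragraph mode (`$/ = ''`) can hand back one record per read. The helper name `count_lines_per_record` and the sample data are hypothetical, and the in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: reads blank-line-separated records from a filehandle,
# one record at a time, and returns the number of lines in each record.
sub count_lines_per_record {
    my ($fh) = @_;
    local $/ = '';    # paragraph mode: each read returns one whole record
    my @counts;
    while (my $record = <$fh>) {
        chomp $record;    # in paragraph mode this strips all trailing newlines
        my @lines = split /\n/, $record;
        push @counts, scalar @lines;
    }
    return @counts;
}

# Demonstration on an in-memory handle; a real file is read the same way.
my $data = "id: 1\nname: foo\n\nid: 2\nname: bar\nnote: baz\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @counts = count_lines_per_record($fh);
close $fh;
print "record sizes: @counts\n";
```

Only one record is held in memory at a time, so this scales to large files as long as individual records stay small.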
Re: How to work on a large Data file line by line
by washer (Initiate) on Jul 09, 2008 at 20:13 UTC
    I'm not sure what you mean by "work on the input file", but in general, no: you cannot read, write, or otherwise do anything to a file's contents without opening it. (OK, you can do file tests and stat without opening it, but I don't think that's what you are talking about.) If you can explain a bit more about what you are trying to do and why you see opening a large file as a problem, we might be able to provide a more complete answer.
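
To illustrate the file tests and `stat` mentioned above, here is a small sketch that reports a file's size and modification time without ever opening it. The helper name `describe_file` is hypothetical, and the scratch file made with `File::Temp` exists only to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: gathers facts about a file without opening it.
# Returns undef (in scalar context) if the path does not exist.
sub describe_file {
    my ($path) = @_;
    return unless -e $path;          # -e: does the file exist?
    my @st = stat $path;             # stat reads metadata, not file contents
    return {
        size     => $st[7],          # size in bytes
        mtime    => $st[9],          # last modification time (epoch seconds)
        readable => (-r $path ? 1 : 0),
    };
}

# Demo only: create a small scratch file to inspect.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close $tmp_fh;

my $info = describe_file($tmp_path);
printf "%s: %d bytes, readable: %s\n",
    $tmp_path, $info->{size}, ($info->{readable} ? 'yes' : 'no');
```

This tells you about the file, but reading any of its contents still requires `open`.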
Re: How to work on a large Data file line by line
by sundialsvc4 (Abbot) on Jul 08, 2008 at 18:27 UTC

    The files that you access can be arbitrarily large: you pay no performance penalty no matter how large the file may be, provided that you remain fully aware of just what you are asking the computer to do! You need to remain mindful of just how your Perl code translates to requests that are issued to the operating system.

    For example, “reading the file into memory” is not possible...

    The operating system will handle many details for you. For example, it will probably figure out on its own that you are reading the file “sequentially,” and it will buffer large amounts of data in advance of your anticipated need to read it. Perl will dole out the data to you “one line at a time,” but it's actually carving those lines out of a rather large buffer.

    As noted in the referenced post (from ~2001), files larger than 2GB might be problematic in some older systems, since a “file position” was at one time represented by a 32-bit signed integer, but that constraint is probably long-gone for you by now.
