Re: join on 6 huge files

by SageMusings (Beadle)
on Jun 12, 2004 at 03:49 UTC (#363554)

in reply to join on 6 huge files


If I understand your problem correctly (your description was a bit hazy), you simply do not want to hold these huge files in memory. Right?

I would open all the files as streams, much in the spirit of that old Unix standby "sed". Go through the files line by line, as if you were running a batch process: read one line from each input stream, munge those lines on the spot rather than loading everything at once, then concatenate the results and write them to the destination file. It's quick, simple, and elegant.
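Here is a rough sketch of what I mean, written in Python for brevity (the same loop is easy to write in Perl with one filehandle per file). It assumes the files are line-aligned, so that line N of every file belongs to the same record. If you need to match on a key instead, the files would have to be sorted on that key first. The function name, the tab separator, and the default munge step are all made up for illustration.

```python
from contextlib import ExitStack


def strip_newline(line):
    """Default per-line munge (hypothetical): drop the trailing newline."""
    return line.rstrip("\n")


def join_files_streaming(input_paths, output_path, sep="\t", munge=strip_newline):
    """Join line-aligned files one line at a time.

    Only one line per input file is held in memory at any moment.
    Stops at the end of the shortest file (an assumption; adjust to taste).
    Returns the number of lines written.
    """
    written = 0
    with ExitStack() as stack:
        # Open every input as a stream; ExitStack closes them all on exit.
        inputs = [
            stack.enter_context(open(path, "r", encoding="utf-8"))
            for path in input_paths
        ]
        out = stack.enter_context(open(output_path, "w", encoding="utf-8"))

        # zip() pulls exactly one line from each file per iteration.
        for lines in zip(*inputs):
            out.write(sep.join(munge(line) for line in lines) + "\n")
            written += 1
    return written
```

You would call it with something like `join_files_streaming(["a.txt", "b.txt", "c.txt"], "joined.txt")`, where the file names are placeholders. Memory use depends on the longest line, not on the size of the files.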

I have written several tools that take this sed-like tack. Some of my files are as large as 12MB, and the approach never chokes.
