As others said, some sample data would be helpful. But looking at your working-but-slow script, I see that you're looping completely through file2 for every line of file1, so the work grows with the product of the two line counts. That's going to be brutal if file2 is very large. You could speed it up some by at least breaking out of your loop through file2 (with last) once you find your match.
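Since I haven't seen your data, here is only a rough sketch of the early-exit idea. It assumes tab-separated lines with the key in the first field, and find_match is a name I made up for illustration, not something from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: scan the file2 handle for the first line whose
# first tab-separated field equals $key (the field layout is an assumption).
sub find_match {
    my ($key, $fh2) = @_;
    seek $fh2, 0, 0 or die "Cannot rewind file2: $!";   # start from the top each call
    my $found;
    while (my $line = <$fh2>) {
        chomp $line;
        my ($id, $rest) = split /\t/, $line, 2;
        next unless defined $id;        # skip blank lines
        if ($id eq $key) {
            $found = $rest // '';
            last;                       # stop reading file2 once we have the match
        }
    }
    return $found;                      # undef when no line matched
}
```

You'd call find_match($key, $fh2) once per line of file1. It still rescans file2 from the top every time, so it only saves you the part of file2 that comes after each match.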

Better would be to first read file2 into a hash, with the first field (the one you match your counter against) as the keys, and then check that hash for each line of file1. That way each file is read only once. If file2 is so large that reading it into a hash would present memory problems, you could tie the hash to a DBM file, so the DBM library can keep as much of it on disk as necessary.
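Something along these lines, as a sketch only. It again assumes tab-separated lines keyed on the first field in both files; build_lookup and merge_files are hypothetical names, and SDBM_File is just my pick because it ships with Perl (any DBM module you have installed should work the same way through tie):

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR and O_CREAT for the optional DBM tie
use SDBM_File;    # core DBM module; an assumption, swap in another if you prefer

# Read file2 once and index it by its first tab-separated field.
# If $dbm_path is given, the hash is tied to a DBM file on disk
# instead of living entirely in memory.
sub build_lookup {
    my ($fh2, $dbm_path) = @_;
    my %lookup;
    if (defined $dbm_path) {
        tie %lookup, 'SDBM_File', $dbm_path, O_RDWR | O_CREAT, 0666
            or die "Cannot tie $dbm_path: $!";
    }
    while (my $line = <$fh2>) {
        chomp $line;
        my ($id, $rest) = split /\t/, $line, 2;
        next unless defined $id;            # skip blank lines
        $lookup{$id} = $rest // '';
    }
    return \%lookup;
}

# Walk file1 once; for each line whose key is in the lookup,
# print the file1 line followed by the matching file2 fields.
sub merge_files {
    my ($fh1, $lookup, $out) = @_;
    while (my $line = <$fh1>) {
        chomp $line;
        my ($id) = split /\t/, $line, 2;
        next unless defined $id && exists $lookup->{$id};
        print {$out} join("\t", $line, $lookup->{$id}), "\n";
    }
}
```

Open both files, call build_lookup on the file2 handle, then merge_files with the file1 handle and \*STDOUT. One caveat with SDBM_File: it limits the size of each key/value pair to roughly 1 KB, so if your file2 lines are long, another DBM module such as DB_File may be a better fit.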

In reply to Re: Merging larges files by columns by aaron_baugher
in thread Merging larges files by columns by ScottJohn
