dave_the_m's advice is spot on. I did have one thought, though. If IP-1 and IP-2 always appear in the same order (any line containing both IPs always presents the same one first), you could set the special variable $/ (the input record separator) to the string for IP-2. That way, instead of reading lines, you'll be reading records that end with the second critical IP address. Then all you have to do is scan each record to see whether the first critical IP address appears after the nearest preceding newline character. If so, you've got a match.
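Here is a minimal sketch of that idea, not a tested solution for the original log format. The two addresses, the sample data, and the name count_matches are all made up for illustration. It also matches plain substrings, so 10.0.0.1 would also hit inside 10.0.0.10, and the same goes for the separator itself; real data would likely need stricter boundaries.

```perl
use strict;
use warnings;

# Count records where IP-1 appears on the same line as, and before, IP-2.
# count_matches is a hypothetical helper name, not from the thread.
sub count_matches {
    my ($fh, $ip1, $ip2) = @_;
    my $count = 0;

    local $/ = $ip2;    # each read now ends at the next occurrence of IP-2

    while (my $record = <$fh>) {
        # chomp strips a trailing $/ and returns how many characters it
        # removed. Zero means this is the tail of the file with no IP-2.
        next unless chomp $record;

        # Keep only the text after the nearest preceding newline, i.e. the
        # part of the current line that comes before IP-2.
        my $line_head = substr($record, rindex($record, "\n") + 1);

        $count++ if index($line_head, $ip1) >= 0;
    }
    return $count;
}

# Assumed sample data: two lines contain both addresses in order.
my $data = join '',
    "a 10.0.0.1 -> 192.168.1.7 ok\n",
    "b 10.0.0.2 -> 192.168.1.7 ok\n",
    "c 10.0.0.1 -> 172.16.0.9 ok\n",
    "d 10.0.0.1 -> 192.168.1.7 end\n";

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print "matches: ", count_matches($fh, '10.0.0.1', '192.168.1.7'), "\n";
close $fh;
```

Note that line "c" is read as part of a larger record, but its IP-1 sits before the last newline in that record, so it is correctly ignored.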

Why would this be theoretically advantageous? Depending on how often IP-2 shows up, it may result in fewer iterations through the while loop. You're still reading the whole file, but you only check for IP-1 once you already know that half of the condition has been met.

When reading a file there is an implicit check happening: behind the scenes, perl looks for $/ to end each record. You may as well use that implicit check to your advantage.

Of course, this adds complexity if you also need to capture information that comes after the second IP address, because that text ends up at the start of the next record. At that point it's hard to say whether the extra logic needed would negate any minor advantage this approach had in the first place. I guess that means YMMV (your mileage may vary).

Dave

In reply to Re: Parsing Large Text Files For Performance by davido
in thread Parsing Large Text Files For Performance by bigbot
