Re: FAST way to pull multiple lines around a keyword

If speed is of the essence and only relatively few of your files contain the keywords you are looking for, then I would go for a quick'n'dirty check for these keywords without bothering about the surrounding lines. Just save the names of the matching files in a file and have another script extract the keywords and surrounding lines from the files listed.
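A minimal sketch of that two-pass idea, for illustration only. The keyword list and the sub names files_with_keywords and lines_with_context are made up here, one line of context on either side is an assumption, and both passes live in one script rather than two:

```perl
use strict;
use warnings;

# Hypothetical keyword list; substitute your own.
my @keywords    = ('foo', 'bar');
my $alternation = join '|', map { quotemeta($_) } @keywords;
my $quick_re    = qr/$alternation/;

# Pass 1: cheap check. Return only the files that mention a keyword at all.
sub files_with_keywords {
    my ($re, @files) = @_;
    my @hits;
    for my $file (@files) {
        my $fh;
        unless (open $fh, '<', $file) {
            warn "Cannot open $file: $!";
            next;
        }
        while (my $line = <$fh>) {
            if ($line =~ $re) {
                push @hits, $file;
                last;    # one hit is enough, stop reading this file
            }
        }
        close $fh;
    }
    return @hits;
}

# Pass 2: expensive extraction, run only on the files from pass 1.
# Returns each matching line joined with the line before and after it.
sub lines_with_context {
    my ($re, $file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @lines = <$fh>;
    close $fh;
    my @blocks;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ $re;
        my $from = $i > 0       ? $i - 1 : 0;
        my $to   = $i < $#lines ? $i + 1 : $#lines;
        push @blocks, join '', @lines[$from .. $to];
    }
    return @blocks;
}
```

To split this into two scripts as described above, the first would print the result of files_with_keywords to a file, one name per line, and the second would read that list and call lines_with_context on each entry.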

Also note that alternations are rather slow in a regexp; by using Regexp::Assemble you may be able to construct a regexp that runs faster.
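Something along these lines, as a rough sketch. Regexp::Assemble comes from CPAN rather than core Perl, and the keywords here are placeholders. Whether the assembled regexp is actually faster for your data is something to benchmark rather than assume:

```perl
use strict;
use warnings;
use Regexp::Assemble;    # CPAN module, not part of core Perl

# Hypothetical keywords that share a common prefix.
my @keywords = qw(foobar foobaz fooqux);

# add() expects regexp patterns, so keywords containing metacharacters
# would need escaping first; these plain words do not.
my $ra = Regexp::Assemble->new;
$ra->add(@keywords);

# re() returns a single compiled regexp that merges the common parts
# of the patterns instead of listing them as separate alternatives.
my $re = $ra->re;

my $found = 'a line mentioning foobaz' =~ $re ? 1 : 0;
```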

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James



Re^2: FAST way to pull multiple lines around a keyword
by rizzy (Sexton) on Oct 19, 2010 at 19:10 UTC

Thanks. I think I may do what you suggest.

One thing I did which DID speed things up very much was to add a line break (\n) at the beginning and end of the match. That way there is only one possible match (for each keyword). What it was doing before was taking every possible combination of characters from the previous line and the subsequent line (i.e., .+) and then picking the longest one. Including the line break explicitly gives it only one choice.
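Roughly what I mean, as a sketch. The keyword and the sample text are made up, and the pattern is a simplified stand-in for my real one:

```perl
use strict;
use warnings;

my $keyword = 'needle';    # hypothetical keyword

# Sample text. The leading "\n" is added so that a match whose
# previous line is the very first line can still be anchored.
my $text = "\n"
    . join("\n", 'alpha', 'beta', 'has needle in it', 'gamma', 'delta')
    . "\n";

# The \n at each end pins the capture to whole lines: previous line,
# keyword line, next line. Without /s, "." never crosses a line break.
my $re = qr/\n(.+\n.*\Q$keyword\E.*\n.+)\n/;

my @blocks;
while ($text =~ /$re/g) {
    push @blocks, $1;
}
```

One caveat: the trailing \n is consumed by each match, so two keyword lines close together will not both be reported by this loop.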
