Re: Searching Array To Hold RegEx Stack Is Order Dependant

~~David~~:

Others have already suggested ways to speed up your search. I'm just going to throw in a technique I occasionally find useful when dealing with large files. Sometimes you can easily tell the different types of data in your file apart, so you can take a quick pass over the file, discard the obviously uninteresting stuff, and build a much smaller extract that fits in memory. You can then process that extract. For example, suppose your file looked something like this:

This header section contains a bunch of left-justified
lines that are all shorter than 60 characters. Only the
headers and summary lines are interesting.

  Col1   Col2   Col3  Notes
  ----   ----   ----  -------------
     1    123    123  Foo bar baz
     2    741    200  Barbaz Foobar
     3    100      0  Boofar fazboo
         Total:  323

Next header section, etc.

So you could discard every line unless it is left-justified and shorter than 60 characters, or it contains 'Total:'. You can then use the faster in-memory techniques to dig through the extract. Something like so:

# Assumes $INF is a filehandle already opened for reading on the big file
my @extract;
while (<$INF>) {

    # Ignore useless lines
    next if length($_) > 60;
    next unless /^(\w|\s*Total:)/;

    push @extract, $_;
}

# Now you can process @extract for the interesting stuff
print @extract;
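If you want to try the idea without a real file, here is a minimal self-contained sketch of the same filter. The sub name extract_interesting and the in-memory $sample string are my own assumptions for illustration, not anything from your code, and it chomps each line first so the length test means "shorter than 60 characters" regardless of line endings:

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): keep only header lines
# (left-justified, under 60 characters) and 'Total:' summary lines.
sub extract_interesting {
    my ($fh) = @_;
    my @extract;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if length($line) >= 60;                 # too long to be a header
        next unless $line =~ /^(?:\w|\s*Total:)/;    # header or summary line
        push @extract, $line;
    }
    return @extract;
}

# Sample data held in a string so the sketch runs without a real file.
my $sample = <<'END';
This header section contains a bunch of left-justified
lines that are all shorter than 60 characters. Only the
headers and summary lines are interesting.

  Col1   Col2   Col3  Notes
  ----   ----   ----  -------------
     1    123    123  Foo bar baz
     2    741    200  Barbaz Foobar
     3    100      0  Boofar fazboo
         Total:  323

Next header section, etc.
END

# Open the string as an in-memory filehandle, standing in for the big file.
open my $INF, '<', \$sample or die "Cannot open in-memory file: $!";
my @extract = extract_interesting($INF);
close $INF;

# Only the headers and the summary line should survive.
print "$_\n" for @extract;
```

On the sample above, the table rows and the column headings are dropped because they start with whitespace and are not 'Total:' lines, so the extract is a small fraction of the input. For a real file you would swap the in-memory open for an ordinary open on the file name.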

...roboticus

Time flies like an arrow ... fruit flies like a banana.

--Groucho Marx
