in reply to parsing large files for something
The following may be what you are looking for, although it allows the match and condition keywords to be on the same line. If that's not a requirement, the code could be simplified somewhat.
#!/usr/bin/perl
use strict;
use warnings;

my %matches = (I => {cond => 'megabyte'}, code => {cond => 'I'},);
my $match = "\\b" . join("\\b|\\b", keys %matches) . "\\b";
my %conds;
my $condMatch;

while (defined(my $line = <DATA>)) {
    my @segments = split /(?=$match)/, $line;

    for my $segment (@segments) {
        while ($segment =~ /($match)/g) {
            my $cond = $matches{$1}{cond};

            if (exists $conds{$cond}) {
                delete $conds{$cond};
            } else {
                $conds{$cond} = $line;
            }

            $condMatch = join "\\b|\\b", keys %conds;
            $condMatch = "\\b$condMatch\\b" if $condMatch;
        }

        if ($condMatch && $segment =~ /($condMatch)/) {
            print $conds{$1};
            delete $conds{$1};
        }
    }
}

__DATA__
Suppose I want to find something, that is present multiple times in a large
(several hundred megabyte) file. However, I only want it if something else
exists after it but before the next occurrence of the thing I am looking for.
For instance the thing I am looking for might be a set of code numbers and the
condition that I will use to decide whether I want the code may be 1 or more lines further into the file but before the next occurrence of a code number.
What is the best (most elegant) way to approach this problem?
Chet
Prints:
Suppose I want to find something, that is present multiple times in a large
For instance the thing I am looking for might be a set of code numbers and the
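For what it's worth, here is a rough sketch of what the simplified version might look like if a keyword and its condition word are never on the same line. It is not a drop-in replacement for the script above: it assumes that a repeated keyword replaces the pending line rather than cancelling it, and the sub name find_conditional_matches, the rule code => 'keep' and the sample text are all made up for illustration. It reads from a filehandle one line at a time, so memory use should stay small even for a very large file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script). Assumes a keyword and
# its condition word never share a line. Returns every line that holds a
# keyword whose condition word turns up on a later line, before that keyword
# occurs again.
sub find_conditional_matches {
    my ($rules, $fh) = @_;    # $rules: keyword => condition word
    my %pending;              # keyword => line still waiting for its condition
    my @found;

    while (defined(my $line = <$fh>)) {
        for my $key (sort keys %$rules) {
            my $cond = $rules->{$key};

            # Condition seen while a line is pending: keep that line.
            if (exists $pending{$key} && $line =~ /\b\Q$cond\E\b/) {
                push @found, delete $pending{$key};
            }

            # A fresh keyword occurrence starts (or replaces) the pending line.
            $pending{$key} = $line if $line =~ /\b\Q$key\E\b/;
        }
    }

    return @found;
}

# Made-up sample text, read through an in-memory filehandle.
my $sample = join '', map {"$_\n"}
    'code 100', 'some detail', 'keep this one',
    'code 200', 'nothing here',
    'code 300', 'detail', 'keep';

open my $in, '<', \$sample or die "Can't open in-memory file: $!";
print find_conditional_matches({code => 'keep'}, $in);
close $in;
```

With that sample it prints the "code 100" and "code 300" lines; "code 200" is dropped because the next code line arrives before any "keep". For a real file, open it in the usual way and pass that handle instead.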