It's a good query.
Currently I can't answer you, as I'd need to look into the grep source for that. For now I'm just grepping the file like this:

    grep "12345$" myfile

Same for the count:

    wc -l myfile

I have another piece of Perl code that does what you are perhaps suggesting: it first loads all the lines into memory, then greps them. Unfortunately, the result is worse than the line-by-line attempt (2.47 s vs 8.33 s). Here is the code used for this test (on a reduced set, 200 MB):
    open (FH, '<', "../Tests/10-million-combos.txt") or die "Cannot open file: $!";
    print "Loading the file...\n";
    while (<FH>) {
        push (@_file_to_parse, $_);
    }
    print "Counting the file...\n";
    $NumberOfLine = @_file_to_parse;
    print "Searching 123456\$...\n";   # escape the $ so Perl does not interpolate $.
    @_result = grep { /123456$/ } @_file_to_parse;
    $NumberOfResult = @_result;
    print "$NumberOfResult - $NumberOfLine\n";
    close FH;
In reply to Re^4: How to optimize a regex on a large file read line by line ?
by John FENDER
in thread How to optimize a regex on a large file read line by line ?
by John FENDER