in reply to Re^3: How to optimize a regex on a large file read line by line ?
in thread How to optimize a regex on a large file read line by line ?
It's a good query.
Currently I can't answer you, as I'd need to take a look into the grep source for that. At the moment I'm just grepping the file like this:

```
grep "12345$" myfile
```

Same for the count:

```
wc -l myfile
```

I have another piece of Perl code that may do what you are suggesting: it first loads all the lines into memory, then greps them. Unfortunately, the result is worse than the line-by-line attempt (2.47s vs 8.33s). Here is the code used for this test (on a reduced data set, 200 MB):
```perl
# Slurp every line of the test file into an array, then grep it in one pass.
open (FH, '<', "../Tests/10-million-combos.txt") or die "Cannot open file: $!";
print "Loading the file...\n";
while (<FH>) {
    push (@_file_to_parse, $_);
}
print "Counting the file...\n";
$NumberOfLine = @_file_to_parse;
print "Searching 123456\$...\n";
@_result = grep { /123456$/ } @_file_to_parse;
$NumberOfResult = @_result;
print "$NumberOfResult - $NumberOfLine\n";
close FH;
```
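For comparison, here is a minimal sketch of the line-by-line variant mentioned above (the file name and pattern are taken from the test code; the rest is just one way such a loop could be written). It counts total and matching lines in a single pass without holding the whole file in memory:

```perl
use strict;
use warnings;

# Line-by-line variant: count total lines and matching lines in one pass,
# without keeping the file in memory (same file and pattern as the test above).
open my $fh, '<', '../Tests/10-million-combos.txt' or die "Cannot open file: $!";

my ($lines, $matches) = (0, 0);
while (my $line = <$fh>) {
    $lines++;
    $matches++ if $line =~ /123456$/;
}
close $fh;

print "$matches - $lines\n";
```

Counting in the same loop also avoids building a second large array just to take its size.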
Replies are listed 'Best First'.

- Re^5: How to optimize a regex on a large file read line by line ? by LanX (Saint) on Apr 16, 2016 at 19:33 UTC
- Re^5: How to optimize a regex on a large file read line by line ? (timing) by LanX (Saint) on Apr 16, 2016 at 20:51 UTC
- Re^5: How to optimize a regex on a large file read line by line ? by LanX (Saint) on Apr 16, 2016 at 16:52 UTC
  - reply by John FENDER (Acolyte) on Apr 16, 2016 at 18:05 UTC