Re^7: How to optimize a regex on a large file read line by line ?

12.6 min on my side with a newer Perl, same distribution as yours:

:perl -v

This is perl 5, version 22, subversion 1 (v5.22.1) built for MSWin32-x64-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2015, Larry Wall

Binary build 2201 [299574] provided by ActiveState http://www.ActiveState.com
Built Jan  4 2016 12:12:58

Could you give me your timing with this code and the same file (http://mab.to/tbT8VsPDm)? Run it with perl demo.pl:

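open(my $fh, '<', "../Tests/10-million-combos.txt") or die "Cannot open file: $!";
my $counter2 = 0;
while (<$fh>) {
    # count the lines that end in 123456
    ++$counter2 if /123456$/;
}
print "\n";
# $. still holds the number of the last line read, as the handle is not closed yet
print "Num. Line : $. - Occ : $counter2\n";
close $fh;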

Thanks.


Re^8: How to optimize a regex on a large file read line by line ? by poj (Abbot) on Apr 16, 2016 at 20:28 UTC
It took 7 minutes with your file. It seems to be related to the line endings not being the normal ones for Windows (they are LF only). After I 'processed' your file with this code, it took less than 1 minute to scan.

#!perl
use strict;
my $t0 = time;
open FH, '<', "dict.txt" or die "$!";
open OUT,'>','dict1.txt' or die "$!";
while (<FH>) {
    print OUT $_;
}
close FH;
print time-$t0;

Original   Num. Line : 185866729 - Occ : 14900   421 secs
Converted  Num. Line : 185866729 - Occ : 14900   33 secs
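To check which line endings a file really uses before (or after) converting it, something along these lines could help. It is only a sketch: the sub name count_line_endings is made up for illustration, and it assumes the file is opened with the :raw layer so that Perl does no newline translation while reading.

```perl
#!perl
use strict;
use warnings;

# Count CRLF-terminated and LF-only lines in a file.
# The :raw layer is used so the bytes are seen exactly as stored on disk.
# count_line_endings is a hypothetical helper, not part of any module.
sub count_line_endings {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my ($crlf, $lf) = (0, 0);
    while (my $line = <$fh>) {
        if    ($line =~ /\r\n\z/) { ++$crlf }   # Windows-style ending
        elsif ($line =~ /\n\z/)   { ++$lf }     # Unix-style ending
    }
    close $fh;
    return ($crlf, $lf);
}

# Usage: perl check_eol.pl dict.txt   (the script name is an assumption)
if (@ARGV) {
    my ($crlf, $lf) = count_line_endings($ARGV[0]);
    print "CRLF : $crlf - LF only : $lf\n";
}
```

If the original file reports only LF lines and the converted one only CRLF lines, that would be consistent with the explanation above.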
Re^9: How to optimize a regex on a large file read line by line ? by John FENDER (Acolyte) on Apr 16, 2016 at 22:42 UTC
1 min 02 s, far better. Thanks!