in reply to script optmization
Loading 200 MB into available memory is likely possible on today's hardware. If so, the following runs about 7 times faster. The idea is to iterate over @seq only once. For larger data files, one can read roughly 300 MB at a time, extending each chunk to the end of the current line so that every chunk holds only complete lines, and then process each chunk the same way; a sketch of that variant follows the code below.
use strict;
use warnings;
use autodie;                      # open/close die on failure by themselves

open my $out, '>', './Newfile.txt';

my ($f1, $f2, @seq) = ('seq.txt', 'mytext.txt');

# Collect the search strings, one per line, trimmed of surrounding whitespace
open my $fh, '<', $f1;
while (<$fh>) {
    chomp;
    s/^\s+|\s+$//g;
    push @seq, $_;
}
close $fh;

@seq = sort bylen @seq;           # longest first, so a short string
                                  # never matches inside a longer one

# Slurp the whole data file into $data
my $data;
{
    open my $in, '<', $f2;
    local $/;                     # disable the input record separator
    $data = <$in>;
}

# Single pass over @seq: replace each string with a copy whose runs of
# horizontal whitespace are collapsed to 'bbb'
foreach my $r (@seq) {
    my $t = $r;
    $t =~ s/\h+/bbb/g;
    $data =~ s/\Q$r\E/$t/g;       # \Q..\E treats $r as a literal string,
                                  # guarding against regex metacharacters
}

print $out $data;
close $out;
exit 0;

sub bylen { length($b) <=> length($a) }
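For the chunked variant, here is a minimal sketch. The 300 MB chunk size and the file names are illustrative only, and the replacement pass simply repeats the loop from the script above; treat it as an untested outline rather than a drop-in solution.

use strict;
use warnings;
use autodie;

my $CHUNK = 300 * 1024 * 1024;    # ~300 MB per read (illustrative)

# Load and sort the search strings exactly as in the full script
my @seq;
{
    open my $sfh, '<', 'seq.txt';
    while (<$sfh>) { chomp; s/^\s+|\s+$//g; push @seq, $_; }
    close $sfh;
}
@seq = sort { length($b) <=> length($a) } @seq;

open my $in,  '<', 'mytext.txt';
open my $out, '>', './Newfile.txt';

while (read($in, my $chunk, $CHUNK)) {
    my $rest = <$in>;                    # read on to the next newline so
    $chunk .= $rest if defined $rest;    # the chunk ends on a complete line
    foreach my $r (@seq) {               # same replacement pass as above
        my $t = $r;
        $t =~ s/\h+/bbb/g;
        $chunk =~ s/\Q$r\E/$t/g;
    }
    print $out $chunk;
}
close $in;
close $out;

Because each entry in seq.txt is a single line, extending the chunk to the next newline guarantees that no match can straddle a chunk boundary.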
Replies:
Re^2: script optmization by Anonymous Monk on May 14, 2017 at 22:04 UTC
Re^2: script optmization by Anonymous Monk on May 14, 2017 at 23:00 UTC
Re^2: script optmization by Anonymous Monk on May 14, 2017 at 23:27 UTC