And perhaps a little faster with an RE instead of splits.
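    use strict;
    use warnings;

    my $name = '';
    my %id_list;

    open (STDIN, "perl first_program.pl|") || die "$!";
    open (OUT, ">", "out.txt") || die "$!";

    while ( my $line_in = <STDIN> ) {
        # $1 = text before the first tab, $2 = text after the third '|'
        $line_in =~ m/^([^\t]*)(?:[^\|]*\|){3}([^\|]*)/;
        # a new name starts a new group, so forget the ids seen so far
        unless ($name eq $1) {
            %id_list = ();
            $name = $1;
        }
        next if $id_list{$2}++;    # skip ids already printed for this name
        print OUT "$line_in";
    }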
Although the parsing of the line is quicker with the RE, the difference may be very small compared to the I/O time. The benefit will depend very much on how often the loop prints.
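To make it easier to see what the RE captures, here is a minimal sketch that runs it against one sample line. The helper name extract_name_id is hypothetical and only there for illustration, and the sample line is made up in the same shape as the data discussed in this thread.

```perl
use strict;
use warnings;

# Sample record (assumed shape): a name, a tab, then a '|'-separated
# field whose fourth element is the id.
my $line_in = "mmenr\thh|gg|kk|3445|uu|zzz\t234\twwe\n";

# extract_name_id is a hypothetical helper, not part of the original script.
# In list context the match returns ($1, $2), or an empty list on failure.
sub extract_name_id {
    my ($line) = @_;
    # $1: everything before the first tab; then skip three
    # '|'-terminated chunks; $2: the text up to the next '|'.
    return $line =~ m/^([^\t]*)(?:[^|]*\|){3}([^|]*)/;
}

my ($name, $id) = extract_name_id($line_in);
print "name=$name id=$id\n";    # prints: name=mmenr id=3445
```

Taking the captures from the list returned by the match, rather than reading $1 and $2 afterwards, also makes a failed match visible: the list is empty, whereas $1 and $2 would still hold the values from the last successful match.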
    use strict;
    use warnings;
    use Benchmark;

    my $line_in = "mmenr\thh|gg|kk|3445|uu|zzz\t234\twwe\twe\tqw\t233\n";

    Benchmark::cmpthese( 1000000, {
        'split' => sub {
            chomp($line_in);
            my @line_array = split(/\t/,$line_in);
            my @subline_array = split(/\|/, $line_array[1]);
        },
        're' => sub {
            my ($name, $id) = ($line_in =~ m/^([^\t]*)(?:[^\|]*\|){3}([^\|]*)/);
        }
    });

    __END__
              Rate split    re
    split  84962/s    --  -68%
    re    266667/s  214%    --
In reply to Re^2: slow parser how to make it faster
by ig
in thread slow parser how to make it faster
by baxy77bax