in reply to very slow processing

When you make the first pass, instead of pushing each ID into an array then filtering out duplicates, use a hash with ID as the key. For the value, concatenate your formatted output. For the second pass, loop on the keys of the hash, printing the strings in the hash:

    my %urecs;
    for my $line (@lines) {
        next unless $line =~ /your regex/;
        my ($date, $id, $keyword) = ($1, $2, $3);
        $urecs{$id} .= "$date,$id,$keyword\n";
    }
    print $urecs{$_} for keys %urecs;

Disclaimer: Not tested.

Re^2: very slow processing
by sandy105 (Scribe) on Aug 20, 2014 at 18:37 UTC

    the IDs are repeated, so I need to check for unique IDs and keywords

      Hash keys are always unique. In my example, if an ID has already been seen, the new string is appended to the previous content of the value for that ID.

      I could have written:

      for my $line (@lines) {
          next unless $line =~ /your regex/;
          if (exists $urecs{$2}) {
              $urecs{$2} .= "$1,$2,$3\n";
          }
          else {
              $urecs{$2} = "$1,$2,$3\n";
          }
      }

      but that is not necessary because Perl treats appending to an undefined value the same as appending to an empty string.
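
      For example, here is a minimal sketch of that behaviour (the %text hash, the key, and the strings are made up purely for illustration):

          use strict;
          use warnings;

          my %text;
          # the key 42 does not exist yet, so the first .= behaves as if
          # it were appending to an empty string
          $text{42} .= "first line\n";
          $text{42} .= "second line\n";
          print $text{42};    # prints both lines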

      Another thing you could do: $ids{$2}++ would give you a hash of the IDs seen (the keys) and how many times each was seen (the values); again, there is no need to check for existence first.
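
      A quick sketch of that counting idiom, using made-up IDs in place of the captured $2:

          use strict;
          use warnings;

          my %ids;
          # each ++ on a missing key starts the count at 1, so no
          # exists() check is needed before incrementing
          $ids{$_}++ for qw(A7 B2 A7 C9 A7 B2);
          print "$_ seen $ids{$_} time(s)\n" for sort keys %ids;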

      As for checking the keywords, I left that out so as to keep my example focused on the use of the hash.