Re: Re: Re: Optimizing PDB data structures

Ah, I see your goal now. I have two answers to your problem.

The first is to simply ignore this possible speed optimization. If you are picking one of two chains, use:

foreach my $ref (keys %$self){
   next unless $self->{$ref}{'chain'} == 1;   # compare with ==; a bare = would assign 1 and never skip
   # process chain 1 atoms 
}

The cost of looping and one nested dereference is probably negligible compared with the other processing you need to do, so don't waste your time on it until you have verified that this is a bottleneck and that the slowdown matters to you.
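One way to check whether the filter loop matters is simply to time it. The sketch below is only an illustration: the %atoms hash, its 'chain' key, and the count_chain helper are made-up stand-ins for your real structure, and 10,000 atoms is an arbitrary size.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-in data: 10_000 atoms alternating between chains 1 and 2.
my %atoms;
for my $serial (1 .. 10_000) {
    $atoms{$serial} = { chain => 1 + $serial % 2 };
}

# Same shape as the filter loop above: visit every atom, skip other chains.
sub count_chain {
    my ($atoms, $chain) = @_;
    my $n = 0;
    foreach my $ref (keys %$atoms) {
        next unless $atoms->{$ref}{chain} == $chain;
        $n++;
    }
    return $n;
}

# Wall-clock timing of one pass over the whole hash.
my $start   = time;
my $matched = count_chain(\%atoms, 1);
my $elapsed = time - $start;
printf "%d atoms in chain 1, filter loop took %.6f s\n", $matched, $elapsed;
```

If that time is tiny next to what you do with each atom, the restructuring is not worth the trouble.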

If the loop really is a bottleneck, you will have to promote the variables you subset on to the top level of the hash and accept an uglier data structure:

foreach my $ref (keys %$data){
   next unless $data->{$ref}->type eq 'ATOM';
   my $atom = $data->{$ref}->atom;
   $self->{$atom->chainId}{$ref}{atoms} = $atom;
   $self->{$atom->chainId}{$ref}{'residues'} = $atom->resNumber;
}
# ...
foreach my $ref (keys %{$self->{1}}) {
   # process chain 1 atoms
}

With an extra dereference per atom, I am not convinced that this will be noticeably faster.
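If you want to measure it rather than take my word for it, something like the following compares the two layouts. It is a sketch under assumptions: the records are plain hashes with made-up 'type', 'chainId' and 'resNumber' keys rather than your real objects, and flat_chain1 and grouped_chain1 are hypothetical helper names.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical flat records keyed by serial number; every tenth is a HETATM.
my %data;
for my $serial (1 .. 10_000) {
    $data{$serial} = {
        type      => ($serial % 10 ? 'ATOM' : 'HETATM'),
        chainId   => 1 + $serial % 2,
        resNumber => int($serial / 8) + 1,
    };
}

# Promote chainId to the top level, keeping only ATOM records.
my %by_chain;
foreach my $ref (keys %data) {
    my $rec = $data{$ref};
    next unless $rec->{type} eq 'ATOM';
    $by_chain{ $rec->{chainId} }{$ref} = $rec;
}

# Flat layout: scan every record and filter on type and chain.
sub flat_chain1 {
    return grep { $data{$_}{type} eq 'ATOM' && $data{$_}{chainId} == 1 } keys %data;
}

# Grouped layout: the chain's keys are already collected.
sub grouped_chain1 {
    return keys %{ $by_chain{1} };
}

# Run each variant 200 times and print a comparison table.
cmpthese(200, {
    flat    => sub { my @keys = flat_chain1() },
    grouped => sub { my @keys = grouped_chain1() },
});
```

This only times key selection. Once the per-atom work is added inside both loops, the gap between the two may shrink to the point where it no longer matters.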

-Mark
