What could
$domchain =~/\S/g;
$pdbchain =~/\S/g;
chomp($domchain);
chomp($pdbchain);
possibly accomplish?
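To make the point concrete, here is a small sketch of what those statements actually do. The sample string and the sub name `strip_whitespace` are made up for illustration, and the guess that the goal was to strip whitespace is only an assumption on my part.

```perl
use strict;
use warnings;

# Hypothetical sample value; the real $domchain comes from your data.
my $domchain = " 1abcA \n";
my $before   = $domchain;

# A match in void context only tests the string (and, with /g, moves pos());
# it never modifies the string itself.
$domchain =~ /\S/g;
print "unchanged by the match\n" if $domchain eq $before;

# chomp removes at most one trailing newline (the current value of $/).
chomp($domchain);
print "after chomp: [$domchain]\n";

# ASSUMPTION: if the intent was to strip all whitespace, a substitution does that.
sub strip_whitespace {
    my ($text) = @_;
    $text =~ s/\s+//g;
    return $text;
}

print "stripped: [", strip_whitespace($domchain), "]\n";
```

If stripping whitespace really was the goal, the substitution also removes the trailing newline, so the separate `chomp` would no longer be needed.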
I tried to fix some things in the code below but, naturally, I couldn't run it (as I don't have the files).
use strict;
use warnings FATAL => 'all';
use autodie;
my $dir = "<my directory>";
my ( $dom_file_name, $csa_file_name ) = @ARGV;
my %csa_hash;
my %dom_hash;
open my $csa_file, '<', $csa_file_name;
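while ( my $csa = <$csa_file> ) {
    chomp $csa;    # keep the newline out of the last field
    my @csa_data = split /,/, $csa;
    # key: field 0 followed by field 3; value: field 4
    $csa_hash{ $csa_data[0] . $csa_data[3] } = $csa_data[4];
}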
close($csa_file);
open my $dom_file, '<', $dom_file_name;
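while ( my $dom = <$dom_file> ) {
    my @dom_data = split /\s+/, $dom;
    my $domain   = $dom_data[0];
    next if exists $dom_hash{$domain};    # skip domains that were already loaded
    # $dir must end with a path separator for this concatenation to work
    open my $ind_dom_file, '<', $dir . $domain;
    while ( my $line = <$ind_dom_file> ) {
        chomp $line;
        my @ff     = split /\s+/, $line;
        my $number = $ff[5];
        my $CA     = $ff[2];
        # short lines leave $CA undefined, which would trip the fatal warnings
        push @{ $dom_hash{$domain} }, $number if defined $CA and $CA eq 'CA';
    }
    close $ind_dom_file;
}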
close $dom_file;
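while ( my ( $protein_dom, $residues_ref ) = each %dom_hash ) {
    # the chain key is the same for every residue of this domain
    my $dom_chain   = substr( $protein_dom, 0, 5 );
    my $csa_residue = $csa_hash{$dom_chain};
    # no CSA entry: comparing against undef would be a fatal warning
    next unless defined $csa_residue;
    for my $residue ( @{$residues_ref} ) {
        print "$dom_chain $residue\n" if $residue eq $csa_residue;
    }
}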
The idea was that it does the same thing as your code, but faster.
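For what it's worth, the speed-up I'd expect comes from reading each file only once and using hash lookups for the comparison. A minimal sketch of that lookup pattern with made-up in-memory data follows; the keys, the residue numbers and the sub name `matching_residues` are all hypothetical and only stand in for what your files would provide.

```perl
use strict;
use warnings;

# Hypothetical data standing in for what the two files would provide.
my %csa_hash = ( '1abcA' => '57', '2xyzB' => '102' );
my %dom_hash = (
    '1abcA01' => [ '55', '56', '57' ],
    '3defC01' => [ '10', '11' ],
);

# Return "chain residue" strings for residues that equal the CSA entry
# of the same chain (the first five characters of the domain name).
sub matching_residues {
    my ( $csa, $dom ) = @_;
    my @hits;
    for my $protein_dom ( sort keys %{$dom} ) {
        my $dom_chain = substr( $protein_dom, 0, 5 );
        my $wanted    = $csa->{$dom_chain};    # a single hash lookup per domain
        next unless defined $wanted;           # chain has no CSA entry
        push @hits, map { "$dom_chain $_" }
            grep { $_ eq $wanted } @{ $dom->{$protein_dom} };
    }
    return @hits;
}

print "$_\n" for matching_residues( \%csa_hash, \%dom_hash );
```

Note that a plain hash keeps only one value per key, so if a chain can have several CSA residues, only the last one read survives.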