Nastazia has asked for the wisdom of the Perl Monks concerning the following question:
Hello everyone, I am trying to write a Perl script that does the following.
It will read a PDB file that contains only CA atoms, like this one:
    ATOM 1 CA PRO A 889 84.370 72.820 26.830 1.00 0.00
    ATOM 2 CA THR A 890 87.370 73.900 28.080 1.00 0.00
    ATOM 3 CA VAL A 891 90.920 72.490 27.750 1.00 0.00
    ATOM 4 CA PHE A 892 93.640 74.890 28.970 1.00 0.00
    ATOM 5 CA HIS B 893 97.060 74.200 27.360 1.00 0.00
    ATOM 6 CA LYS B 894 99.880 73.920 29.990 1.00 0.00
It will then read a second PDB file that contains every atom:
    ATOM 1 N PRO A 889 16.220 12.185 1.804 1.00 71.54 N
    ATOM 2 CA PRO A 889 16.101 12.990 3.034 1.00 70.89 C
    ATOM 3 C PRO A 889 15.432 14.346 2.803 1.00 72.31 C
    ATOM 4 O PRO A 889 14.743 14.852 3.703 1.00 72.20 O
    ATOM 5 CB PRO A 889 17.553 13.151 3.502 1.00 72.96 C
    ATOM 6 CG PRO A 889 18.315 12.067 2.782 1.00 78.00 C
    ATOM 7 CD PRO A 889 17.626 11.907 1.465 1.00 73.35 C
(The files refer to the same molecule but have different numbers of lines.)
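For readers unfamiliar with the format: the PDB format stores ATOM records in fixed-width columns, with the chain ID in column 22 and the residue number in columns 23-26 (1-based). The samples in this post appear with their whitespace collapsed, so the sketch below builds a padded record with `sprintf` before reading it. The helper name `chain_and_resnum` is hypothetical, and the code assumes the real files follow the standard column layout.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the chain ID and residue number out of one
# fixed-width ATOM record (PDB columns 22 and 23-26, counted from 1).
sub chain_and_resnum {
    my ($line) = @_;
    return unless $line =~ /^ATOM/ && length($line) >= 26;
    my $chain  = substr($line, 21, 1);   # column 22
    my $resnum = substr($line, 22, 4);   # columns 23-26
    $resnum =~ s/^\s+|\s+$//g;           # trim the padding
    return ($chain, $resnum);
}

# Rebuild the first CA record from the post with standard column widths
# (assumption: the original file was padded this way).
my $record = sprintf(
    "%-6s%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f",
    'ATOM', 1, ' CA ', ' ', 'PRO', 'A', 889, ' ',
    84.370, 72.820, 26.830, 1.00, 0.00,
);

my ($chain, $resnum) = chain_and_resnum($record);
print "chain=$chain resnum=$resnum\n";   # prints "chain=A resnum=889"
```

Reading by column position keeps working when two fields touch, which can happen with four-digit residue numbers or four-character atom names, whereas splitting on whitespace can then miscount the fields.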
So if the residue number (column 5) is the same, it should take the chain letter (column 4) from the first file and use it to replace the chain letter on every line of the second file that has that residue number. So far I've got this disaster :/
    print "\nEnter the network pdb file file: ";
    $inputFile = <STDIN>;
    chomp $inputFile;
    unless (open(INPUTFILE, $inputFile)) {
        print "Cannot read from '$inputFile'";
        <STDIN>;
        exit;
    }
    # load the file into an array
    chomp(@networkpdb = <INPUTFILE>);
    # close the file
    close(INPUTFILE);

    print "\nEnter the pdb output file: ";
    $inputFile2 = <STDIN>;
    chomp $inputFile2;
    unless (open(INPUTFILE, $inputFile2)) {
        print "Cannot read from '$inputFile2'";
        <STDIN>;
        exit;
    }
    chomp(@pdb = <INPUTFILE>);
    close(INPUTFILE);

    for ($line1 = 0; $line1 < scalar @networkpdb; $line1++) {
        if ($networkpdb[$line1] =~ m/ATOM\s+\d+\s+\w+\s+\w{3}\s*(\w+)\s*(\d*)\s+\S+\.\S+\s+\S+\.\S+\s+\S+\.\S+\s+.+\..+\..*/ig) {
            my $resnum=$2;
            my $chain=$1;
            for ($line = 0; $line < scalar @pdb; $line++) {
                if ($pdb[$line]=~ m/(ATOM\s+\d+\s+\w+\s+\w{3}\s*)(\w+)\s*(\d*)(\s+\S+\.\S+\s+\S+\.\S+\s+\S+\.\S+\s+.+\..+\..*)/ig) {
                    my $begining=$1;
                    my $resnum1=$3;
                    my $chain1=$2;
                    my $end=$4;
                    if ($resnum1=$resnum) {
                        $chain1=$chain;
                        $parsedData{$line} = $begining.$chain1."\s".$resnum1.$end;
                    }
                }
            }
        }
    }

    # create the output file name
    $outputFile = "WithNetwork_".$inputFile;
    # open the output file
    open (OUTFILE, ">$outputFile");
    # print the data lines
    foreach $line (sort {$a <=> $b} keys %parsedData) {
        print OUTFILE $parsedData{$line}."\n";
    }
    # close the output file
    close (OUTFILE);
Thank you very much in advance.
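One possible way to approach the task described above is a two-pass design: first record the chain letter for each residue number from the CA-only file in a hash, then walk the all-atom file once and overwrite the chain letter wherever the residue number is known. The sketch below is not taken from any reply in this thread. The function name `copy_chains` is hypothetical, it works on arrays of lines rather than files to stay self-contained, and it assumes whitespace-separated records that always carry a chain letter, as in the samples shown. The second all-atom record in the demo is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns copies of the all-atom lines with the chain
# letter replaced by the one the CA-only lines give for that residue number.
sub copy_chains {
    my ($ca_lines, $all_atom_lines) = @_;

    # Pass 1: residue number -> chain letter, from the CA-only records.
    my %chain_for;
    for my $line (@$ca_lines) {
        # Fields as shown in the post: ATOM serial atom resName chain resSeq ...
        if ($line =~ /^ATOM\s+\d+\s+\S+\s+\w{3}\s+(\w)\s*(-?\d+)/) {
            $chain_for{$2} = $1;
        }
    }

    # Pass 2: overwrite the chain letter in matching all-atom records.
    my @out;
    for my $line (@$all_atom_lines) {
        my $copy = $line;
        my ($prefix, $chain, $resnum) =
            $copy =~ /^(ATOM\s+\d+\s+\S+\s+\w{3}\s+)(\w)\s*(-?\d+)/;
        if (defined $resnum && exists $chain_for{$resnum}) {
            # Replace exactly one character, leaving the rest of the line as is.
            substr($copy, length($prefix), 1) = $chain_for{$resnum};
        }
        push @out, $copy;   # lines without a match are kept unchanged
    }
    return @out;
}

my @ca = (
    'ATOM 1 CA PRO A 889 84.370 72.820 26.830 1.00 0.00',
    'ATOM 5 CA HIS B 893 97.060 74.200 27.360 1.00 0.00',
);
my @all_atom = (
    'ATOM 1 N PRO A 889 16.220 12.185 1.804 1.00 71.54 N',
    'ATOM 30 N HIS A 893 20.000 10.000 5.000 1.00 50.00 N',  # made-up record
);
print "$_\n" for copy_chains(\@ca, \@all_atom);
```

Compared with the script in the question, the hash lookup replaces the nested loops, so each file is scanned once. Two details of the posted code are also worth noting: `if ($resnum1=$resnum)` assigns rather than compares (`==` or `eq` is needed), and `"\s"` inside a double-quoted string is not a whitespace character, so it cannot be used to rebuild the line.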

Replies are listed 'Best First'.

Re: Perl script that will read two pdb files with different line numbers and will replace the chain letter from the first to the second file
by hippo (Archbishop) on Jun 22, 2018 at 10:18 UTC

Re: Perl script that will read two pdb files with different line numbers and will replace the chain letter from the first to the second file
by Laurent_R (Canon) on Jun 22, 2018 at 12:09 UTC

Re: Perl script that will read two pdb files with different line numbers and will replace the chain letter from the first to the second file
by talexb (Chancellor) on Jun 22, 2018 at 14:25 UTC