The title, interaction_site and c_end are fixed elements (they appear in every record). The rest are optional.

Input file:

    TITLE An excitatory scorpion toxin with a distinctive feature: an additional alpha helix at the C terminus and its implications for interaction with insect sodium channels
    /interaction_site="Q8, N9, Y10, N11, C12, F17, W38, R58, V59 and K62 form the putative bioactive surface in mature toxin (Zilberberg et al., 1997)."
    /channel="Sodium channel"
    /target_cell="Insect specific (Excitatory)"
    /c_end="Free"
    //
    TITLE Cloning and Sequencing of an Excitatory Insect-Selective Neurotoxin BmKIT cDNA from Buthus martensii Karsch
    /interaction_site="Sequential deletions of C-terminal residues suggested Ile73 and Ile74 for toxicity. {Oren et al., 1999}"
    /channel="Sodium channel"
    /c_end="Free"
    //

Output:

    References:TITLE "An excitatory scorpion toxin with a distinctive feature: an additional alpha helix at the C terminus and its implications for interaction with insect sodium channels"
    Interaction_site "Q8, N9, Y10, N11, C12, F17, W38, R58, V59 and K62 form the putative bioactive surface in mature toxin (Zilberberg et al., 1997)."
    Channel "Sodium channel"
    Target_cell "Insect specific (Excitatory)"
    C_end "Free"
    References:TITLE "Cloning and Sequencing of an Excitatory Insect-Selective Neurotoxin BmKIT cDNA from Buthus martensii Karsch"
    Interaction_site "Sequential deletions of C-terminal residues suggested Ile73 and Ile74 for toxicity. {Oren et al., 1999}"
    Channel "Sodium channel"
    C_end "Free"
The last piece of code, the one preceded by ####, is the part that needs to be modified. If I use the code in that location, it only prints the interaction site when the site spans more than one line.

To run: perl prog.pl input.db result

    #! /usr/local/bin/perl -w

    # initialize all the variables: flags to 0 and lines to ''
    my $counter=1;
    my $file1="$ARGV[0]";
    my $result=">".$ARGV[1];
    my $site='';
    my $titleline='';
    my $siteflag=0;
    my $titleflag=0;

    open(INFO1,$file1) or die "Can't open $file1.\n";   # open file1
    open(OUT,$result) or die "Can't open $result.\n";   # open result

    # each line of the input file ends with the separator \r\n
    foreach(<INFO1>) {
        if(/\s*TITLE\s*(.*)\r/){
            ######## check the title
            $titleflag=1;
            $titleline=$1;
        }
        elsif(/\s*\/interaction_site=(.*)\r/){
            ######## handle the title
            print OUT qq(References:TITLE\t "$titleline"\n);
            $titleflag=0;
            $titleline='';
            ######## check the site
            $site=$1;
            $siteflag=1;
        }
        elsif(/\s*(.*)\r/ && $titleflag==1){
            $titleline.=" ";     # add a white space
            $titleline.=$1;      # concatenate the title with the previous line
        }
        elsif(/\s*\/channel=(.*)\r/){
            if(check2($1)){
                print OUT "Channel\t $1\n";
            }
        }
        elsif(/\s*\/target_cell=(.*)\r/){
            if(check2($1)){
                print OUT "Target_cell\t $1\n";
            }
        }
        elsif(/\s*\/c_end=(.*)\r/){
            ######## handle interaction site
            $siteflag=0;
            $site='';
            ######## check c_end
            if(check2($1)){
                print OUT "C_end\t $1\n";
            } # end if
        } # end elsif
        ####elsif(/\s*(.*)\r && $siteflag==1){
        ####    $site.=" ";      # add a white space
        ####    $site.=$1;       # concatenate with previous site
        ####    print "Site $site\n";
        ####    }
    } # end foreach

    sub check2 {   # check whether item = empty quotes
        if($1 =~ /" "/){ return 0;}
        else{ return 1;}
    }
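One possible arrangement, offered as a minimal sketch and not a drop-in fix: keep a single "current field" and print it only when the next field (or the `//` record separator) starts. That way a field is emitted whether it occupies one line or several. The sub name `convert_records` and the closure `$flush` are hypothetical names introduced here. The sketch assumes every field starts with `TITLE` or `/name=`, that continuation lines are joined with one space (as in the code above), and that values consisting only of empty quotes are skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes the lines of the
# input file and returns the lines of the report.
sub convert_records {
    my @lines = @_;
    my @out;
    my ($field, $value);    # the field currently being accumulated

    # Emit the accumulated field, skipping empty values such as "" or " ".
    my $flush = sub {
        return unless defined $field;
        if ($field eq 'title') {
            push @out, qq(References:TITLE\t "$value");
        }
        elsif ($value !~ /^"\s*"$/) {
            push @out, ucfirst($field) . "\t $value";
        }
        undef $field;
    };

    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;            # trims \r\n as well as blanks
        next if $line eq '';
        if ($line =~ /^TITLE\s+(.*)$/) {    # start of a new record
            $flush->();
            ($field, $value) = ('title', $1);
        }
        elsif ($line =~ m{^/(\w+)=(.*)$}) { # /interaction_site=, /channel=, ...
            $flush->();
            ($field, $value) = ($1, $2);
        }
        elsif ($line eq '//') {             # end of record
            $flush->();
        }
        elsif (defined $field) {            # continuation of the current field
            $value .= " $line";
        }
    }
    $flush->();                             # in case the last // is missing
    return @out;
}

# Small demonstration with a title and a site that each span two lines.
my @demo = (
    "TITLE Cloning and Sequencing of an Excitatory Insect-Selective\r\n",
    "Neurotoxin BmKIT cDNA from Buthus martensii Karsch\r\n",
    qq(/interaction_site="Sequential deletions of C-terminal residues\r\n),
    qq(suggested Ile73 and Ile74 for toxicity. {Oren et al., 1999}"\r\n),
    qq(/channel="Sodium channel"\r\n),
    qq(/c_end="Free"\r\n),
    "//\r\n",
);
print "$_\n" for convert_records(@demo);
```

Because printing is deferred until the next field begins, no per-field flags such as `$titleflag` or `$siteflag` are needed, and a continuation line can never be mistaken for part of a different field.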
In reply to about where to check the flag by gdnew