
Oh, I have some of it; I was going to get to normalization and polishing my code later. That note appears when there is more than one hit, but I can remove it, since it would already be obvious that there is more than one hit, which makes the note needless.

Here's my messy code, which I meant to tidy up once I had worked out, with the wiser monks, a way of addressing my original query:

#!/usr/local/bin/perl
use strict;
use warnings;

my %RNACounts;
my %hash;        #Gene Info
my @snoRNA;
my @exonNumbers;
my @geneID;
my @productID;
my @geneNames;
my @references;
my(@queries, @subjects);

open (FH,'<',"F:/Bioinformatics_NCBI/20MARCH_10/PERL Analysis/test.txt") or die("$!\n");
open(FO, '>',"F:/Bioinformatics_NCBI/20MARCH_10/PERL Analysis/testOut.txt") or die ("$!\n");   #TESTING
while(<FH>){
        chomp;
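        # Print every line from a record's number line through its http: line.
        # After chomp no line contains "\n", so the end pattern must not require one.
        if(/^\d+$/ .. /http:/){
               # s/\W+\n+!\W+//;
               next unless /(\w+ |\| | \n+)/x;  #except for words | pipes | \n
                print FO $_, "\n" ;
        }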
        if(/snoRNA(\s+|\d+)[\s|\d-]/){     #snoRNA; "-" goes last in the class so it is literal
                my $name = $_;
                push @snoRNA, $name;
                }
        if(/^\d+$/){         #exon Numbers
                my $number = $_;
                push @exonNumbers, $number;
                }
        if(/^GI:\d+(?:\.\d+)?/){     #gene ID, optional ".version" suffix
                my $name = $_;
                push @geneID , $name;
                }
        if(/^NM_\d+(?:\.\d+)?/){                #gene product ID
                my $name = $_;
                $name =~ s/\s+$//; #strip the trailing blanks
                push @productID, $name;
                }
        if(/homo sapiens/i){      #gene name, needs multiline support
                my $name = $_;
                push @geneNames, $name;
                }
        if(/http:/){                  #web refs, need multiline support
                my $name = $_;
                push @references, $name;
                }
       # if(/^(?=snoRNA).*(\n^|Query|Sbjct)(?=homo sapiens)/i){}

        if(/^Query(\s+)\d+\s+[agtc]/i){       #Prepare the query and subject arrays
                my $queryName = $_;                #compare, measure, span, note gaps
                $queryName =~ s/\Q$1\E//;          #drop the blanks captured after "Query"
                push @queries, $queryName;
                }
        if(/^sbjct(\s+)\d+\s+[agtc]/i){
                my $sbjctName =  $_;
                $sbjctName =~ s/\Q$1\E//;          #drop the blanks captured after "Sbjct"
                push @subjects, $sbjctName;
                }
        ##my @array = split /^\d+$/;
       # #print "@array\n";

        }
close(FH);
close(FO) or die("$!\n");

                                ####GENERATING THE HASHES#####
                                #CREATE A HASH WITH THE snoRNAs AS THE keys

foreach my $element (@snoRNA){
        #print "$element\n";
        $hash{$element}="VAL";    #TEST
        }

use Data::Dumper;
print Dumper(\%hash);
#print Dumper(\@exonNumbers),$/;
#print Dumper(\@geneID),$/;
#print Dumper(\@productID),$/;
#print Dumper(\@snoRNA),$/;
#print Dumper(\@geneNames),$/;
#print Dumper(\@references),$/;
#print Dumper(\@queries),$/;
#print Dumper(\@subjects),$/;
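
For what it's worth, here is a rough sketch of where the "VAL" placeholder might be heading: group the lines into records first, then key a hash on each record's snoRNA line. Everything in it is an assumption on my part. The sample lines are invented, the helper names split_records and index_by_snorna are hypothetical, and it guesses that a record starts at a digits-only line and ends at the first http: line, as the range test in the script suggests.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input, invented for illustration only; the real
# layout of test.txt is assumed, not known.
my @lines = (
    '1',
    'snoRNA 12 - example',
    'GI:12345',
    'NM_000001.1',
    'Query  1  agtc',
    'Sbjct  5  agtc',
    'http://example.org/ref1',
    '2',
    'snoRNA 34 - example',
    'GI:67890',
    'Query  1  ggcc',
    'Query  61 ttaa',
    'http://example.org/ref2',
);

# Group lines into records. Assumption: a record starts at a line made
# only of digits and ends at the first line containing "http:".
sub split_records {
    my @input = @_;
    my (@records, $current);
    for my $line (@input) {
        if ($line =~ /^\d+$/) {
            $current = { number => $line, lines => [] };
            push @records, $current;
            next;
        }
        next unless $current;    # ignore lines outside any record
        push @{ $current->{lines} }, $line;
        undef $current if $line =~ /http:/;
    }
    return @records;
}

# Key each record on its snoRNA line and collect a few fields under it.
sub index_by_snorna {
    my %by_snorna;
    for my $rec (@_) {
        my ($name) = grep { /snoRNA/ } @{ $rec->{lines} };
        next unless defined $name;
        $by_snorna{$name} = {
            exon    => $rec->{number},
            queries => [ grep { /^Query\s+\d+\s+[agtc]/i } @{ $rec->{lines} } ],
            refs    => [ grep { /http:/ } @{ $rec->{lines} } ],
        };
    }
    return %by_snorna;
}

my @records   = split_records(@lines);
my %by_snorna = index_by_snorna(@records);

# One summary line per snoRNA key.
printf "%s: exon %s, %d query line(s)\n",
    $_, $by_snorna{$_}{exon}, scalar @{ $by_snorna{$_}{queries} }
    for sort keys %by_snorna;
```

Keeping each record's lines together before picking fields out of them means a gene ID or reference can't get paired with the wrong snoRNA, which is the risk with eight parallel arrays filled in one pass.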

Excellence is an Endeavor of Persistence. Chance Favors a Prepared Mind.

In reply to Re^2: split a file into records and process it by biohisham
in thread split a file into records and process it by biohisham
