I apologise for rewriting a major portion of your code. I usually try to change as little as possible, but somehow that didn't work here :)
The major change in approach is to use the readline function to read data from the text file as needed. It seemed like whenever you found a 'gene' line, you would need to read the next line for 'CDS' or 'exon' data. You could do this with a flag (as you initially suggested), but why not simply do what you need to do and read the next line immediately?
The other thing I changed was that the data is now stored in a hash reference (instead of a hash). This is not a requirement per se, but Data::Dump prints hashrefs in an easier-to-understand way than hashes.
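To illustrate the idea on its own, here is a minimal sketch of reading the follow-up line on demand. The sample text and the variable names are made up for the example, and an in-memory filehandle stands in for your real .gff file:

```perl
use strict;
use warnings;

# Hypothetical sample data; an in-memory filehandle replaces the real file.
my $text = join "\n", 'gene A', 'CDS 1', 'other', 'gene B', 'exon 2', '';
open(my $fh, '<', \$text) or die $!;

my @pairs;
while (defined(my $line = readline($fh))) {
    chomp $line;
    next unless $line =~ /^gene/;

    # Read the follow-up line right away instead of setting a flag.
    my $next = readline($fh);
    last unless defined $next;    # the file ended directly after a 'gene' line
    chomp $next;
    push @pairs, "$line -> $next";
}
close $fh;

print "$_\n" for @pairs;
```

Because both reads go through the same filehandle, the outer loop simply continues with the line after the one you consumed inside the loop.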
Also, I replaced the data-entries @BMB with $ar_record. It is easier to store lots of records as references instead of arrays.
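As a small sketch of what that looks like (the IDs and coordinates below are invented for illustration), each record is an anonymous array, and the hash reference maps a GeneID to it:

```perl
use strict;
use warnings;

my $hr_data = {};    # hash reference keyed by GeneID (the IDs here are made up)

# Each record is an anonymous array: [start, end, strand]
my $ar_record = [ 100, 250, '+' ];
push @{$ar_record}, 'CDS';        # append a field through the reference
$hr_data->{1234} = $ar_record;    # store the reference, not a copy of the array
$hr_data->{5678} = [ 300, 480, '-', 'exon' ];

for my $id (sort { $a <=> $b } keys %{$hr_data}) {
    my ($start, $end, $strand, $type) = @{ $hr_data->{$id} };
    print "$id: $start-$end ($strand) $type\n";
}
```

Storing the reference means the whole record travels around as one scalar, which is what makes it convenient as a hash value.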
Lastly, I removed a lot of variable declarations from the start of the program and put them where they are needed/filled. There is no need to fear a negative performance impact from declaring variables within a loop; Perl handles this just fine. It also helps you keep the data in scope (so your main program won't know about the 'temporary' variables that were used inside the loop). (I'm not sure I'm explaining this well...)
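Maybe a tiny example explains the scoping better than I can in words. This is only a sketch with made-up names, not part of your program:

```perl
use strict;
use warnings;

my @lengths;
for my $word (qw(gene CDS exon)) {
    my $len = length $word;    # declared inside the loop, fresh on every pass
    push @lengths, $len;
}

# $word and $len are not visible here. Under 'use strict', referring to
# either of them at this point would be a compile-time error.
print "@lengths\n";
```

Only @lengths survives the loop, so nothing 'temporary' leaks into the rest of the program.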
#!/usr/bin/perl
# Task: Extract GeneID-Number and gene information
use strict;
use warnings;
use Data::Dump;

my $in;
my $hr_data;

# 1) open the .gff Inputfile and while reading line by line split data at each tab and put them in the @array
open ($in, '<', "Genomteil.gff") or die $!;
while (my $line1 = readline ($in)) {
    chomp ($line1); # Removes trailing \n
    my @a_line1 = split ("\t", $line1);
    next if (@a_line1 < 9); # Skip lines that do not have all 9 columns (e.g. comment lines)
    if ($a_line1[2] eq 'gene') {
        if ($a_line1[8] =~ /.*;db_xref=GeneID:(\d+)/) {
            my $GeneID = $1; # Must be declared with 'my' under 'use strict'
            # We found a GeneID. Create a record (array-reference) to store with the data from this line
            my $ar_record = [$a_line1[3], $a_line1[4], $a_line1[6]]; # the array will be used as values for my hash later
            # Also, read the next line from file, which we expect to contain CDS or exon
            my $line2 = readline ($in);
            if (!defined ($line2)) {
                print ("Error: file ended right after the 'gene' line for GeneID [$GeneID]\n");
                last;
            }
            chomp ($line2);
            my @a_line2 = split ("\t", $line2);
            if (@a_line2 > 2 and $a_line2[2] =~ /CDS|exon/) { # Alternatively: ($a_line2[2] eq 'CDS' or $a_line2[2] eq 'exon')
                push (@{$ar_record}, $a_line2[2]);
                $hr_data->{$GeneID} = $ar_record;
            } else {
                print ("Error: next line does not contain CDS or exon [$.]\n");
                next;
            }
        } else {
            print ("Error: 'gene' textblock found, but no GeneID present at line [$.]\n");
            next;
        }
    } ## end if ($a_line1[2] eq 'gene')
} ## end while (my $line1 = readline...)
close $in;

Data::Dump::dd($hr_data);

In reply to Re^3: problems with flip flop by Neighbour
in thread problems with flip flop by bio25
