Thanks! I can cover your first two points by simply downloading the files from the same place I got the .gbff (e.g. this link has the .gbff, .gff, and .faa files for one of the organisms). I can see that using genbank2gff3.pl is probably easier and faster, but I'm not sure the output will be the same. I believe BioPerl's Bio::SeqIO already takes the reverse complement into account there, but I'll make sure.

More importantly, if those .gbff files don't contain the sequences, as in my examples, extracting them will not be possible anyway.

I am wondering why many of these files are wrongly deposited in the first place. And if they are, why isn't this automatically corrected by NCBI itself... I was expecting this to be much easier than it is proving to be, but I guess this isn't the place to rant about this, eh :)

Off topic: If you find an easier way to get the CDS and the protein sequences, please let me know. Even if it involves not using GenBank, as long as I can use NCBI's FTP everything is fine...


In reply to Re^2: Debugging Bioperl warnings for Genebank files that are missing info by Sosi
in thread Debugging Bioperl warnings for Genebank files that are missing info by Sosi
