
thank you, but I think it might be wasteful given that this is a standard file format supported by bioperl.

I'll bet you a pound to a penny that it will take you longer to write; be far harder to maintain; and run more slowly; than this:

c:\test>perl -F"\t" -ane"$F[8]=~/Gap=/ and print" chromosome_1.gff
DDB0232428 . EST_match 15809 17104 . + + . ID=DDB0014588;Target=DDB0014588 1 561;Gap=M405 I786 M104
DDB0232428 . EST_match 66374 66803 . + + . ID=DDB0014789;Target=DDB0014789 1 339;Gap=M96 I111 M222
DDB0232428 . EST_match 117098 117584 . - + . ID=DDB0017340;Target=DDB0017340 1 492;Gap=M486
DDB0232428 . EST_match 122479 123082 . + + . ID=DDB0041612;Target=DDB0041612 1 619;Gap=M603
DDB0232428 . EST_match 162661 163197 . - + . ID=DDB0017341;Target=DDB0017341 1 558;Gap=M536
DDB0232428 . EST_match 162661 162971 . - + . ID=DDB0161652;Target=DDB0161652 1 319;Gap=M310
DDB0232428 . EST_match 162670 163422 . + + . ID=DDB0127927;Target=DDB0127927 1 752;Gap=M752
DDB0232428 . EST_match 162670 163375 . + + . ID=DDB0112861;Target=DDB0112861 1 705;Gap=M705
DDB0232428 . EST_match 162670 163335 . + + . ID=DDB0031935;Target=DDB0031935 1 652;Gap=M26 I18 M621
DDB0232428 . EST_match 162670 163285 . + + . ID=DDB0061852;Target=DDB0061852 1 615;Gap=M615
DDB0232428 . EST_match 162670 163398 . + + . ID=DDB0117238;Target=DDB0117238 1 729;Gap=M728
DDB0232428 . EST_match 162670 163308 . + + . ID=DDB0061789;Target=DDB0061789 1 639;Gap=M638
DDB0232428 . EST_match 162670 163378 . + + . ID=DDB0067313;Target=DDB0067313 1 707;Gap=M708
DDB0232428 . EST_match 162670 163402 . + + . ID=DDB0064238;Target=DDB0064238 1 732;Gap=M732
DDB0232428 . EST_match 162670 163430 . + + . ID=DDB0063928;Target=DDB0063928 1 760;Gap=M760
DDB0232428 . EST_match 162671 163372 . + + . ID=DDB0126764;Target=DDB0126764 1 700;Gap=M701
DDB0232428 . EST_match 162675 163332 . + + . ID=DDB0028393;Target=DDB0028393 1 663;Gap=M657
DDB0232428 . EST_match 162687 163332 . + + . ID=DDB0065179;Target=DDB0065179 1 661;Gap=M645
DDB0232428 . EST_match 162699 163215 . - + . ID=DDB0018629;Target=DDB0018629 1 524;Gap=M516

That's a real GFF file downloaded from the web. It didn't have "locus_tag" tags, so I used "Gap", but it took longer to find the file than to parse it.
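For anyone who would rather have a script than a one-liner, here is a rough sketch of the same filter spelled out. The sub name has_attribute and the TAG command-line argument are made up for illustration; they are not part of the one-liner or of bioperl. It assumes tab-separated columns with the attributes in the ninth column, and it is slightly stricter than the one-liner: it only matches the tag at the start of an attribute, not anywhere in the column.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the attributes column (column 9) of a
# tab-separated GFF line carries the given tag, e.g. "Gap" or "locus_tag".
sub has_attribute {
    my ($line, $tag) = @_;
    chomp $line;
    return 0 if $line =~ /^#/;              # skip comments and directives
    my @F = split /\t/, $line, -1;          # same split that -F"\t" -a performs
    return 0 unless defined $F[8];          # fewer than 9 columns: not a feature line
    # Match the tag only at the start of an attribute (start of column or after ';').
    return $F[8] =~ /(?:^|;)\Q$tag\E=/ ? 1 : 0;
}

# Runs only when a tag is given on the command line:
#   perl filter_gff.pl TAG file.gff
if (@ARGV) {
    my $tag = shift @ARGV;
    while (my $line = <>) {                 # remaining arguments are read as files
        print $line if has_attribute($line, $tag);
    }
}
```

Saved as, say, filter_gff.pl (the name is arbitrary), it would be run as: perl filter_gff.pl Gap chromosome_1.gff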


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

Re^6: bioperl newbie's question: simple GFF3 peocessing
by daverave (Scribe) on Jul 31, 2010 at 09:48 UTC
    You're probably right, but this will once again make me skip learning a bit of bioperl.

    I think I already mentioned this in the past, but learning new stuff is something I enjoy, although in the short run it might sometimes take longer than using techniques I already know.

    Thanks for the help though, I do appreciate it.

      Hm. Long ago in my Mech.Eng. days, I spent 4 hours setting up a radial grinder to radius the edge of a push-fit spigot. The instructor watched from the other side of the shop until I got everything perfect and was putting my safety glasses on before coming over. Without saying a word he took the spigot out of the chuck, thereby completely screwing all the set-up work I'd done. I nearly had apoplexy.

      He then walked over to a bench, mounted it in a standard vice, picked up a fine file and radiused the edge with about six gentle strokes of the file. Job done.

      The lesson was that it was the taper on the spigot that required +0-0.0004" accuracy. The radius on the end is simply to stop it from binding in the hole.

      Expend your time wisely, and choose tools that are fit for purpose. Filtering a subset of line records from a text file is a bread-and-butter, simple Perl problem, regardless of whether it contains last week's football scores or top-secret NBC Bio data.

