
Thank you to all the Monks! I made the mods suggested by you, GrandFather, and by the Monks before you on this node. For the larger datasets there does not appear to be any difference: I am running one such job now, and it's been 30 minutes already. As you point out, the problem is likely that my algorithm/logic/approach in the EXTRACT and START loops does not scale up well. Any thoughts on improving this scale-up for massive files? Some input datasets can be found at http://bit.ly/1K69JuQ

The command lines would be:

perl Length_dstrbtn_seq_extractor.pl Ath167_sORF.facleaned-up_ReMapped_v2-5p-flanking.fa Athaliana_167_TAIR10.cds_primaryTranscriptOnly.facleaned-up

perl Length_dstrbtn_seq_extractor.pl Ath167_sORF.facleaned-up_ReMapped_v2-5p-flanking.fa Athaliana_167_intron_ONLY_FASTextract-intronic-seqs.fasta

perl Length_dstrbtn_seq_extractor.pl Ath167_sORF.facleaned-up_ReMapped_v2-3p-flanking.fa Athaliana_167_TAIR10.cds_primaryTranscriptOnly.facleaned-up

perl Length_dstrbtn_seq_extractor.pl Ath167_sORF.facleaned-up_ReMapped_v2-3p-flanking.fa Athaliana_167_intron_ONLY_FASTextract-intronic-seqs.fasta

There are other datasets as well: some are very small, and one that is still uploading is very large.

Re^3: Speeding up stalled script
by GrandFather (Saint) on Feb 04, 2015 at 05:43 UTC

    As a general thing we would rather see you include the minimum data required to reproduce the problem along with your node. Linking to data elsewhere has the issue that the linked data may change, be removed, or move. In this case your link seems to be broken.

    If you would like further help with this issue I suggest you mock up a very small data set (no more than a few dozen lines of text) for us to play with. It also helps if you can indicate the type of output expected from running the script against your sample data set so we know what we are aiming at.

    Perl is the programming world's equivalent of English