in reply to Re^5: When the input file is huge !!!
in thread When the input file is huge !!!
Writing pre- and post-filters that convert between FASTA and single-line records isn't hard, and it can be relatively fast, so long as you don't use Bio::*.
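To make that concrete, here is a minimal sketch of such a filter pair in plain Perl. The sub names `fasta_to_lines` and `lines_to_fasta`, the tab separator between header and sequence, and the default wrap width of 60 are all my own assumptions for illustration, not anything fixed by the FASTA format or by this thread. It also assumes header lines contain no tab characters.

```perl
use strict;
use warnings;

# Pre-filter (hypothetical name): turn each FASTA record into one line,
# "header<TAB>sequence". Assumes headers contain no tab characters.
# Sequence chunks are printed as they are read, so a long record is never
# held in memory as a whole.
sub fasta_to_lines {
    my ($in, $out) = @_;
    my $have_record = 0;
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;                   # strip LF or CRLF
        if ($line =~ /^>/) {
            print {$out} "\n" if $have_record;  # end the previous record
            print {$out} $line, "\t";
            $have_record = 1;
        }
        elsif ($have_record) {
            print {$out} $line;                 # append sequence data
        }
    }
    print {$out} "\n" if $have_record;
}

# Post-filter (hypothetical name): split "header<TAB>sequence" lines back
# into FASTA, wrapping the sequence at $width columns (default 60, an
# arbitrary choice).
sub lines_to_fasta {
    my ($in, $out, $width) = @_;
    $width ||= 60;
    while (my $line = <$in>) {
        chomp $line;
        my ($header, $seq) = split /\t/, $line, 2;
        print {$out} $header, "\n";
        next unless defined $seq && length $seq;
        my $last_chunk = int((length($seq) - 1) / $width);
        print {$out} substr($seq, $_ * $width, $width), "\n"
            for 0 .. $last_chunk;
    }
}
```

Both subs take an input and an output filehandle, so they can be wired to `STDIN` and `STDOUT` and placed on either side of whatever sort you use. Note that the post-filter still has to read one whole record per line.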
The problem then is that some of the sequences can be so long that some system sort utilities cannot handle the line length. Sad but true.
Doing a sort in Perl, pure Perl, that goes beyond a few tens of millions of records is a complete waste of time. It requires so much memory per item that it almost always results in either swapping or 'Out of memory'.
Re^7: When the input file is huge !!!
by tilly (Archbishop) on Jan 07, 2009 at 21:08 UTC