So sometimes TTGT might come out as TGGT, or GGTT might come out as GGTA, neither of which is one of the three barcodes I am after.
If you wanted to add those other barcodes to the list, they'd each get their own file.
I guess with the new script you posted I need to specify 'other' with all possible combinations of the four letters that might be in my fastq, to make sure that the script doesn't stall.
If you are happy for all the "others" to go into a single file called 'other.fastQ', use the new version as is.
Come to that, if you wish to simply ignore them, use this version:
#! perl -sw
use strict;

my %outFHs = map {
    open my $fh, '>', "$_.fastQ" or die $!;
    $_ => $fh;
} qw[ TTGT GGTT ACCT ];

until( eof() ) {
    my @lines = map scalar <>, 1 .. 4;
    my $barcode = substr $lines[1], 0, 9;
    my $tag = substr $barcode, 3, 4;
    next unless exists $outFHs{ $tag };
    print { $outFHs{ $tag } } @lines;
}
__END__
usage: thisScript theBigfile.fastQ
## outputs to TTGT.fastQ GGTT.fastQ ACCT.fastQ
## Unrecognised records are ignored
If there is a high proportion of other records, skipping them rather than writing them out could speed things up substantially.
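For comparison, the catch-all behaviour described above (every unrecognised tag going to a single 'other.fastQ') comes down to one hash lookup with a fallback, so there is no need to enumerate every four-letter combination. This is only a minimal sketch of that idea, not the earlier script itself: the names bucket_for and %known are hypothetical, and it assumes the same offsets as the version above (4 bases starting at offset 3 of the sequence line).

```perl
use strict;
use warnings;

# Hypothetical set of wanted barcodes; anything else falls through to 'other'.
my %known = map { $_ => 1 } qw[ TTGT GGTT ACCT ];

# Hypothetical helper: given a record's sequence line, return the name of
# the output bucket it belongs in.
sub bucket_for {
    my( $seq ) = @_;
    my $tag = substr $seq, 3, 4;    # 4 bases starting at offset 3
    return exists $known{ $tag } ? $tag : 'other';
}

print bucket_for( "NNNTTGTNNAAAA\n" ), "\n";    # a wanted tag
print bucket_for( "NNNTGGTNNAAAA\n" ), "\n";    # a misread tag
```

The first call prints TTGT and the second prints other. Replacing 'other' with a skip (as `next unless exists` does in the version above) is what avoids the extra writes.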
In reply to Re^5: Deconvolutinng FastQ files
by BrowserUk
in thread Deconvolutinng FastQ files
by snakebites