I'm currently processing large .txt files full of biological data and, in an effort to reduce the size of all resulting downstream files, I'm looking to split everything by chromosome (i.e. one file per chromosome) as opposed to printing everything into single files.
I have a list of chromosome labels, such as 1 2 3 4 5 6 7 8 9 10 that can be supplied to the script and used to generate unique filenames e.g. 1_Sample_Info.txt and 2_Sample_Info.txt. Each line of the input data that is processed contains a chromosome label that ends up in a split array ($F[0]), allowing it to be diverted toward the correct output file.
However, I am struggling to figure out how to best open the files in the first place. As of right now, files are opened with:
    open $OUT, '>', "$subDir/$outfile" or die "$!";
And printed to with:
    print $OUT "Example\n";
The number of chromosomes and their labels will often differ, so the files cannot be explicitly specified. Therefore, I think the files may have to be generated within a loop that iterates over the supplied list. But how can I generate a unique filehandle for each output file to use later with print statements? Another issue is that the chromosome labels are often numeric (as above), so if used directly as filehandles or variable names they clash with global variables, e.g. $1.
Any suggestions or examples would be greatly appreciated. If at all possible I'd like to avoid the use of any non-core modules.
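For reference, one possible direction is to keep the filehandles in a hash keyed by chromosome label. The sketch below is only that, a sketch: the label list, the tab-separated sample lines and the temporary directory standing in for $subDir are all made up for illustration, and only core modules are used. Because the labels are hash keys rather than variable or filehandle names, numeric labels should not collide with globals like $1.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);    # core module, used here only to get a scratch directory

# Hypothetical inputs: a label list and a directory standing in for $subDir.
my @chromosomes = qw(1 2 3 X);
my $subDir      = tempdir( CLEANUP => 1 );

# Open one lexical filehandle per label and store it in a hash keyed by label.
my %out;
for my $chr (@chromosomes) {
    my $path = "$subDir/${chr}_Sample_Info.txt";
    open $out{$chr}, '>', $path or die "Cannot open $path: $!";
}

# Made-up, tab-separated sample lines; the first field is the chromosome label.
my @lines = ( "1\t100\tA\n", "2\t200\tC\n", "X\t300\tG\n", "1\t400\tT\n" );

for my $line (@lines) {
    my @F  = split /\t/, $line;
    my $fh = $out{ $F[0] } or next;    # skip labels that were not supplied
    print {$fh} $line;                 # the block form is needed for a handle held in an expression
}

# Close every handle so buffered output is flushed to disk.
for my $chr ( keys %out ) {
    close( $out{$chr} ) or die "Cannot close file for $chr: $!";
}
```

Note the braces in "${chr}_Sample_Info.txt": without them Perl would look for a variable named $chr_Sample_Info. With many labels, the number of simultaneously open files may run into the operating system's per-process limit, so that is worth checking for large label lists.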
In reply to Opening multiple output files within a loop by TJCooper