Peter Keystrokes has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am trying to print the contents of my hash to newly created files. The value of the hash is the sequence and the hash key is the ID for the sequence. I was trying to get my script to loop through my hash and create new files with the name of the files being the sequence ID and the contents being the value of the hash. I can't figure out how to do this successfully. Below is my script:
my %id2seq = ();
my $id = '';
open File,"human_hg19_circRNAs_putative_spliced_sequence.fa",or die $!;
while(<File>){
    chomp;
    if($_ =~ /^>(.+)/){
        $id = $1;
    }else{
        $id2seq{$id} .= $_;
    }
}
my $filename = "'$id'";
open (my $fh, '>', $filename) or die "Could not open '$filename' $!";
foreach $id (keys %id2seq){
    print $fh ($id."\n",$id2seq{$id}, "\n");
}
close File;
close $fh;
Any pointers will be well appreciated.

Re: I'm trying to print the contents of a hash to newly created files
by huck (Prior) on May 08, 2017 at 15:23 UTC

    If you want multiple files, shouldn't the open/close be inside the foreach loop?

    And my $filename = "'$id'"; is going to try to open a file with single quotes in its name; I kinda doubt you want that.

      Thank you, sir. I've incorporated your advice into my code in conjunction with advice about how I open my file handle and it seems to be working. Here's how my code looks now:
      my %id2seq = ();
      my $id = '';
      open File,"human_hg19_circRNAs_putative_spliced_sequence.fa",or die $!;
      while(<File>){
          chomp;
          if($_ =~ /^>(.+)/){
              $id = $1;
          }else{
              $id2seq{$id} .= $_;
          }
      }
      foreach $id (keys %id2seq){
          open my $out_fh, '>>', $id or die $!; ##Amendment here
          print $out_fh ($id."\n",$id2seq{$id}, "\n");
          close $out_fh; ## moved into the foreach loop
      }
      close File;
Re: I'm trying to print the contents of a hash to newly created files
by vrk (Chaplain) on May 08, 2017 at 15:31 UTC

    I'm wondering why you need the hash in the first place. Would it not be simpler to open, write and close the output file for each ID as you're reading the input file? Then you don't need to keep anything in memory. Something like:

    use strict;
    use warnings;

    open my $in_fh, '<', "human...fa" or die $!;
    my $out_fh;
    while (defined(my $line = <$in_fh>)) {
        if ($line =~ /^>(.*)/) {
            my $id = $1;
            if ($out_fh) {
                close $out_fh or die $!;
            }
            open $out_fh, '>', $id or die $!;
            next;
        }
        if ($out_fh) {
            print $out_fh $line;
        }
    }

    Note: code above is untested!

    Also, you may be better off capturing only known good characters for the ID to avoid special characters (like shell redirection symbols) in filenames.
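
One way to act on the "known good characters" suggestion is to map the ID to a safe name before opening the file. The sketch below is only an illustration: the helper name safe_filename and the allowed character set (letters, digits, dot, underscore, hyphen) are assumptions, not something taken from the thread. Note that two different IDs could collapse to the same name after substitution.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only characters assumed to be safe in a
# filename and replace every other character with an underscore.
sub safe_filename {
    my ($id) = @_;
    (my $name = $id) =~ s/[^A-Za-z0-9._-]/_/g;
    return $name;
}

# An ID in the style of the FASTA headers discussed in this thread.
print safe_filename('hsa_circ_0000001|chr1:1080738-1080845-|None|None'), "\n";
# prints hsa_circ_0000001_chr1_1080738-1080845-_None_None
```

The result of safe_filename($id) would then be passed to open in place of the raw $id.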

      You're assuming that each id only occurs once per file.

      open $out_fh, '>>', $id or die $!;

      … might be better.

        The replacement of $filename with $id was very useful and it is clear to me, but can you explain to me the significance of '>>' as opposed to '>'? My script seems to be working thanks to your input in conjunction with another Perl Monk. Thank you, sir.
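
On the '>' versus '>>' question: both modes create the file if it does not exist, but '>' truncates an existing file to zero length before writing, while '>>' appends to whatever is already there. A minimal sketch of the difference follows; the scratch directory, the file name demo.txt and the helper write_line are placeholders invented for this demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory and file name are placeholders for this demo only.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";

# Write one line to $file using the given open mode, then return
# the file's full contents.
sub write_line {
    my ($mode, $text) = @_;
    open my $fh, $mode, $file or die "Cannot open '$file': $!";
    print $fh "$text\n";
    close $fh or die "Cannot close '$file': $!";

    open my $in, '<', $file or die "Cannot open '$file': $!";
    local $/;                      # slurp mode: read the whole file at once
    my $contents = <$in>;
    close $in;
    return $contents;
}

print write_line('>',  'first');   # file holds "first"
print write_line('>>', 'second');  # file holds "first" then "second"
print write_line('>',  'third');   # file holds only "third"
```

So with '>>' a repeated ID adds to its file instead of overwriting it, but running the script a second time also appends to the files left over from the first run.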
Re: I'm trying to print the contents of a hash to newly created files
by perldigious (Priest) on May 08, 2017 at 15:24 UTC

    Hi Peter Keystrokes,

    Could you please provide a sample of your data from your file? Even just a few lines surrounded by code tags would be useful.

    Just another Perl hooker - Working on the corner... corner conditions that is.
      Typical fasta file for DNA/RNA sequences:
      >hsa_circ_0000001|chr1:1080738-1080845-|None|None
      ATGGGGTTGGGTCAGCCGTGCGGTCAGGTCAGGTCGGCCATGAGGTCAGGTGGGGTCGGCCATGAAGGTGGTGGGGGTCATGAGGTCACAAGGGGGTCGGCCATGTG
      >hsa_circ_0000002|chr1:1158623-1159348-|NM_016176|SDF4
      GGTGGATGTGAACACTGACCGGAAGATCAGTGCCAAGGAGATGCAGCGCTGGATCATGGAGAAGACGGCCGAGCACTTCCAGGAGGCCATGGAGGAGAGCAAGACACACTTCCGCGCCGTGGACCCTGACGGGGACGGTCACGTGTCTTGGGACGAGTATAAGGTGAAGTTTTTGGCGAGTAAAGGCCATAGCGAGAAGGAGGTTGCCGACGCCATCAGGCTCAACGAGGAACTCAAAGTGGATGAGGAAA