in reply to Re^3: An overlapping regex capture
in thread An overlapping regex capture

Okay, I fixed the items you suggested, and my script now looks like this:
#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', "human_hg19_circRNAs_putative_spliced_sequence.fa" or die $!;

my %id2seq;
while (<$fh>) {
    my $id = '';
    chomp;
    if ($_ =~ /^>(.+)/) {
        $id = $1;
    } else {
        $id2seq{$id} .= $_;
    }
}

foreach my $id (keys %id2seq) {
    my $filename = (split /\|/, $id)[0];
    open my $out_fh, '>>', "$filename" or die $!;
    print $out_fh ">".$id."\n", $id2seq{$id}, "\n";
    close $out_fh;
}
close $fh;

How do I use the value I've split out of the ID as the file name? Perl warns that it is uninitialised.

Although, I thought that it was clearly initialised/defined here:

my $filename = (split /\|/, $id)[0];
open my $out_fh, '>>', "$filename" or die $!;

Or maybe I'm just misunderstanding the scope? Where do I place the $filename in the loop?

Pete.

Re^5: An overlapping regex capture
by poj (Abbot) on Jun 22, 2017 at 15:33 UTC

    The problem is earlier, in the read loop: $id is set on the > lines, but because my $id = '' sits inside the while loop it is reset to the empty string on every subsequent sequence line.

    while (<$fh>) {
        my $id = '';
        chomp;
        if ($_ =~ /^>(.+)/) {
            $id = $1;
        } else {
            $id2seq{$id} .= $_;
        }
    }

    Try

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $id;
    my %id2seq;
    my $infile = 'human_hg19_circRNAs_putative_spliced_sequence.fa';
    open my $fh, '<', $infile or die "Could not open $infile : $!";

    while (<$fh>) {
        if ( /^>(.+)/ ) {
            $id = (split /\|/, $1)[0];
        }
        $id2seq{$id} .= $_;
    }

    foreach my $id (keys %id2seq) {
        my $filename = $id.'.fa';
        print "Creating $filename\n";
        open my $out_fh, '>', $filename or die "Could not open $filename : $!";
        print $out_fh $id2seq{$id};
        close $out_fh;
    }
    close $fh;
    poj
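
To see the scoping difference in isolation, here is a minimal sketch that runs both loop shapes over the same in-memory lines. The sample headers and the sub names parse_resetting and parse_persistent are made up for illustration; the headers in the real file may look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample records; the real FASTA headers may differ.
my @lines = (
    ">id1|chr1:100-200\n",
    "ACGT\n",
    "TTGA\n",
    ">id2|chr2:300-400\n",
    "GGCC\n",
);

# Original shape: $id is declared inside the loop, so it is reset to ''
# on every line and all sequence lines are stored under the empty key.
sub parse_resetting {
    my %id2seq;
    for my $line (@_) {
        my $id = '';
        ( my $text = $line ) =~ s/\n\z//;
        if ( $text =~ /^>(.+)/ ) {
            $id = $1;
        }
        else {
            $id2seq{$id} .= $text;
        }
    }
    return \%id2seq;
}

# Fixed shape: $id is declared outside the loop, so it keeps the value
# captured from the most recent header line.
sub parse_persistent {
    my %id2seq;
    my $id = '';
    for my $line (@_) {
        ( my $text = $line ) =~ s/\n\z//;
        if ( $text =~ /^>(.+)/ ) {
            $id = $1;
        }
        else {
            $id2seq{$id} .= $text;
        }
    }
    return \%id2seq;
}

my $bad  = parse_resetting(@lines);
my $good = parse_persistent(@lines);

print "resetting:  ", join( ', ', map { "'$_'" } sort keys %$bad ),  "\n";
print "persistent: ", join( ', ', map { "'$_'" } sort keys %$good ), "\n";
```

With the declaration inside the loop, the hash ends up with a single empty-string key, which is why the later split has nothing to work with. With the declaration outside, each header becomes its own key.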
Re^5: An overlapping regex capture
by 1nickt (Canon) on Jun 22, 2017 at 12:01 UTC

    Did you try printing the values of your variables as I suggested?


    The way forward always starts with a minimal test.

      Yes, I tried it and got the following:

      ID: at seqextractor.pl line 23, <$fh> line 130.

      Segments: $VAR1 = [];

      That was from using the following:

      foreach my $id (keys %id2seq) {
          warn "ID: $id";
          my @segments = split /\|/, $id;
          warn "Segments: " . Dumper \@segments;
          my $filename = $segments[0];
      }

      Does it mean that the array is empty?

        Yes, that's right. And can you also see why the array is empty, from your debug statements?


        The way forward always starts with a minimal test.
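
The debug output above shows an empty ID, and split returns an empty list when given an empty string, so $segments[0] is undef. A minimal sketch of that behaviour follows; the helper name first_field and the sample header are made up for illustration.

```perl
use strict;
use warnings;

# Return the first '|'-separated field of an ID, or undef if there is none.
sub first_field {
    my ($id) = @_;
    my @segments = split /\|/, $id;    # empty string in, empty list out
    return $segments[0];
}

# Hypothetical header text; the real IDs may look different.
for my $id ( 'id1|chr1:100-200', '' ) {
    my $field = first_field($id);
    printf "ID '%s' -> %s\n", $id, defined $field ? "'$field'" : 'undef';
}
```

The non-empty ID yields 'id1', while the empty ID yields undef, which matches the "uninitialised" warning seen when the file name is used.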