Re: A program to extract the reads and modify the seq ID

G'day Teju,

Welcome to the monastery.

I think this is closer to what you want:

#!/usr/bin/env perl

use strict;
use warnings;
use autodie;

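# Slurp the ID file and build a hash from its whitespace-separated tokens,
# taken pairwise as ID => weight (any leading '>' is stripped).
open my $id_fh, '<', 'sample_ID.txt';
my %ids = map { s/^>//; $_ } grep { length } split /\s+/ => do { local $/; <$id_fh> };
close $id_fh;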

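{
    # Read one FASTA record at a time: records are delimited by "\n>".
    local $/ = "\n>";
    open my $seq_fh, '<', 'sample_reads.fasta';

    while (<$seq_fh>) {
        # The first whitespace-delimited token is the sequence ID.
        (my $key = (split)[0]) =~ s/^>//;
        next unless exists $ids{$key};
        my ($head, $seq) = split /\n/, $_, 2;
        # Only the first record keeps its leading '>'; restore it for the rest.
        print '>' unless $head =~ /^>/;
        print "${head}_weight=$ids{$key}\n";
        # Strip the trailing separator, spaces, and line breaks from the sequence.
        $seq =~ y/> \n//d;
        print "$seq\n";
    }

    close $seq_fh;
}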
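In case the "\n>" record separator looks odd, here's a minimal, standalone sketch of just that part, run against an in-memory string rather than a file. The fasta_records() helper and the demo data are made up for illustration; they are not part of the script above.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper (not in the script above): split FASTA text into
# [header, sequence] pairs using "\n>" as the input record separator.
sub fasta_records {
    my ($text) = @_;
    my @records;

    # Open a read-only filehandle on the string itself.
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "\n>";

    while (my $record = <$fh>) {
        chomp $record;          # drops a trailing "\n>" (the current $/)
        $record =~ s/^>//;      # only the first record starts with '>'
        my ($head, $seq) = split /\n/, $record, 2;
        $seq = '' unless defined $seq;
        $seq =~ tr/\n//d;       # join a wrapped sequence into one line
        push @records, [$head, $seq];
    }

    close $fh;
    return @records;
}

# Made-up demo data: two records, the first with a wrapped sequence.
my $demo = ">id1 len=8\nACGT\nACGT\n>id2 len=4\nTTTT\n";
printf "%s => %s\n", @$_ for fasta_records($demo);
```

With that demo string, this should print "id1 len=8 => ACGTACGT" and "id2 len=4 => TTTT". The script above gets the same effect without chomp: it deletes the stray '>' with y/// and prints the leading '>' back where it's missing.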

I only used the first five blocks of data from each file for my test. Here's the output:

>comp10003_c0_seq1 len=166 path=[748:0-22 1004:23-46 2527:47-165]_weight=41
AAGTAGCCTATGCGCTACAGTAAGAAAGACAGGTGAAAAAATGGAAGTAAAACAATTAGATGACTACTTTGGATATACAGAAAAGGGCAGTTCCTTAGAGGGGGAATTACGAGCAGGACTAACGACATTCTTGACAATGGCGTACATTCTGTTTGTGAACCCAGAC
>comp10004_c0_seq1 len=143 path=[2167:0-44 2322:45-68 2508:69-142]_weight=25
AATCTTTAATTTAAACTTAAAAAAAATTAACTTTTGAAAGGAATTAAAATGGAAAAAGAAATGTTAGTAGTAGCTAAATTAAAAGAAGGTACATTTGAAAAATTTATGGGTTTCATGCAATCGCCTGAAGGTTTAGCAGAAAG

-- Ken
