in thread writing array element to a file

Hi! Thanks everyone for the help. I know I made a lot of silly mistakes, but I have only just started with Perl. I tried to run the script you suggested, but it doesn't do exactly what I need. I need to write to stdout only the number after the >, without the lines that follow. Here are a few lines of my input file:
> 0 CATCAAAATAATCATGTATTGGTAAAAGTTTAGTAAAAATACTAAAACTATTGACAATTCAAACTAATAC
+TTGTATAATGGAAGCGTATTCAAAAAATAAACAGGAGGTTCTCATAATGAGAAAATCTAACGTTCAGAT
+GAAGTCTCGTCTATCCTATGCAGCGGGTGCTTTTGGTAACGACGTCTTCTATGCAACGTTGTCAACATA
+CTTTATT
> 1 CATCAAAATAATCATGTATTAGTAAAAGTTTAGTAAAAAATACTAAAACTATTGACAATTCAAACTAATA
+CTTGTATAATGGAAGCGTATTCAAAAAATAACAGGAGGTTCTCATAATGAGAAAATCTAACGTTCAGAT
+GAAGTCTCGTCTATCCTATGCAGCGGGTGCTTTTGGTAACGACGTCTACTATGCAACGTTGTCAACATA
+CTTTATT
> 10 CATCAAGATAACCATGTATTAGTAAAATTTTAGTAAAAAACACTGAAATTATTGACTGCATAAACCAATT
+TTCATATAATGTAAACGTATTCAAATAATAGGAGGTTTCCGAAATGGAAAAATCTAAAGGTCAGATGAA
+GTCTCGTTTATCCTACGCAGCTGGTGCTTTTGATAACGACGTCTTCTATGCAACCTTGTCAACATTACT
+TTATC
> 100 CATCAAAATAATCATGTATTAGTAAAAGTTTAGTAAAAATACTAAAACTATTGACAATTCAAACTAATAC
+TTGTATAATGGAAGCGTATTCAAAAAATGACAGGAGGTTCTCATAATGAGAAAATCTAACGTTCAGATG
+AAGTCTCGTCTATCCTATGCAGCGGGTGCTTTTGGTAACGACGTCTTCTATGCAACGTTGTCAACATAC
+TTTATT
...
I want to copy only the numbers. Running the script you suggested, I obtained this:
> 0 CATCAAAATAATCATGTATTGGTAAAAGTTTAGTAAAAATACTAAAACTATTGACAATTCAAACTAATAC
+TTGTATAATGGAAGCGTATTCAAAAAATAAACAGGAGGTTCTCATAATGAGAAAATCTAACGTTCAGAT
+GAAGTCTCGTCTATCCTATGCAGCGGGTGCTTTTGGTAACGACGTCTTCTATGCAACGTTGTCAACATA
+CTTTATT
> 1 CATCAAAATAATCATGTATTAGTAAAAGTTTAGTAAAAAATACTAAAACTATTGACAATTCAAACTAATA
+CTTGTATAATGGAAGCGTATTCAAAAAATAACAGGAGGTTCTCATAATGAGAAAATCTAACGTTCAGAT
+GAAGTCTCGTCTATCCTATGCAGCGGGTGCTTTTGGTAACGACGTCTACTATGCAACGTTGTCAACATA
+CTTTATT
How can I modify it? Many thanks for your help! Francesca

Re^3: writing array element to a file
by Random_Walk (Prior) on Apr 25, 2013 at 13:11 UTC

    So you have a series of lines that look something like this

    > 22 GATTGATGCC...
    > 2 GATGGATGTG...
    > 26 GATGCATGAT...
    > 52 GATGATGTGG...

    And in your output file you just want the numbers.

    The split in the original code breaks each line on whitespace and stores the pieces in the @vettore array. If we only want to print the second element of this array (the number), we do not need the foreach loop. We can alter our print to address the second element directly: print $out "$vettore[1]\n"; (array indexing starts at 0). Here is our new line-processing block:

    while (my $line = <$in>) {
        if ($line =~ /^>/) {
            my @vettore = split(/\s+/, $line);
            print $out "$vettore[1]\n";
        }
    }
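For anyone who wants to try the block above without the real data file, here is a minimal self-contained sketch. The `extract_ids` helper and the shortened sample records are made up for illustration (they are not part of the original script), and the sketch assumes each record starts on its own line with `>` followed by whitespace and the number.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the number from every line that starts with '>'.
sub extract_ids {
    my ($in) = @_;
    my @ids;
    while (my $line = <$in>) {
        next unless $line =~ /^>/;          # skip anything that is not a header line
        my @vettore = split /\s+/, $line;   # ('>', '0', 'CATCAAAATA', ...)
        push @ids, $vettore[1];             # element 1 is the number
    }
    return @ids;
}

# Made-up sample standing in for the input file (sequences shortened).
my $sample = "> 0 CATCAAAATA\nTTGTATAATG\n> 1 CATCAAAATA\n> 10 CATCAAGATA\n";

# Read from an in-memory filehandle so no file on disk is needed.
open my $in, '<', \$sample or die "Cannot open sample data: $!";
my @ids = extract_ids($in);
close $in;

print "$_\n" for @ids;
```

With a real file, the in-memory handle would be replaced by an ordinary `open my $in, '<', $filename`, and the `print` would go to the `$out` handle used in the thread.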

    For fun, it can also be done as a one-liner. Here I added a bit more checking of the line to ensure it has some GATC characters following the number:

    perl -nle "if (/^>\s+(\d+)\s+[GATC]+/) {print $1}" rep_set_ass_tax.fna >> seq_id.txt

    If you are not on Windows, you may need to change the two " quotes to ' quotes, so that the shell does not try to expand $1 itself.

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!
      Hi R! I modified the script as you suggested:

          while (my $line = <$in>) {
              if ($line =~ /^>/) {
                  my @vettore = split(/\s+/, $line);
                  print $out "$vettore1\n";
              }
          }

      but the output file only contained 0, which is the number from the first line. What did I do wrong?

        It looks like you missed the square brackets around the array index in the print line.

        print $out "$vettore1\n";    # bad: $vettore1 is a separate (undeclared) scalar
        print $out "$vettore[1]\n";  # good: element 1 of the array @vettore
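One way to catch this kind of slip early is to enable `use strict`, assuming the script does not already do so. The sketch below is only an illustration: the sample line is made up, and the string `eval` is there purely so the compile error can be shown without stopping the program.

```perl
use strict;
use warnings;

# Made-up header line in the same shape as the input file.
my @vettore = split /\s+/, "> 10 CATCAAGATA";

# Correct: interpolates element 1 of the array @vettore.
my $good = "$vettore[1]";

# Without the brackets, Perl looks for a scalar named $vettore1.
# Under 'use strict' that undeclared variable is a compile-time error,
# which the string eval captures in $@ instead of aborting the script.
my $ok    = eval q{ my $x = "$vettore1\n"; 1 };
my $error = $@;

print "good: $good\n";
print "compile error: $error" unless $ok;
```

In a normal script the error would appear as soon as the program is started, before any line of input is read, which makes the missing brackets much easier to spot.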

        Cheers,
        R.
