la has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am trying to write a Perl one-liner that takes a text file as input and outputs its contents in FASTA format.

Text file attributes:
- 100,000 rows (100,000 sequences) in the text file
- Headings of my text file are: Sequence, Name, Count, Countb

I want the output to look like:
>Sequence1_Name_Count_Countb

Sequence1

>Sequence2_Name_Count_Countb

Sequence2

...

>Sequence100000_Name_Count_Countb

Sequence100000

The Perl one liner I have so far is:

cat file.txt | perl -ne 'chomp;@a=split(/\t/);$a[0]=~s/\s+//g;$a[1]=~s/\s+//g;$a[2]=~s/\s+//g;$a[3]=~s/\s+//g; print ">$a[0]_$a[2]_$a[1]_$a[3]\n$a[0]\n";' > fastafile.fa

The only thing that prints is the first sequence in FASTA format; nothing else prints at all. Can someone help me modify my one-liner so that it prints every row of the text file in the correct format rather than just the first? Am I missing a loop?

Thank you in advance.

Replies are listed 'Best First'.
Re: Printing all lines of a file using a perl one liner
by eyepopslikeamosquito (Archbishop) on Sep 28, 2011 at 22:31 UTC
Re: Printing all lines of a file using a perl one liner
by Util (Priest) on Sep 28, 2011 at 21:56 UTC

    Add the -w flag to turn on warnings (-wne instead of -ne).
    This might help you see that you are using $a1 $a2 $a3 where you probably meant to use $a[1] $a[2] $a[3].

    Update: I see that I misread your post, because you did not put your code into < code >...</ code > tags. This is making $a[1], etc display incorrectly in our browsers.

    Update2: I give you points for trying for code tags, even though it went awry. The "preview" button is your friend! (And I need to read more slowly.)

      thanks for the tips!
Re: Printing all lines of a file using a perl one liner
by Marshall (Canon) on Sep 28, 2011 at 23:03 UTC
    I am trying to write a perl one liner that...

    You start with a problem statement whose solution is meaningless in the real world. Perl Golf is an interesting game, but I'm not sure you realize that you are playing golf. If this is not a golf competition, then why on earth would you write such code?

    I do not see how this code makes any sense:

    chomp; @a=split(/\t/); $a[0]=~s/\s+//g; $a[1]=~s/\s+//g; $a[2]=~s/\s+//g; $a[3]=~s/\s+//g;
    This code has nothing to do with the FASTA spec.

    I highly recommend using the BioPerl modules. The FASTA format is simple and I wrote one parser at Re: Bio perl package - I think I've written more. But even so, I don't recommend my code as the be-all and end-all.

      As it turns out, it was something really simple. My script wasn't working because the input file was not saved in the right format. I created my text file in Excel (Office for Mac 2011) and saved it as a tab-delimited text file. With that input, the script did not work. When I re-saved it as Windows Formatted Text, the script worked great and my text file was converted to FASTA format with no problem.

      Thank you so much for your input everyone. It is appreciated.