Greetings Esteemed Monks,

I am relatively new to Perl so this may be an easy fix.

My script is meant to read a CSV file, parse each line, push a FASTA-formatted string built from the fields onto an array, and then write that array to a file.

The problem is that a blank space (\s) is inserted before every entry except the first. I assume the problem lies in the way I am writing the array to the file, but I cannot find a method that deals with it.

Any help? (script shown below)

use Text::CSV;
use Data::Dumper qw(Dumper);

print "Enter file name: \n";
my $file = <STDIN>;
chomp $file;
print "Enter output file name: \n";
my $ofile = <STDIN>;

my $csv = Text::CSV->new({ sep_char => ',' });
my @fasta;

open(my $data, '<', $file) or die "Could not open '$file' $!\n";
while (my $line = <$data>) {
    chomp $line;
    if ($csv->parse($line)) {
        my @fields = $csv->fields();
        #print Dumper \@fields;
        $fields[4]=~s/\s//gs; #removes spaces within the sequence
        push @fasta,"\>$fields[0]\_$fields[1]\_$fields[2]\_$fields[3]\n$fields[4]\n"; #outputs the correct format
    } else {
        warn "Line could not be parsed: $line\n";
    }
}
#print Dumper \@fasta;
open (FH,">$ofile"), print FH"@fasta", close;
end;

Sample input: (it is in TSV format but is read into the script anyway without a problem)

XP_014917420.1 CYP26A1 Acinonyx jubatus Cheetah  MGFPFFGETLQMVLQRRKFLQMKRRKYGFIYKTHLFGRPTVRVMGADNVRRILLGEHRLV SVHWPASVRPILGSGCLSNLHDSSHKQRKKVIMRAFSREALQYYVPVIAEEVGTCLEQWL SCGERGLLVYPQVKRLMFRIAMRILLGCEPRLANGGDAEQQLVEAFEEMTRNLFSLPIDV PFSGLYRGMKARNLIHARIEENIRAKICGLRAAEAEAEAGGGCKDALQLLVDHSWERGER LDMQALKQSSTELLFGGHETTASAATSLITYLGLYPHVLQKVREELKSKGLLCKSNQDNK LDMEILGQLKYIGCVIKETLRLNPPVPGGFRVALKTFELNGYQIPKGWNVIYSICDTHDV ADIFTNKEEFNPDRFMLPHPEDASRFSFIPFGGGAKILLKIFTVELARHCDWRLLNGPPT MKTSPTVYPVDDLPARFTRFQGET
XP_002916147.1 CYP26A1 Ailuropoda melanoleuca Giant Panda MGLPALLASALCTFVLPLLLFLAAIKLWDLYCVSGRDRSCALPLPPGTMGFPFFGETLQM VLQRRKFLQMKRRKYGFIYKTHLFGRPTVRVMGADNVRRILLGEHRLVSVHWPASVRTIL GSGCLSNLHDSSHKQRKKVIMRAFSREALQCYVPVIAEEVGTCLEQWLSCGERGLLVYPQ VKRLMFRIAMRILLGCDPRLASGGDAEQQLVEAFEEMTRNLFSLPIDVPFSGLYRGMKAR NLIHARIEENIRAKICGLRTAEAASGCKDALQLLIEHSWERGERLDMQALKQSSTELLFG GHETTASAATSLITYLGLYPHVLQKVREELKSKGLLCKSNQDNKLDMEILEQLKYIGCVI KETLRLNPPVPGGFRVALKTFELNGYQIPKGWHVIYSICDTHDVADSFTNKDEFNPDRFL QPHPEDASRFSFIPFGGGLRSCVGKEFAKMLLKIFTVELARHCDWRLLNGPPTMKTSPTV YPVDGLPARFTHFQGEI
XP_006276679.1 CYP26A1 Alligator mississippiensis American Alligator MGFALLASALCTLLLPLLLFLAAVKLWGLYCESGRDPGCPLPLPPGTMGLPFFGETLQMV LQRRKFLQVKRRKYGCIYKTHLFGRPTVRVLGADNVRRILLGEHRLVAVQWPASVRTILG SGCLSNLHDARHKQRKKVIMRAFSRDALRHYAPVMQEEVSGCLARWLGRGGACLLVYPEV KRLMFRIAMRLLLGFEPHQADSGSERQLVEAFEEMSRNLFSLPIDVPFSGLYRGLRARNI IHARIEANIRNRMARAEPGGGPKDALQLLLEQAQRHGQPLNMQELKESATELLFGGHETT ASAATSLITFLGLHPEVLQKVRKELQGNGLLCSPNQDSKTLDMEVLEQLKYTGCVIKETL RLSPPVPGGFRVALKTFELNGYQIPKGWNVIYSICDTHDVAELFTNKDKFNPDRFMSPSP EDSSRFSFIPFGGGVRSCVGKEFAKILLKIFTVELARNCDWQLLNGPPTMKTGPIVYPVD NLPAKFVGFSGQI
XP_021123924.1 CYP26A1 Anas platyrhynchos Mallard  MGFSALLASALCTFLLPLLLFLAAVKLWDLYCVSSRDPSCPLPLPPGTMGLPFFGETLQM VLQRRKFLQMKRRKYGFIYKTHLFGRPTVRVMGAENVRHILLGEHRLVSVQWPGSPPPPP LPRPPGQVIMRAFSRDALQHYVPVIQEEVSACLARWLGAAGPCLLVYPEVKRLMFRIAMR ILLGFQPRQAGPDGEQQLVEAFEEMIRNLFSLPIDVPFSGLYRGLRARNIIHAKIEENIR AKMARKEPEGGYKDALQLLMEHTQGNGEQLNMQELKESATELLFGGHETTASAATSLIAF LGLHHDVLQKVRKELQVKGLLCSPNQEKQLDMEVLEQLKYTGCVIKETLRLSPPVPGGFR IALKTLELNGYQIPKGWNVIYSICDTHDVADLFTNKDEFNPDRFMSPSPEDSSRFSFIPF GGGLRSCVGKEFAKVLLKIFIVELARSCDWQLLNGPPTMKTGPIVYPVDNLPTKFIGFSG QI

Sample output: (note the \s added to every entry excluding the first)

>XP_002916147.1_CYP26A1_Ailuropoda melanoleuca_Giant Panda
MGLPALLASALCTFVLPLLLFLAAIKLWDLYCVSGRDRSCALPLPPGTMGFPFFGETLQMVLQRRKFLQMKRRKYGFIYKTHLFGRPTVRVMGADNVRRILLGEHRLVSVHWPASVRTILGSGCLSNLHDSSHKQRKKVIMRAFSREALQCYVPVIAEEVGTCLEQWLSCGERGLLVYPQVKRLMFRIAMRILLGCDPRLASGGDAEQQLVEAFEEMTRNLFSLPIDVPFSGLYRGMKARNLIHARIEENIRAKICGLRTAEAASGCKDALQLLIEHSWERGERLDMQALKQSSTELLFGGHETTASAATSLITYLGLYPHVLQKVREELKSKGLLCKSNQDNKLDMEILEQLKYIGCVIKETLRLNPPVPGGFRVALKTFELNGYQIPKGWHVIYSICDTHDVADSFTNKDEFNPDRFLQPHPEDASRFSFIPFGGGLRSCVGKEFAKMLLKIFTVELARHCDWRLLNGPPTMKTSPTVYPVDGLPARFTHFQGEI
 >XP_006276679.1_CYP26A1_Alligator mississippiensis_American Alligator
MGFALLASALCTLLLPLLLFLAAVKLWGLYCESGRDPGCPLPLPPGTMGLPFFGETLQMVLQRRKFLQVKRRKYGCIYKTHLFGRPTVRVLGADNVRRILLGEHRLVAVQWPASVRTILGSGCLSNLHDARHKQRKKVIMRAFSRDALRHYAPVMQEEVSGCLARWLGRGGACLLVYPEVKRLMFRIAMRLLLGFEPHQADSGSERQLVEAFEEMSRNLFSLPIDVPFSGLYRGLRARNIIHARIEANIRNRMARAEPGGGPKDALQLLLEQAQRHGQPLNMQELKESATELLFGGHETTASAATSLITFLGLHPEVLQKVRKELQGNGLLCSPNQDSKTLDMEVLEQLKYTGCVIKETLRLSPPVPGGFRVALKTFELNGYQIPKGWNVIYSICDTHDVAELFTNKDKFNPDRFMSPSPEDSSRFSFIPFGGGVRSCVGKEFAKILLKIFTVELARNCDWQLLNGPPTMKTGPIVYPVDNLPAKFVGFSGQI
 >ARO89874.1_CYP26A1_Andrias davidianus_Chinese Giant Salamander
MSLYTLFASALCTLVLPLLLFLAAVKLWELYCISTRDRSCRCPLPPGTMGLPFFGETLQMVLQRRKFLQMKRRKYGCIYKTHLFGRPTVRVMGAENVKQILLGEHRLVSVHWPASVRTILGSGCLSNLHDSQHKNRKKVIMQAFSREALQHYIPVIEEEVRGALAQWLGGGGASVLVYPEVKRLMFRIAMRILLGFEPHQTDREMEQQLVEAFEEMIRNLFSLPIDVPFSGLYRGLKARNVIHAKIEENIRAKMAKESDTQYKDALQLLIEHTQKNGEQLNMQELKESATELLFGGHETTASAATSLMTFLALHSDVLHKVRKELQIKDLLCDNKPLNIEALEQLKYTGCVIKETLRLSPPVPGGFRVALKTFELNGYQIPKGWNVIYSICDTHDVAEIFPNKEEFNPDRFMSSHPEDNSRFNFIPFGGGLRSCVGKEFAKILLKIFTVELARTCDWQLLNGAPTMKTGPIVYPVDNLPTKFIGFNGII
 >XP_012310130.1_CYP26A1_Aotus nancymaae_Nancy Ma's Night Monkey
MGLPALLASALCTFVLPLLLFLAAIKLWDLYCVSGRDRSCALPLPPGTMGFPFFGETLQMVLQRRKFLQMKRRKYGFIYKTHLFGRPTVRVMGADNVRRILLGEHRLVSVHWPASVRTILGSGCLSNLHDSSHKQRKKVIMRAFSREALKCYVPVIIEEVGSSLEQWLSCGERGLLVYPEVKRLMFRIAMRILLGCEPQLAGDRDAEQQLVEAFEEMTRNLFSLPIDVPFSGLYRGVKARNLIHARIEQNIRAKICGLRASEASRGCKDALQLLIEHSWERGERLDMQALKQSSTELLFGGHETTASAATSLITYLGLYPHVLQKVREELKSKGLLCKSNQDNKLDMEILEQLKYIGCVIKETLRLNPPVPGGFRVALKTFELNGYQIPKGWNVIYSICDTHDVAEIFTNKEEFNPDRFMLPHPEDASRFSFIPFGGGLRSCVGKEFAKILLKIFTVELARHCDWQLLNGPPTMKTSPTVYPVDNLPARFTHFHGEI

Any help would be greatly appreciated!
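
A minimal sketch of one likely cause, assuming the extra space comes from writing the array inside double quotes (print FH"@fasta"): Perl joins an interpolated array with the list separator $", which defaults to a single space. The two records below are made-up placeholders, not the real data.

```perl
use strict;
use warnings;

# Two FASTA-style records, each already ending in a newline.
# (Hypothetical placeholder data, for illustration only.)
my @fasta = (">id1_geneA\nMGFPFF\n", ">id2_geneA\nMGLPAL\n");

# Interpolating an array in double quotes joins its elements with $",
# the list separator, which defaults to a single space.
my $interpolated = "@fasta";

# Joining on the empty string adds no separator between elements.
my $joined = join '', @fasta;

print $interpolated;   # every record after the first starts with " >"
print $joined;         # no leading space before any record
```

If this is indeed the cause, writing the array with print FH @fasta; (no quotes) or print FH join '', @fasta; should avoid the separator, provided $, has not been set elsewhere. Separately, $ofile is never chomped in the script, so the output file name probably ends in a newline.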

In reply to Space inserted into output file by He77e
