Hello!
I am trying to process records in a file based on a regex.
I have set my record separator to '>'. The new file should have
the same number of records as the input data. The only difference
is that the records identified by the regex will be changed by
substitution and transliteration.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

=cut
Open fasta with multiple fasta sequences. Select those sequences based on an identifier and then reverse complement them.
=cut

# Set record separator
$/ = '>';
#open(my $in, "C:\Documents and Settings\mydir\Desktop\rev_comp\13414_fasta");
open(my $out, ">C:/Documents and Settings/mydir/Desktop/rev_comp/13414_fasta_rev_comp");

while(<DATA>){
    if(my $line =~m/LacZ|SD/){
        next $line;
        my $revcom = reverse $line;
        # Next substitute all bases by their complements,
        # A->T, T->A, G->C, C->G
        $revcom =~ s/A/T/g;
        $revcom =~ s/T/A/g;
        $revcom =~ s/G/C/g;
        $revcom =~ s/C/G/g;
        # Make a new copy of the DNA
        $revcom = reverse $line;
        # The Perl translate/transliterate command:
        $revcom =~ tr/ACGTacgt/TGCAtgca/;
        #print Dumper($/, $line);
        #}
    }
}
#close $in;
close $out;
__DATA__
>AM_13414L3_LacZ.SEE.rc_G01_2009-05-01.ab1 1368 0 1368 ABI
TTTTTCCCCCAACAAAGGGGAGGGTGGGCGGCTAGTCTGTTCAGCTGTGT CACACCGGGATTCTCCCAATCTCTCCTCTGCAGGACCACTGGATCATTTA AATCGGTACCCATCTTCTTAGTGGGCAGACCCAGCTGGCCTTCAGACTGC TTGCTGTTCCTGGCCCGGTCTTGCTATTTATACATGTAAGAGGATCAGGA AGTCCCTGGGGTACAGCTCATAATGCCCTCCTTTGACTACATAACACCCA ACATGCTAGTTCTAAGAGAGGGAACAGTGTGCAGTGGGAAGTGGAGGGCA AAGGTGACTTGGGGCTTTCCAAAGTTCAAATTGATTCAGAGAGAGTAAAT ATTTCCAGAAGGATTTCTCCTTTTATAAAATTCATTCACTCCTTTAGCTC TGACCACAGGGTGGGAGTGAGGGATCCTTCTAGACCCCTGATGAGAGGTT AGCTTGGAGGACGCTGGCTTATGCTCATTGACAGCTGACCGACAGATATA GATTATAAAAGTAAACTTATATGTCTTGCCAGAGATATATAAAATTGTTG TCAACTCCTTCTTTAATTATTTTTCTTTAATTTTTAAAGATTTATTTTAT ATCCATGTTTTGCCTGCATGTGTGTATGTCTACCACATACATGCAGTGCT GTGCAGGTCAGAAGAGGGTGTTAAATTCCCTGGTACTAGAGTTACAGATG GTTGTGAGCCATCATGTGGATGCTGAGAACTGAAGCCAGCAAGTGTCCTT AACCGCTGAGCCAACTCTCCAGCCCCTTTAGATATTTTTAATATACTTTA ACATCAGAGGAAAAAAAAATCTTTAGAACGTCTGTCAGAAGAAACATCTA AGGCTGGTTGGGGTGGTGTTCACCACTTGGTGTCAGCACTTGGGAGCCAG AGGCAGGTGTGTGTGTGTGTTTGAGGCCAGTCTGGTCTACACACTCAGTT ATCCAATCTCCGTGAGTTTGTGAATGTTTGCTGTTCATTTGGGGTTTTAG TCTGATGTGGTCAAATAGAATAGGAAGAGAGGGCTAAAGACCCACCTTAC
TGGTTTAAAGCACTTGTTGCTTTTTTAAAAAACCAAGTTTAATTCTTTCG GAGTTTCATTAGCCCTTTTTCTATTAGGGAGGGACCCCTTTTTTCTTGAT TTATAAAGGACCCCTTTGCTTGGCAATTCTGTTTTTGGGCTGGAGGGTCC AGGTTTTCCAAACTTTGGGAAATGCCTTTCCACCCTTTCTGTTCCCCTGA TGGACAATTTCCTGCCCCATGAATTTAATGGGTTTCTCTTTTATGGCTTT TTAAACATTTTTTTTTTGTTTTTTAAAAACTTTTTTCCTTTTAAACTTTT TATTTTATAATTTGAAAA >AM_13414L3_SD_F01_2009-05-01.ab1 1397 0 1397 ABI AATTTAAAGCATACTGTAAATACTACTAACTAAAGGGCAAAATAGGGCAT CAGTTTTCTTTGGAATTGGAATTATAGATAGTTTGAGCTGCCATCTAAGT GGGAATTGAACCCAGGTCCTCTGGAAGAGCAGCAGGTGCTCTTAACCACC AAGCCATCTCTCCAGACCTTGCCCATTTATCTCAATCAAATATTATGTGT AGTCATTGAGGTCAGCTTCAGACCTTCCAGGCATCTGAGTTTTCAGATGA CTGGGGTTGGCACAGACAAGTTTCCCCTCTGTGACAAAGCCAGATATGCC ACTTTAAAGTGGAACAGAAAAAAAAATGTTTATATACCTATAAAAATAAA CACTTAGAGCCACTTAGGTGGTCACTGGGGAAGACCAAAGAAAGTAGCTG GCAGTTCACACCCTTCTCTGCTAGCATAACTTCGTATAGCATACATTATA CGAAGTTATCTAGGGGCTGCAGGTCGAGGTCTGATGGAATTAGAACTTGG CAAAACAATACTGAGAATGAAGTGTATGTGGAACAGAGGCTGCTGATCTC GTTCTTCAGGCTATGAAACTGACACATTTGGAAACCACAGTACTTAGAAC CACAAAGTGGGAATCAAGAGAAAAACAATGATCCCACGAGAGATCTATAG ATCTATAGATCATGAGTGGGAGGAATGAGCTGGCCCTTAATTTGGTTTTG CTTGTTTAAATTATGATATCCAACTATGAAACATTATCATAAAGCAATAG TAAAGAGCCTTCAGTAAAGAGCAGGCATTTATCTAATCCCACCCCACCCC CACCCCCGTAGCTCCAATCCTTCCATTCAAAATGTAGGTACTCTGTTCTC ACCCTTCTTAACAAAGTATGACAGGAAAAACTTCCATTTTAGTGGACATC TTTATTGTTTAATAGATCATCAATTTCTGCAGACTTACAGCGGATCCCCT CAGAAGAACTCGTCAAAGAAGCGATAGAAGGCGATGCGCTGCGAATCGGG AGCGGCGATACCCGTAAGCACGAGGAAACGGTCAGCCCATTCGCCGCCAA GCTCTTCAGCAATATCACGGGTAGCCAACGCTATGTTCTGATAGCGGTCC CCCACACCCAGCCGGCCACAGTCGATGAATCCAGAAAAACGGGCCTTTTT CACCCTGAATATCGGCAAGCAGGCATTCGCCTGGGGTAACGACGAGTTCC TTCGCCGTCGGGCATGCCCGCCCTTGAGCCCGGGCGAACAGTTTCGGCTG GCCCCGAGCCCCCTGATGCTTCTTTCTTCCAAATTCATCCTGGTTCAAAC AGAACCCGGCTTTCCCATCCCCAATAACCTGGCCTTCCTTTCGGATGCGG AATGTTTTTCCCTTTGGGGGGGTCAAAAAGGGGGCACGGGGAGCCCN >AM_13414L3_SU_E01_2009-05-01.ab1 1447 0 1447 ABI CTCCAGCCTACCCTCTATCCAGGGGNTCTAGAGGATCCCTCACTCCCACC CTGTGGTCAGAGCTAAAGGAGTGAATGAATTTTATAAAAGGAGAAATCCT 
TCTGGAAATATTTACTCTCTCTGAATCAATTTGAACTTTGGAAAGCCCCA AGTCACCTTTGCCCTCCACTTCCCACTGCACACTGTTCCCTCTCTTAGAA CTAGCATGTTGGGTGTTATGTAGTCAAAGGAGGGCATTATGAGCTGTACC CCAGGGACTTCCTGATCCTCTTACATGTATAAATAGCAAGACCGGGCCAG GAACAGCAAGCAGTCTGAAGGCCAGCTGGGTCTGCCCACTAAGAAGATGG GTACCGATTTAAATGATCCAGTGGTCCTGCAGAGGAGAGATTGGGAGAAT CCCGGTGTGACACAGCTGAACAGACTAGCCGCCCACCCTCCCTTTGCTTC TTGGAGAAACAGTGAGGAAGCTAGGACAGACAGACCAAGCCAGCAACTCA GATCTTTGAACGGGGAGTGGAGATTTGCCTGGTTTCCGGCACCAGAAGCG GTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGT CGTCCCCTCAAACTGGCAGATGCACGGTTACGATGCGCCCATCTACACCA ACGTGACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGAGAAT CCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACA GGAAGGCCAGACGCGAATTATTTTTGATGGCGTTAACTCGGCGTTTCATC TGTGGTGCAACGGGCGCTGGGTCGGTTACGGCCAGGACAGTCGTTTGCCG TCTGAATTTGACCTGAGCGCATTTTTACGCGCCCGGAGAAAACCGCCCTG CGGTGATGGTGCTGCGCTGGAGTGACGGGCGTTATCTGGAAGATCAGGAT ATGTGGCGGATGAGCGGCATTTTTCCGTGACGTCTTGTTGCTGCATAAAC CGACTACCCAAATCAAACGATTTCCATGTTGCCACTCGCTTTAAATGATG ATTTTCACCCCGCCTGTACTGGAGGCTGAAATTTCAAAATGGCGGGGAGT TGCGGGACTACCCTCCGGGTAAACAGTTTCTTTTATGGCAGGGGTGAAAA CCCAAGGCCGCCCACCGGCCCCGCGGCCCTTTTCGGCCGGGGAAAATTAT CCGATGAAGCGGGGTGGTTTATTGCCCAATCCGCGTCCAACCTACCTTCT GAAAAGGCCCAAAAACCCCGAAAACTGGTGGAGCCCCCCAAAAATTCCCC AAAATTTTTTTTTCTTTGGGGGGGGGGGTTGAAACCTGCACCCCCCCCCC CCCAACGGGCACCCCTTTTTATTTTGAAAAAACCAAAAAACCCCTGCCCG ACTGCTCCCCGGGTTTTTTCCCCCGCGGGAGGAGGGGGCCGGAGAAA >AM_13414L3_pgK.Neo.2fw_H01_2009-05-01.ab1 1387 0 1387 ABI AAGTTCTAATTCATCGNANCTCGCCTGCAGCCCCTAGATAACTTCGTATA ATGTATGCTATACGAAGTTATGCTAGCAGAGAAGGGTGTGAACTGCCAGC TACTTTCTTTGGTCTTCCCCAGTGACCACCTAAGTGGCTCTAAGTGTTTA TTTTTATAGGTATATAAACATTTTTTTTTCTGTTCCACTTTAAAGTGGCA TATCTGGCTTTGTCACAGAGGGGAAACTTGTCTGTGCCAACCCCAGTCAT CTGAAAACTCAGATGCCTGGAAGGTCTGAAGCTGACCTCAATGACTACAC ATAATATTTGATTGAGATAAATGGGCAAGGTCTGGAGAGATGGCTTGGTG GTTAAGAGCACCTGCTGCTCTTCCAGAGGACCTGGGTTCAATTCCCACTT AGATGGCAGCTCAAACTATCTATAATTCCAATTCCAAAGAAAACTGATGC CCTATTTTGCCCTTTAGTTAGTAGTATTTACAGTATTCTTTATAAATTCA CCTTGACATGACCATCTTGAGCTACAGCCATCCTAACTGCCTCAGAATCA 
CTCAAGTTCTTCCACTCGGTTTCCCAGCGGATTATAAGTGGATAAACTGT GAGAGTGGTCTGTGGGACTTTGGAATGTGTCTGGTTCTGATAGTCACTTA TGGCAACCCGGGTACATTCAACTAGGATGAAATAAATTCTGCCTTAGCCC AGTAGTATGTCTGTGTTTGTAAGGACCCAGCTGATTTTCCCACCACCCCT CCATCAGTAAGCCACTAATAAAGTGCATCTATGCAGCCACAGGTCTGTCT GCCTCTTTTGCTTCAGTTTCCTAGGACTATGGGCTGAAATTGGGCTGTTA GGGAGAAAGCATCTCACTCGCTTTTATTGAATCTGCAGTGGAAAAGAAAC AGAGGGAGTCAGGTAACTTTGAATATTTTCTTCAAAACAAAAGATATCAT GGTACAATTTTTTTTAAATTTTTTGTTTGTTTGTTTTTGTTTTTCGAGAC AGGGTTTCTCTGTGTAGCCCTGGCTGTCCTGGAACTCACTCTGTAGACCA AGTTGGCCTCCAACTCAGAAATCCGCCTGCCTCTGCCTCCTGAGTGCTGG GATTAAAGGCGTGCGCCCCCACCCCCCCGCCCCATGGTCAATTTTTAAAT TTTCCCAAAAATTATTTTTTCCCAAGGTAGACTTCTTTTTAAAGGTGGTT TTTTTACCCCCTTTTGAAAAGAAAACATTAAAGGGGATTCTTCCAAAATT TTGTGAAAGTTTTCCCCGTTTCGAATAAAAAACCCCCCTTTTCCTTTTCC GGGGATCTCCACCCTGGGTGACACTTGGTTTTTTTTACCCCCCCCCCCCT GGCCGGTTTTTTTTTTACCTGGGGGGCCTTGGGTTTA
Any direction would be of great help!
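
A minimal sketch of the processing described above, offered as one possible reading rather than a definitive fix. It assumes the regex should be tested against each record's header line, that matching records are to be reverse-complemented, and that every record is written out so the record count is unchanged. The sub names revcomp and process_fasta, the "\n>" record separator, the 50-base line width and the two-record sample are illustrative assumptions, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse-complement one DNA string. tr/// swaps all bases in a single pass,
# so an A turned into T is not turned back into A by a later substitution.
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Take FASTA text and return it with every record whose header matches
# $pattern reverse-complemented. Non-matching records are kept as they are,
# so the number of records does not change.
sub process_fasta {
    my ($text, $pattern) = @_;
    my $out = '';
    open(my $in, '<', \$text) or die "Cannot read in-memory FASTA: $!";
    local $/ = "\n>";                  # assumed: a record ends before the next header line
    while (my $record = <$in>) {
        chomp $record;                 # strip the trailing "\n>" separator
        $record =~ s/^>//;             # only the first record still carries its '>'
        my ($header, @lines) = split /\n/, $record;
        next unless defined $header;
        my $seq = join '', @lines;
        $seq =~ s/\s+//g;              # drop any stray whitespace inside the sequence
        $seq = revcomp($seq) if $header =~ $pattern;
        $seq =~ s/(.{1,50})/$1\n/g;    # re-wrap at 50 bases per line (assumed width)
        $out .= ">$header\n$seq";
    }
    close $in;
    return $out;
}

# Hypothetical two-record sample: only the first header matches.
my $fasta = ">seq1_LacZ\nAAAC\nGG\n>seq2_SU\nAACG\n";
print process_fasta($fasta, qr/LacZ|SD/);
# prints:
# >seq1_LacZ
# CCGTTT
# >seq2_SU
# AACG
```

For comparison, in the posted loop `my $line =~ m/LacZ|SD/` matches against a freshly declared, empty $line rather than the record held in $_, and nothing is ever printed to $out. The sketch reads each record into a named variable and returns the assembled text, which the caller can then print to an output file handle.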

In reply to processeing records in a file by lomSpace
