OK, so for this research project I have a file with data arranged like so:

# STOCKHOLM 1.0

#=GF ID 1-cysPrx_C

#=GF AC PF10417.4

#=GF DE C-terminal domain of 1-Cys peroxiredoxin

...

#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1

#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1

...

//

...

It's basically proteins and functional groups. The functional groups are the accessions on the #=GF AC PFxxxxx lines, and the proteins are the accessions on the #=GS lines (e.g. D8BPP0).

What I want is a list saying which groups each protein is in, something like: D8BPP0 is in groups: PFxxxxx, etc.

I thought I would put the list of proteins (they're in a big file) into an array and loop over it, putting each protein into a scalar. Then I'd read the second file, the one with all the data shown above, with $/="\/\/"; so that each read returns one record, and split each record on #. Then I'd use grep to pick out the functional group, and check whether the protein was in that record. If it was, I'd push the functional group onto an array, and at the end of the loop I'd print the array out and go on to the next protein.

Example with a simplified list of proteins:

    $/="\/\/";
    our @acnumbers=qw(P0A252 Q9AT80 Q0HKB6);
    our $acnumbers;
    foreach $acnumbers(@acnumbers){
        my $unit;
        foreach $unit(<PFAMDB>){
            my @units= split /#/,$unit;
            my @pfx=grep(/=GF AC/,@units);
            our $units;
            foreach $units(@units){
                if ($units=~/.*AC $acnumbers/){
                    push (@list, @pfx);
                }else{next}
            }
        }
        print "$acnumbers is in:";
        print @list;
        undef @list;
    }

But all I get is:

P0A252 is in:=GF AC PF10417.4

Q9AT80 is in:Q0HKB6 is in:

How should I improve it?

Sorry for the messiness, but I really just learned Perl.
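
One thing that may explain the empty results: foreach $unit(<PFAMDB>) reads the whole file during the first pass of the outer loop, so the filehandle is already at end-of-file for the second and third proteins. A possible alternative, sketched here under assumptions rather than offered as the definitive fix, is to read the file once and build a hash from protein accession to family accessions, then look each protein up. The names build_index, %families_for and $sample are hypothetical, and the sketch assumes every record has its #=GF AC line before its #=GS lines and ends with a line starting with //.

```perl
use strict;
use warnings;

# Read a Stockholm-format stream once and return a hash reference mapping
# each protein accession (version suffix dropped) to the set of family
# accessions whose records mention it.
# Assumption: the "#=GF AC" line precedes the "#=GS" lines in each record.
sub build_index {
    my ($fh) = @_;
    my %families_for;   # protein accession => { family accession => 1 }
    my $family;         # family accession of the record being read
    while (my $line = <$fh>) {
        if ($line =~ /^#=GF\s+AC\s+(\S+)/) {
            $family = $1;                       # e.g. PF10417.4
        }
        elsif ($line =~ /^#=GS\s+\S+\s+AC\s+([^.\s]+)/) {
            # "#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1" captures D8BPP0
            $families_for{$1}{$family} = 1 if defined $family;
        }
        elsif ($line =~ m{^//}) {
            undef $family;                      # end of this record
        }
    }
    return \%families_for;
}

# Demo on the sample record from the post, read from an in-memory string
# instead of the real PFAMDB file.
my $sample = <<'END';
# STOCKHOLM 1.0
#=GF ID 1-cysPrx_C
#=GF AC PF10417.4
#=GF DE C-terminal domain of 1-Cys peroxiredoxin
#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1
#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1
//
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $index = build_index($fh);
close $fh;

for my $protein (qw(D8BPP0 Q9AT80)) {
    my @families = sort keys %{ $index->{$protein} || {} };
    print "$protein is in: ", (@families ? "@families" : "(none found)"), "\n";
}
```

With the sample above this prints "D8BPP0 is in: PF10417.4" and "Q9AT80 is in: (none found)". Reading line by line also avoids changing $/, and the hash-of-hashes keeps a family from being listed twice when a protein has several regions in the same record.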


In reply to Searching file and printing by jemswira
