in reply to search and print in perl

...working on a homework... - $honesty++.

Some sample data, covering both matching and non-matching cases, would help. IMO, your description doesn't quite cut it for me: I'm barely a proficient programmer and most certainly not a geneticist. For example, in this case, is a gene represented by a single char, a sequence of chars, or both?

That being said...

1. This:

    .
    .
    $text = "";
    while($line = <IN>) {
        $text .= $line;
    }
    .
    .
is more usually (and in most cases, better) written as...
    local $/; # Ensure line-ends are ignored
    $text = <IN>;
    .
    .
2. AFAICT (i.e. subject to further details being provided), your RE appears to capture only the start & end delimiters.
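
For point 1, here is a minimal sketch of the slurp idiom with the change to $/ confined to a block, so later reads elsewhere in the script still see normal line ends. The in-memory filehandle and the names $sample and $in are only stand-ins for your real input file, which I haven't seen:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real input file: an in-memory filehandle.
my $sample = "TATAATATTACAATGG\nATCATACAGTTAG\n";
open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";

# Confine the change to $/ to this block only.
my $text = do {
    local $/;    # undefined record separator: one read returns everything
    <$in>;
};
close $in;

print length($text), " characters read\n";
```

With a real file you would open it by name instead, e.g. open my $in, '<', $filename, and the rest stays the same.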

A user level that continues to overstate my experience :-))

Re^2: search and print in perl
by hellworld (Novice) on Jun 01, 2009 at 12:06 UTC
    Thanks. A gene looks like this: it is preceded by the string TATAAT, which is followed by one or more of the letters A, C, G, T. Then the string ATG follows, then again an arbitrary number of A, C, G, T's, and the gene ends with one of the strings TAA, TGA or TAG. For example, in TATAATATTACAATGGATCATACAGTTAG the gene is the part between ATG and TAG, but we also have to make sure it is preceded by a TATAAT. I have to print out the genes in the txt file according to these rules.
      Assuming one definition per line, i.e. not split over multiple lines, then...
          use warnings;
          use strict;

          local $/;
          my $data = <DATA>;

          while ($data =~ /TATAAT[ACGT]+ATG([ACGT]+)(?:T(?:GA|AA|AG))/cgs) {
              warn $1;
          }

          __DATA__
          TATAATATTACAATGGATCATACAGTTAG
          TATAATATTACAATGGATCATACAGTTAG
          TATAATATT
          ACAATGGATCATACAGTTAG
      produces:
          $ perl tst.pl
          GATCATACAGT at tst.pl line 8, <DATA> chunk 1.
          GATCATACAGT at tst.pl line 8, <DATA> chunk 1.
          $
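
      If definitions can be split over several lines, one possible variation (a sketch only, not tested against your real file) is to strip all whitespace first and switch to non-greedy quantifiers, so that each match stops at the first ATG and the first stop codon. This assumes whitespace carries no meaning in your data and that joining neighbouring definitions cannot create a false TATAAT...ATG run. I've also assumed the third sample definition is broken across two lines; @genes and $joined are just illustrative names:

```perl
use strict;
use warnings;

# Sample data from the thread; the third definition is assumed to be
# split over two lines.
my $data = <<'END';
TATAATATTACAATGGATCATACAGTTAG
TATAATATTACAATGGATCATACAGTTAG
TATAATATT
ACAATGGATCATACAGTTAG
END

# Assumption: whitespace is insignificant, so remove it before matching.
(my $joined = $data) =~ s/\s+//g;

# Non-greedy quantifiers stop at the first ATG and the first stop codon,
# so consecutive genes in the joined string are reported separately.
my @genes;
while ($joined =~ /TATAAT[ACGT]+?ATG([ACGT]+?)(?:TAA|TGA|TAG)/g) {
    push @genes, $1;
}

print "$_\n" for @genes;
```

      Collecting into an array rather than calling warn makes it easier to count or post-process the genes afterwards.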
      A user level that continues to overstate my experience :-))