in reply to An interesting Perl problem to extract file content

Tasks like this usually involve simple programs and a lot of regular expressions. (So if you have not yet studied Perl regular expressions, do so now.)

The program will read the file line by line, chomp each line, and (presumably) concatenate strings until a meaningful group of lines has been read. (This can be easy or hard. If the file always contains, say, a >first line that is always followed by five lines of DNA, it’s easy. But you should always validate the input data in case your program, the data file, or both have a bug.)
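A minimal sketch of that read, chomp, and concatenate loop, which does not assume a fixed number of lines per record. I am assuming here that every record starts with a header line beginning with ">" and that everything up to the next header belongs to it. The name read_records and the sample data are made up for illustration; they are not taken from your file.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from a filehandle and return a list of
# [header, sequence] pairs. A record starts at a line beginning with ">";
# all following lines are concatenated until the next header or end of file.
sub read_records {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r\z//;           # tolerate Windows line endings
        next if $line =~ /^\s*$/;    # skip blank lines
        if ($line =~ /^>(\S+)/) {
            push @records, [$1, ''];         # start a new record
        }
        elsif (@records) {
            $records[-1][1] .= $line;        # append to the current record
        }
        else {
            # Validate the input: data must not appear before a header.
            die "Line $.: sequence data before any '>' header\n";
        }
    }
    return @records;
}

# Made-up sample data, read through an in-memory filehandle.
my $data = ">first\nACGT\nTTGA\n>firsta\nAC-T\nT-GA\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

printf "%s: %s\n", @$_ for @records;
```

Because a record ends wherever the next ">" line begins, it makes no difference whether a header is followed by five lines or fifty.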

As you will learn in your studies of Perl regexes, they are a power tool built for the task of ripping strings apart. So, if you have even the slightest uncertainty about what I am talking about right now, start there, and let the molecules fend for themselves for a few days more.

Re^2: An interesting Perl problem to extract file content
by Anonymous Monk on Dec 08, 2010 at 22:09 UTC

    Hi, thanks for your reply. The problem is that there can be more or fewer lines after both >first and >firsta. There is no fixed number of lines or strings.

    >firsta is going to have fewer molecules than >first, as there is an unmatched portion in >firsta.

    But >first and >firsta have the same number of lines.

    For example, if the number of lines for >first is 8, then it is the same for >firsta.

    It seems I am totally lost in Perl regexes. I have been studying them for the last few days and I am frustrated now :(

    Any suggestions?