Dear Monks,

I have two files that I need to parse based on certain features, but the files are very large (as much as 3 GB), so I am unable to load them into arrays or other in-memory variables.

The formats of the two files, with examples, are:

1) Format of the first file:
>Harvard 32384743 234394583 John1 15.T
>MIT 13249304 545924582 Smith32 7.A
>Cambridge 76323823 983438434 Gold1234 17.G
2) Format of the second file:
>John1 40 34 40 40 25 40 40 40 40 17 40 40 40 20 40 40 40 20 40 40 40 30 40 40 19 40 40 40 37 40 11 40 40 35 25 40
>Smith32 40 40 44 13 40 40 40 50 40 40 40 40 50 40 40 40 16 40 6 40 40 45 40 40 40 2 40 40 40 40 29 40 40 40 6 40
>Gold1234 40 40 15 40 39 40 40 40 40 66 40 40 35 40 40 40 10 40 40 40 40 27 40 40 40 12 40 40 33 40 40 40 40 4 40 40
--------------------------- END -------------------------
The values 15.T, 7.A and 17.G are locations in the second file; e.g., 15.T means the 15th position in the John1 record of file 2. I have to apply the rule that the score at a given location must be >= 20. If it is, I have to write the record's name to the output file.

For example, 15.T refers to the 15th location in the John1 record in file 2. Since the score at the 15th position is 40, which is greater than 20, my result should look like this:

3) Output File
John1 15.T 40

My Perl knowledge is basic, so I would be obliged for help from the Monks. Please remember that I cannot store the data in arrays or variables, since I have to parse a 3 GB file.

Thanks

In reply to Parsing of 3 GB File by ashnator
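
One possible approach, offered as a minimal sketch rather than a definitive solution: if both files list their records in the same order (as the examples do: John1, Smith32, Gold1234), they can be read in step, one '>'-delimited record at a time, so memory use stays at one record per file no matter how large the files are. The same-order requirement is an assumption, not something the post confirms. The sub names next_record and filter_scores are hypothetical, and the demo data is the example from the post with each score record cut to its first 18 values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the whitespace-separated fields of the next '>'-delimited record,
# or an empty list at end of file. (Hypothetical helper name.)
sub next_record {
    my ($fh) = @_;
    local $/ = '>';                  # each read stops at the next '>'
    while (defined(my $chunk = <$fh>)) {
        chomp $chunk;                # drop the trailing '>' delimiter
        my @fields = split ' ', $chunk;
        return @fields if @fields;   # skip the empty chunk before the first '>'
    }
    return;
}

# Walk both files in step, keeping only one record from each in memory.
# ASSUMPTION: record N of file 1 and record N of file 2 carry the same name.
sub filter_scores {
    my ($loc_fh, $score_fh, $out_fh, $min_score) = @_;
    while (my @loc = next_record($loc_fh)) {
        my ($name, $location)     = @loc[3, 4];    # e.g. "John1", "15.T"
        my ($score_name, @scores) = next_record($score_fh);
        die "Files are out of step near record '$loc[0]'\n"
            unless defined $name && defined $score_name
                && $name eq $score_name;
        next unless defined $location && $location =~ /^(\d+)\./;
        my $pos = $1;                              # 1-based position
        next if $pos < 1 || $pos > @scores;
        my $score = $scores[$pos - 1];
        next unless $score =~ /^\d+$/ && $score >= $min_score;
        print {$out_fh} "$name $location $score\n";
    }
    return;
}

# Demo on in-memory copies of the sample data (scores truncated to 18 values).
my $file1 = <<'END_FILE1';
>Harvard 32384743 234394583 John1 15.T
>MIT 13249304 545924582 Smith32 7.A
>Cambridge 76323823 983438434 Gold1234 17.G
END_FILE1

my $file2 = <<'END_FILE2';
>John1 40 34 40 40 25 40 40 40 40 17 40 40 40 20 40 40 40 20
>Smith32 40 40 44 13 40 40 40 50 40 40 40 40 50 40 40 40 16 40
>Gold1234 40 40 15 40 39 40 40 40 40 66 40 40 35 40 40 40 10 40
END_FILE2

open my $loc_fh,   '<', \$file1 or die "Cannot open sample 1: $!";
open my $score_fh, '<', \$file2 or die "Cannot open sample 2: $!";
filter_scores($loc_fh, $score_fh, \*STDOUT, 20);
```

On the sample data this prints "John1 15.T 40" and "Smith32 7.A 40"; Gold1234 is skipped because its 17th score is 10. For the real files, the two input handles would be opened on the file paths and the third argument on the output file. If the records are not in the same order, this lockstep read does not apply, and some on-disk index or a prior sort of one file would be needed instead.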
