Hi guys,

Thanks for the input - both pieces of code (from both monks) worked! Now I know to ditch my books and come here for enlightenment. But it's not just the number of words I'm after: I need to have the file content in memory so I can parse it and extract all proper names (i.e. 2 or more consecutive capitalized words) from it.

I guess I'll have to go the sysread way.

    my $buffer;
    while (sysread TEXT, $buffer, $buffer_size) {
        ## This will use tr to count fast for you:
        $words += ($buffer =~ tr/ +\n+,://);
    }

I need to store $buffer in an array and then process it word by word. Is there any efficient way of doing that?
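Here is a rough sketch of what I have in mind for the sysread route, in case it helps anyone see the problem. The sub name each_word_chunked and the callback argument are made up by me, and the separator class is a simplified guess at what I need. The main worry is a word being cut in half at a buffer boundary, so the last piece of each chunk is held back and glued onto the next one:

```perl
use strict;
use warnings;

# Hypothetical helper (my own name): read $fh in $size-byte chunks with
# sysread and call $cb once for every complete word.
sub each_word_chunked {
    my ($fh, $size, $cb) = @_;
    my $tail = '';    # possibly incomplete word left over from the previous chunk
    while (sysread $fh, my $buffer, $size) {
        $buffer = $tail . $buffer;
        # Limit -1 keeps trailing empty fields, so the last piece is ''
        # whenever the chunk happens to end on a separator.
        my @pieces = split /[ \n,:]+/, $buffer, -1;
        $tail = pop(@pieces) // '';
        $cb->($_) for grep { length } @pieces;
    }
    $cb->($tail) if length $tail;    # final word at end of file
}
```

With this, only one chunk plus one partial word is in memory at a time, and the callback can decide whether to push onto an array or process each word on the spot, e.g. each_word_chunked(\*TEXT, 65536, sub { push @words, shift }).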

My entire code (if I dare show :)) looked something like this before:

    $fname = "haystack.test";
    open(TEXT, "<$fname") || die "could not open file: $fname\n";
    # Slurp the whole file into one string
    while (<TEXT>) {
        $txt .= $_;
    }
    @words = split(/[ +\n+\,\:]/, $txt);
    $len = @words;
    print "LEN = $len\n";
    close(TEXT);
    $i = 0;
    while ($i < $len) {
        my $flag2 = 1;
        my $sptr = my $eptr = $words[$i];
        # A capitalized word might start a proper name
        if ($sptr =~ /^[A-Z][a-z]+/) {
            $eptr = $words[$i+1];
            if ($eptr =~ /^[A-Z][a-z]*/ && $i < $len) {
                $i++;
                $sptr = $words[$i];
                $eptr = $words[$i+1];
                $flag2 = 0;
                # Keep walking while the next word is capitalized too
                while ($eptr =~ /^[A-Z][a-z]*/ && $i < $len) {
                    $i++;
                    #print "I =$i\n";
                    $sptr = $words[$i];
                    $eptr = $words[$i+1];
                }
                if ($flag2 ne 1) { print "\n"; }
            }
            else { $i++; }
        }
        else { $i++; }
    }
    print "\n";

So do you think I'll be alright loading all the words into an array? Or is there a better way?
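One alternative I have been toying with, offered only as a sketch: skip the word array altogether and let a regex pull out runs of capitalized words from each line as it is read. The sub name proper_names is my own invention, and this version assumes every word in a name has at least one lowercase letter after the capital and that a name never wraps across a line break:

```perl
use strict;
use warnings;

# Hypothetical helper (my own name): return every run of two or more
# consecutive capitalized words found in $text.
sub proper_names {
    my ($text) = @_;
    # In list context, /g with one capture group returns all captures.
    return $text =~ /\b([A-Z][a-z]+(?:[ \t]+[A-Z][a-z]+)+)\b/g;
}

# Small demonstration on a literal string instead of the real file.
print "$_\n" for proper_names("Then we met John Smith in New York.");
```

In the real script the same call would sit inside the while (<TEXT>) loop, so only one line is held in memory at a time. Like my original loop, it will also pick up a capitalized sentence-opening word when it happens to sit directly in front of a name.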

Thanks
J


In reply to Re: Re: Out of memory by Anonymous Monk
in thread Out of memory by Anonymous Monk
