in reply to Re: Re: Out of memory
in thread Out of memory

I'm a bit confused. Several people have suggested you read in the file line-by-line. And now you come with sysread. It will work with sysread, but not the way you do it - because if a sysread ends halfway through a word (so the other half will be read in the next iteration), you'll count a word twice. You would need to keep track of what was at the end of the previous read, and compare that with what's at the beginning of the next read.
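For what it's worth, here is a minimal sketch of that bookkeeping. It assumes words are separated by whitespace and that a plain word count is what you're after; the name count_words_chunked and the chunk size are made up for the example. Instead of comparing the two ends, it simply carries the unfinished word over to the next read.

```perl
use strict;
use warnings;

# Count whitespace-separated words using fixed-size sysread chunks.
# A possibly incomplete word at the end of a chunk is carried over
# and glued to the front of the next chunk.
sub count_words_chunked {
    my ($fh, $chunk_size) = @_;
    my %count;
    my $tail = '';                       # unfinished word from the previous read
    while (1) {
        my $n = sysread($fh, my $buffer, $chunk_size);
        die "sysread failed: $!" unless defined $n;
        last if $n == 0;                 # end of file
        $buffer = $tail . $buffer;
        # If the chunk ends in non-space characters, the last word may be
        # cut in half: remove it from the buffer and keep it for later.
        $tail = $buffer =~ s/(\S+)\z// ? $1 : '';
        $count{$_}++ for split ' ', $buffer;
    }
    $count{$tail}++ if length $tail;     # final word of the file
    return \%count;
}
```

Called as count_words_chunked($fh, 65536) on a handle opened on a real file, it returns a hash reference of word => count, and the result should not depend on the chunk size.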

So, why can't you process the file line-by-line?

You also say "I need to store $buffer in an array and then process it word by word". Why, oh why? It's certainly not going to solve your out of memory error. As people have indicated, that's where the root of your problem is - trying to store everything in memory.
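To make the line-by-line suggestion concrete, here is a rough sketch, again assuming whitespace-separated words and a simple tally (count_words_by_line is a name invented for the example). Only the current line and the tally hash are in memory at any time; the hash still grows with the number of distinct words, but not with the size of the file.

```perl
use strict;
use warnings;

# Count words one line at a time; the file itself is never held in memory.
sub count_words_by_line {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        # split ' ' splits on runs of whitespace and ignores leading blanks
        $count{$_}++ for split ' ', $line;
    }
    return \%count;
}
```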

I suggest you either follow the given advice, or you buy some more memory, because you will need to buy more if you insist on storing the entire file in memory. And keep some cash ready, you'll need to buy more again when your file grows.

Abigail

Re: Re: Out of memory
by Anonymous Monk on Aug 19, 2002 at 13:54 UTC
    Hi Abigail and all,

    I think our mails must've crossed - I didn't get to read your reply before I posted mine.

    Yes, I see your point in reading it line by line - and I've been testing that right now. The problem is that some sentences get split midway - they probably have a newline there. As a result I lose the info. Say for "Robert L. Stevenson" - Robert ends up on the first line and L. Stevenson on the next. I could probably get around it by storing 3 lines (previous, current and next) at any one time, which would keep the overhead down.
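A rough sketch of that three-line idea, under the assumption that each line only needs to see its immediate neighbours. The name each_line_with_context and the callback interface are invented for the example, not taken from any module.

```perl
use strict;
use warnings;

# Call $callback->($prev, $current, $next) for every line of $fh while
# holding at most three lines in memory. $prev is undef for the first
# line and $next is undef for the last one.
sub each_line_with_context {
    my ($fh, $callback) = @_;
    my ($prev, $current);
    while (defined(my $next = <$fh>)) {
        chomp $next;
        # The line read one iteration ago now has both neighbours known.
        $callback->($prev, $current, $next) if defined $current;
        ($prev, $current) = ($current, $next);
    }
    # The last line of the file has no successor.
    $callback->($prev, $current, undef) if defined $current;
    return;
}
```

Inside the callback, joining the current line with the next one using a single space would put "Robert" and "L. Stevenson" back together.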

    Thanks for the suggestions and for bearing with a dufus like me :-) J

Re: Re: Out of memory
by Anonymous Monk on Aug 19, 2002 at 15:14 UTC
    Hi AbigailII,

    Looks like our mails crossed - I didn't get to read your reply before posting mine.

    Yes, I realise now that the best way to go is to read the file line by line, and that's what I've been trying now. The reason I kept clamouring about keeping all the words in memory was that when reading the file linewise, some sentences get split midway and I lose the info. Say for "John F. Kennedy", the John bit is on one line and F. Kennedy on the next.

    I guess I'll have to get around that problem by storing at least 3 consecutive lines in memory at any one time?
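For the specific case of a known name that may be broken across lines, something smaller than three full lines might do. The sketch below is only one possible approach, and count_phrase is a made-up name: it carries the last few words of each line forward, just enough that a phrase straddling a line break is seen whole, and it assumes the words of the phrase are separated only by whitespace.

```perl
use strict;
use warnings;

# Count occurrences of a multi-word phrase, allowing any whitespace
# (including a line break) between its words.
sub count_phrase {
    my ($fh, $phrase) = @_;
    my @words = split ' ', $phrase;
    return 0 unless @words;
    my $pattern = join '\s+', map { quotemeta } @words;
    my $keep    = @words - 1;   # words carried over; too few to match alone
    my $carry   = '';
    my $total   = 0;
    while (my $line = <$fh>) {
        my $text = "$carry $line";
        $total++ while $text =~ /$pattern/g;
        # Keep only the last $keep words, so a phrase that starts at the
        # end of this line can still be completed by the next one without
        # counting an already seen occurrence again.
        my @tokens = split ' ', $text;
        my $from   = @tokens > $keep ? @tokens - $keep : 0;
        $carry = join ' ', @tokens[$from .. $#tokens];
    }
    return $total;
}
```

With a handle on the text, count_phrase($fh, 'John F. Kennedy') counts the name whether or not a line break falls inside it.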

    Thanks for all the suggestions.
    J