in reply to Parse::RecDescent size limit?

Hmmm .... 24576 bytes is really not what I'd call big. That's 24K, or about 0.009% of 256M of RAM. I'd guess that merlyn's supposition does not actually apply here, though I wouldn't be surprised if merlyn knows something I don't in this regard. I've parsed 100M+ files, breaking them into huge chunks (one scalar holding about 10M, roughly 426 times the size of your token, for example), and never had a problem.

What is the exact error message? If you aren't getting one, you may want to try a trace with something like

strace -o my-script.out my-script.pl
and then read up from the bottom of my-script.out to see if there is anything in particular that strikes you just before you see the segfault signal. Just a suggestion ....
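
If the trace file is long, a small filter can do the "read up from the bottom" step for you. This is only a rough sketch: the helper name lines_before_segv and the script name segv-context.pl are made up for illustration, and it assumes nothing about the strace output beyond the segfault line containing the text SIGSEGV.

```perl
use strict;
use warnings;

# Hypothetical helper: given the lines of an strace log, return up to
# $context lines leading up to (and including) the first line that
# mentions SIGSEGV. Returns an empty list if no such line exists.
sub lines_before_segv {
    my ($lines, $context) = @_;
    for my $i (0 .. $#$lines) {
        next unless $lines->[$i] =~ /SIGSEGV/;
        my $start = $i - $context;
        $start = 0 if $start < 0;
        return @{$lines}[$start .. $i];
    }
    return;
}

# Usage (assumed file names): perl segv-context.pl my-script.out
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my @lines = <$fh>;
    close $fh;
    print lines_before_segv(\@lines, 20);
}
```

The last few system calls before the signal are usually the interesting ones, so 20 lines of context is just a starting guess.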

Hope this helps :-)

P.S. Question for merlyn if he checks back: What is the nibbling approach? I performed a search here and looked through my library, but couldn't find anything ....

Nibbling Approach (was Re: Re: Parse::RecDescent size limit?)
by merlyn (Sage) on Jan 06, 2002 at 01:34 UTC
    P.S. Question for merlyn if he checks back: What is the nibbling approach? I performed a search here and looked through my library, but couldn't find anything ....
    Nibbling is a method of tokenization by constantly attempting substitutions at the beginning of the string, replacing it with nothing if found.
    $_ = "Hello there, people!";
    @output = ();
    while (length) {
        (push @output, "word: $1"), next if s/^(\w+)//;
        (push @output, "spaces"), next if s/^\s+//;
        (push @output, "punc: $1"), next if s/^(.)//;
        die "how did I get here?";
    }
    print map "$_\n", @output;
    This is fine for small strings, but the constant leftward shuffling of the string gets very expensive for huge strings. P::RD uses nibbling. P::FastDescent will use pos() scanning instead:
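    $_ = "Hello there, people!";
    @output = ();
    # pos() is undef before the first match, so treat that as offset 0;
    # stop once the scan position reaches the end of the string
    while ((pos($_) || 0) < length $_) {
        (push @output, "word: $1"), next if /\G(\w+)/gc;
        (push @output, "spaces"), next if /\G\s+/gc;
        (push @output, "punc: $1"), next if /\G(.)/gc;
        die "how did I get here?";
    }
    print map "$_\n", @output;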
    Much easier, but wasn't available when theDamian first wrote P::RD.
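
For anyone who wants to reuse the pos() scanning idea outside a one-off script, here is a minimal sketch of the same loop wrapped in a sub under strict and warnings. The name tokenize is made up for illustration; it is not part of Parse::RecDescent, and it says nothing about how P::FastDescent is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::RecDescent): tokenize a string
# by scanning forward with \G and /gc instead of chopping the front off.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;    # start scanning at the beginning of the string
    while (pos($text) < length $text) {
        # /c keeps pos() where it was when a match fails, so the next
        # alternative is tried at the same offset.
        if    ($text =~ /\G(\w+)/gc) { push @tokens, "word: $1" }
        elsif ($text =~ /\G\s+/gc)   { push @tokens, "spaces" }
        elsif ($text =~ /\G(.)/gc)   { push @tokens, "punc: $1" }
        else  { die "no rule matched at offset " . pos($text) }
    }
    return @tokens;
}

print "$_\n" for tokenize("Hello there, people!");
```

The string itself is never modified here; only the match position moves, which is the whole point of the approach.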

    -- Randal L. Schwartz, Perl hacker