in reply to Re: Re: C vs perl
in thread C vs perl

Do you have a better idea? It's pretty hard to know how much memory to allocate when you don't know how big your results will grow! Perl realloc()s on SVs all the time for just this reason.

I suppose he could build a linked list of text blocks and then reassemble them into a single contiguous block at the end. I doubt that would perform better than realloc() though.
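For reference, a minimal sketch of the realloc() approach, assuming a simple doubling growth policy. The struct and function names are made up for illustration and come from neither the thread nor perl's source:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; names are illustrative only. */
struct strbuf {
    char  *data;  /* NUL-terminated contents, or NULL while empty */
    size_t len;   /* bytes used, not counting the NUL */
    size_t cap;   /* bytes allocated */
};

/* Append n bytes, doubling the capacity whenever the buffer is too small.
 * Returns 0 on success, -1 if realloc() fails (buffer left unchanged).
 * Size overflow is not checked in this sketch. */
static int strbuf_append(struct strbuf *b, const char *src, size_t n)
{
    assert(b != NULL && src != NULL);
    if (b->len + n + 1 > b->cap) {
        size_t newcap = b->cap ? b->cap : 64;  /* assumed starting size */
        while (newcap < b->len + n + 1)
            newcap *= 2;
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}
```

Start from a zeroed struct (struct strbuf b = {0};), append pieces as they are produced, and free(b.data) when done. Doubling keeps the number of realloc() calls logarithmic in the final size, at the cost of up to 2x slack.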

-sam

Re: Re: Re: Re: C vs perl
by abstracts (Hermit) on Apr 29, 2002 at 02:09 UTC
    One way to do it is by allocating a string of length:
    newlen = (strlen(str) * strlen("</p><p>")) / strlen("\r\n") + strlen("</p><p>") + 1;
    which in this case is 3.5 times the length of the original string, plus a few bytes of slack. This is the total number of bytes required in the worst-case scenario, a string made up entirely of line breaks: $str =~ /^(\r\n)*$/. Excess memory can be reclaimed by doing a realloc *after* the substitution.
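A minimal sketch of that worst-case allocation followed by a shrinking realloc(). The function name crlf_to_para is hypothetical, and the search and replacement strings are hard-coded to the ones discussed here:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Replace every "\r\n" in str with "</p><p>". Returns a malloc()ed string
 * the caller must free(), or NULL on allocation failure.
 * crlf_to_para is a hypothetical name used only for this sketch. */
static char *crlf_to_para(const char *str)
{
    static const char rep[] = "</p><p>";
    const size_t replen = sizeof rep - 1;
    assert(str != NULL);

    /* Worst case: the whole input is "\r\n" pairs, so every 2 bytes become 7. */
    size_t newlen = (strlen(str) * replen) / 2 + replen + 1;
    char *out = malloc(newlen);
    if (out == NULL)
        return NULL;

    char *dst = out;
    for (const char *src = str; *src != '\0'; ) {
        if (src[0] == '\r' && src[1] == '\n') {
            memcpy(dst, rep, replen);
            dst += replen;
            src += 2;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';

    /* Give back the unused tail; keep the original block if realloc() fails. */
    char *shrunk = realloc(out, (size_t)(dst - out) + 1);
    return shrunk != NULL ? shrunk : out;
}
```

The substitution itself is a single pass with no reallocation. Whether the final shrinking realloc() actually returns memory to the system depends on the allocator.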

    As for perl's internal implementation, it's a different issue as the regex engine must work with any regex given. But even with that in mind, you can still build a linked list of offsets and lengths of the parts of the original string that need to be copied over, as well as another list of substitutions. The required amount of memory should then be easy to compute, and only a single copy is needed.

    For this example, this is like doing:

    my $str = "line1\r\nline2\r\n";
    my $result = join '</p><p>', split(/\r\n/, $str);
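The offsets-and-lengths idea can be sketched in C roughly as follows. This is an illustration under assumptions, not perl's implementation: the names are hypothetical, the replacement is a fixed string so no second list of substitutions is needed, and a trailing "\r\n" is replaced too (as s/\r\n/<\/p><p>/g would do), whereas split() without a limit drops trailing empty fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One run of the original string that is copied over unchanged. */
struct seg {
    size_t off, len;
    struct seg *next;
};

/* Append a segment to the list; returns the new tail, or NULL on failure. */
static struct seg *seg_push(struct seg **head, struct seg *tail,
                            size_t off, size_t len)
{
    struct seg *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->off = off;
    s->len = len;
    s->next = NULL;
    if (tail != NULL)
        tail->next = s;
    else
        *head = s;
    return s;
}

static void seg_free(struct seg *s)
{
    while (s != NULL) {
        struct seg *next = s->next;
        free(s);
        s = next;
    }
}

/* Hypothetical exact-size variant: returns a malloc()ed string or NULL. */
static char *crlf_to_para_exact(const char *str)
{
    static const char rep[] = "</p><p>";
    const size_t replen = sizeof rep - 1;
    struct seg *head = NULL, *tail = NULL;
    size_t start = 0, total = 0, i = 0;
    assert(str != NULL);

    /* Pass 1: record each run between "\r\n" pairs and total the output size. */
    for (;;) {
        int at_end = (str[i] == '\0');
        if (at_end || (str[i] == '\r' && str[i + 1] == '\n')) {
            tail = seg_push(&head, tail, start, i - start);
            if (tail == NULL) {
                seg_free(head);
                return NULL;
            }
            total += (i - start) + (at_end ? 0 : replen);
            if (at_end)
                break;
            i += 2;
            start = i;
        } else {
            i++;
        }
    }

    /* Pass 2: one exact allocation, then a single copy of each run. */
    char *out = malloc(total + 1);
    if (out != NULL) {
        char *dst = out;
        for (struct seg *s = head; s != NULL; s = s->next) {
            memcpy(dst, str + s->off, s->len);
            dst += s->len;
            if (s->next != NULL) {
                memcpy(dst, rep, replen);
                dst += replen;
            }
        }
        *dst = '\0';
    }
    seg_free(head);
    return out;
}
```

The output buffer is allocated once at exactly the right size, but the list costs one small malloc() per line break, so the trade-off against over-allocating is not free.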
      That sure sounds good, but I'm not sure I'd be thrilled with the results of passing this routine 3MB of text and having it malloc() 10MB. The efficacy of malloc()ing far more than you need then hoping that realloc() is cool enough to make it not matter is questionable.

      -sam