In general, the recipe is to eliminate all capture groups that operate on your large string. Beyond that, you can try to cut things off the front of the string as you process them. Perl keeps a pointer to where a string begins, so if you're conscientious, you can convince perl to just advance that pointer.
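A minimal sketch of the cut-as-you-go idea, assuming a made-up buffer of comma-terminated fields (the data and the names $buf and @fields are mine, not from the thread). It matches without capturing parentheses, reads the field out via the match offset, and then chops the consumed prefix off with 4-arg substr:

```perl
use strict;
use warnings;

# Hypothetical input: a buffer of comma-terminated fields.
my $buf = "alpha,beta,gamma,";
my @fields;

while (length $buf) {
    # No capturing parentheses: only the match offsets are recorded.
    last unless $buf =~ /\A[^,]*,/;

    # $+[0] is the offset just past the match; leave out the trailing comma.
    push @fields, substr($buf, 0, $+[0] - 1);

    # 4-arg substr removes the consumed prefix in place.
    substr($buf, 0, $+[0], '');
}

print join('|', @fields), "\n";
```

The buffer shrinks on every pass, so each match only ever looks at the unprocessed tail.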
This is wasteful. When it matches, perl makes a copy of $_ in an internal buffer so that $1 can refer back to it.
if (/\G<([^<>]*)>/gc) { flush_name(); $state = 'TEXT'; print OUT $1; next; }
Efficient. Eliminate the capturing parentheses and use substr() with @- and @+ to refer back to what $1 would have contained. The documentation for @- is a good reference for you right now. The match spans both angle brackets, so the inner text starts at $-[0] + 1 and is $+[0] - $-[0] - 2 characters long. You'll notice how I also used 4-arg substr to directly remove the already-processed first part of the string.
if (/\G<[^<>]*>/gc) { flush_name(); $state = 'TEXT'; print OUT substr $_, $-[0] + 1, $+[0] - $-[0] - 2; substr $_, 0, $+[0], ''; next; }
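To check that the offsets really give back what $1 would have held, here is a small self-contained sketch. The sample string and the names $captured and $by_offset are assumptions for illustration; flush_name(), $state and OUT from the snippets above are left out. The explicit pos() reset after the chop is a precaution of mine, not something the original code does.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the large string.
$_ = "<name>rest of the input";

my ($captured, $by_offset);

# Capturing version: $1 holds the text between the angle brackets.
if (/\G<([^<>]*)>/gc) {
    $captured = $1;
}

pos($_) = 0;    # rewind so the second match starts at the same place

# Non-capturing version: rebuild the same text from the match offsets.
if (/\G<[^<>]*>/gc) {
    # $-[0] is where '<' sits and $+[0] is just past '>', so the inner
    # text starts at $-[0] + 1 and is $+[0] - $-[0] - 2 characters long.
    $by_offset = substr($_, $-[0] + 1, $+[0] - $-[0] - 2);

    # Drop everything consumed so far, then restart matching at offset 0.
    substr($_, 0, $+[0], '');
    pos($_) = 0;
}

print "captured:  $captured\n";
print "by offset: $by_offset\n";
print "remaining: $_\n";
```

Both variables should end up holding the same text, and $_ is left with only the part that has not been processed yet.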
⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
In reply to Re: Regexes eating too much RAM
by diotalevi
in thread Regexes eating too much RAM
by Articuno