in reply to Replace quotation-marks with tags in a huge text-file

Here are some things you can do to improve your code:

First, if you can't process the file line by line, don't read it line by line. You can read it all at once like this:

    my $data;
    {
        local $/ = undef; # local means this change will be reversed at the end of the block
        $data = <$in_fh>;
    }

$/ is the input record separator. By default it is "\n", which tells perl to stop reading every time it encounters a newline. You could also set it to "", and perl will read the file one paragraph at a time (paragraphs being separated by at least two consecutive "\n").
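To make the two modes concrete, here is a minimal sketch. It assumes a made-up sample text and reads it through an in-memory filehandle so that it runs without a real file; the variable names other than $data and $in_fh are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input, read through an in-memory filehandle
# so the sketch does not depend on a file on disk.
my $text = "first line\nsecond line\n\n\nnew paragraph\n";

# Slurp mode: with $/ undefined, a single read returns the whole file.
open my $in_fh, '<', \$text or die "Cannot open in-memory file: $!";
my $data;
{
    local $/ = undef;    # restored automatically at the end of the block
    $data = <$in_fh>;
}
close $in_fh;

# Paragraph mode: with $/ set to "", each read returns one paragraph.
open $in_fh, '<', \$text or die "Cannot open in-memory file: $!";
my @paragraphs;
{
    local $/ = "";
    @paragraphs = <$in_fh>;
}
close $in_fh;

printf "slurped %d characters, found %d paragraphs\n",
    length $data, scalar @paragraphs;
```

Slurping only makes sense when the whole file fits comfortably in memory; paragraph mode is a middle ground when matches never span a blank line.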

    while ($data =~ /REGEX/g) {
        $data =~ s/REGEX/rep/;
    }

Here the /g is useless: the string changes between each call to the first regex, which resets the match position, so perl scans the string (your complete file) from the beginning each time. A single

    $data =~ s/REGEX/rep/g;

would do what you want, except for the part that you haven't figured out yet: your non-constant replacement.
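As a small illustration of the single-pass form, the sketch below uses a made-up input string and a made-up <q> tag; neither comes from the original script.

```perl
use strict;
use warnings;

# Hypothetical input and replacement tag, chosen only for illustration.
my $data = 'say "one" and "two" and "three"';

# One s///g call scans the string once, left to right, and in scalar
# context returns the number of substitutions it made.
my $count = ( $data =~ s{"(.*?)"}{<q>$1</q>}g );

print "$count replacements: $data\n";
```

Because the substitution operator keeps track of its own position, no outer while loop is needed, and the text that was just inserted is not scanned again.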

The s operator allows the right side to be dynamic with the /e switch. To use it, write Perl code that returns the string you want, for example:

    $i_want = "string, $1".$i++."another string";

Then put that expression in the right side of your substitution and add /e:

    s/REGEX/"string, $1".$i++."another string"/e;

Notice that I have kept the same code, including the quotes.
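Here is a runnable sketch of /e with a counter. The input string, the <q> tag and its id attribute are assumptions made up for the example, not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical input; the tag name and id attribute are illustrative.
my $data = 'a "first" b "second"';
my $i    = 1;

# With /e the replacement side is Perl code, evaluated once per match.
# $1 is the text captured between the quotation marks, and $i++ hands
# out a new number for every match.
$data =~ s{"(.*?)"}{ '<q id="' . $i++ . '">' . $1 . '</q>' }ge;

print "$data\n";
```

Using braces as delimiters keeps the slashes in the closing tag from needing escapes, and whitespace around the replacement is harmless because it is parsed as code.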

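This last part even allows you to do all your matches at once. If your regex is />(.*?)<|»(.*?)«/ you can write:

    $i_want = "String".($1||$2)." ".($i++);

So:

    s/>(.*?)<|»(.*?)«/"String".($1||$2)." ".($i++)/gse;

One caveat: ($1||$2) falls through to $2 when $1 is an empty string or "0". On Perl 5.10 or later, ($1 // $2) avoids that.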
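A minimal sketch of the combined substitution follows, assuming a made-up input string and a hypothetical <q n="..."> output format. It assumes Perl 5.10 or later for the // operator, and that the source file is saved as UTF-8 so the literal » and « are read correctly.

```perl
use strict;
use warnings;
use utf8;    # this source contains literal » and « characters

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical input mixing both quotation styles.
my $data = "x >alpha< y »beta« z >0<";
my $i    = 1;

# Only one of the two capture groups is set for any given match.
# $1 // $2 picks whichever is defined, so a captured "0" is kept,
# whereas $1 || $2 would discard it.
$data =~ s{>(.*?)<|»(.*?)«}{ '<q n="' . $i++ . '">' . ( $1 // $2 ) . '</q>' }gse;

print "$data\n";
```

When reading real data from a file, the filehandle would also need a UTF-8 layer so that » and « in the input match the decoded characters in the pattern.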

Re^2: Replace quotation-marks with tags in a huge text-file
by kemuel (Novice) on Sep 12, 2015 at 15:51 UTC

    Thank you so much. That really simplified my script a lot.
    And processing the file now takes a few seconds instead of an hour.

    I'm really happy right now