in reply to Replace quotation-marks with tags in a huge text-file

Try something like this (semi-tested):

    use warnings;
    use strict;

    my $datamat = "He said, »Someone once said ›This is what they said‹ but I say something else.«";
    print qq{'$datamat' \n\n};

    my $i = 0;

    $datamat =~ s{ » (.*?) « }
                 { ++$i; qq{<q marker="»" sID="$i"/>$1<q marker="«" eID="$i"/>}; }xmsge;

    $datamat =~ s{ › (.*?) ‹ }
                 { ++$i; qq{<q marker="›" sID="$i"/>$1<q marker="‹" eID="$i"/>}; }xmsge;

    print qq{[[$datamat]] \n\n};

(Assumes the entire file has been read, i.e., "slurped", into the $datamat variable. Update: see also File::Slurp.)
(Update: Also assumes that quote pairs of the same kind, » ... « or › ... ‹, are never nested within themselves!)
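
For the slurping step, here is a minimal sketch using only core Perl rather than File::Slurp. The helper name slurp_utf8 is made up for illustration, and it assumes the file is UTF-8 encoded. Note that once the text is decoded like this, the script doing the substitutions should also say "use utf8;" so that the quote literals in the regexes are characters rather than bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read a whole
# UTF-8 encoded file into a single decoded string.
sub slurp_utf8 {
    my ($path) = @_;

    open my $fh, '<:encoding(UTF-8)', $path
        or die "Cannot open '$path': $!";

    local $/;               # no input record separator: one read gets the whole file
    my $text = <$fh>;

    close $fh;
    return $text;
}
```

Something like my $datamat = slurp_utf8('input.txt'); (file name assumed) would then feed the substitutions above.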

Update: Please see perlre, perlretut, and perlrequick.

Update 2: A regex like » (.*?) « may run faster if written as » ([^«]*) « instead.
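
A rough sketch of that negated-character-class variant follows. The sub name tag_quotes is made up for illustration, and I have not benchmarked it against the non-greedy version. It assumes the script is saved as UTF-8 and uses "use utf8;" because without it the class [^«] would be built from the individual bytes of the encoded quote mark, not from the character itself.

```perl
use strict;
use warnings;
use utf8;    # quote literals below are characters, not bytes

# Hypothetical helper (not from the original post): tag both quote styles,
# using negated character classes in place of the non-greedy .*?
sub tag_quotes {
    my ($text) = @_;
    my $i = 0;    # running ID shared by both quote styles

    $text =~ s{ » ([^«]*) « }
              { ++$i; qq{<q marker="»" sID="$i"/>$1<q marker="«" eID="$i"/>} }xmsge;

    $text =~ s{ › ([^‹]*) ‹ }
              { ++$i; qq{<q marker="›" sID="$i"/>$1<q marker="‹" eID="$i"/>} }xmsge;

    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print tag_quotes("He said, »Someone once said ›This is what they said‹ but I say something else.«"), "\n";
```

The same no-nesting assumption applies here as for the original version.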


Give a man a fish:  <%-{-{-{-<