Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

There is something I'm not understanding about the search engine. With the code and data below, the first substitution is made but not the second, i.e., </INSERT-TOPIC> is not replaced with </TOPIC>. When I print the match, it shows what "<CASE-STUDY([^>]*)>" matched. I thought the search engine started over at the beginning of the string for each operation, so the second match should be <CASE-STUDY.*?<\/CASE-STUDY><\/INSERT-TOPIC>. To be clear, I don't expect it to match the literal ".*?" and so forth, but the data those patterns correspond to. Hope this makes sense.
if ($tmp_buffer =~ /^<CASE-STUDY/i) {
    $tmp_buffer =~ /<CASE-STUDY[^>]*><TOPIC-INFO><TITLE>([^<]*)<\/TITLE>/i;
    my($xtitle) = $1;
    $tmp_buffer =~ s/<CASE-STUDY([^>]*)>/<TOPIC ID="" HEADING-LEVEL="1"><TOPIC-INFO><TITLE>$xtitle<\/TITLE><\/TOPIC-INFO><CASE-STUDY$1>/i;
    $tmp_buffer =~ s/(<CASE-STUDY.*?<\/CASE-STUDY>)<\/INSERT-TOPIC>/$1<\/TOPIC>/si;
    print "debug - $&\n";
}
<INSERT-TOPIC>
<CASE-STUDY ID="1"><TOPIC-INFO><TITLE>Case Study</TITLE></TOPIC-INFO>
<PARA>poop</PARA>
</CASE-STUDY>
</INSERT-TOPIC>

Re: Why doesn't the search engine start from scratch
by ikegami (Patriarch) on Jan 27, 2012 at 23:39 UTC
    • $tmp_buffer =~ /^<CASE-STUDY/ doesn't match. /m is needed to make "^" match at the start of a line.
    • The last regex pattern doesn't match because </CASE-STUDY></INSERT-TOPIC> doesn't exist in the input. There's a newline between the two tags.
    • Your final print doesn't show the result of the last substitution. $& holds the text of the last successful match, so print $tmp_buffer instead.
    if ($tmp_buffer =~ /^<CASE-STUDY/mi) {    # changed: added /m
        $tmp_buffer =~ /<CASE-STUDY[^>]*><TOPIC-INFO><TITLE>([^<]*)<\/TITLE>/i;
        my($xtitle) = $1;
        $tmp_buffer =~ s/<CASE-STUDY([^>]*)>/<TOPIC ID="" HEADING-LEVEL="1"><TOPIC-INFO><TITLE>$xtitle<\/TITLE><\/TOPIC-INFO><CASE-STUDY$1>/i;
        # changed: added \s* between the two closing tags
        $tmp_buffer =~ s/(<CASE-STUDY.*?<\/CASE-STUDY>)\s*<\/INSERT-TOPIC>/$1<\/TOPIC>/si;
        print "debug - $tmp_buffer\n";    # changed: print $tmp_buffer instead of $&
    }
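The three points can be checked in isolation with a small script. This is only a sketch: it assumes the sample data has a newline after each line of tags, as the reply implies, and the names $data, $probe, $stale and $fixed are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Sample input, ASSUMING newlines separate the tag lines as the reply implies.
my $data = join "\n",
    '<INSERT-TOPIC>',
    '<CASE-STUDY ID="1"><TOPIC-INFO><TITLE>Case Study</TITLE></TOPIC-INFO>',
    '<PARA>poop</PARA>',
    '</CASE-STUDY>',
    '</INSERT-TOPIC>',
    '';

# 1. Without /m, ^ anchors only at the very start of the string.
my $anchored_plain = $data =~ /^<CASE-STUDY/i  ? 1 : 0;
my $anchored_multi = $data =~ /^<CASE-STUDY/mi ? 1 : 0;

# 2. A newline sits between the two closing tags, so the pattern needs \s*.
my $strict_hit  = $data =~ /<\/CASE-STUDY><\/INSERT-TOPIC>/i    ? 1 : 0;
my $lenient_hit = $data =~ /<\/CASE-STUDY>\s*<\/INSERT-TOPIC>/i ? 1 : 0;

# 3. A failed match does not reset $&; it keeps the last successful match.
my $probe     = 'abc';
my $first_ok  = $probe =~ /b/ ? 1 : 0;   # succeeds, $& becomes 'b'
my $second_ok = $probe =~ /x/ ? 1 : 0;   # fails, $& is left alone
my $stale     = $&;

# The corrected final substitution from the reply, applied to a copy.
my $fixed = $data;
$fixed =~ s/(<CASE-STUDY.*?<\/CASE-STUDY>)\s*<\/INSERT-TOPIC>/$1<\/TOPIC>/si;

print "^ without /m: $anchored_plain, with /m: $anchored_multi\n";
print "without \\s*: $strict_hit, with \\s*: $lenient_hit\n";
print "\$& after a failed match: $stale\n";
print $fixed;
```

Each match or substitution does start scanning from the beginning of the string. The second substitution fails because of the newline in the data, not because of where the engine starts.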