zusuki-san has asked for the wisdom of the Perl Monks concerning the following question:

hello,
i'm trying to extract a pattern from a string, hold it in a variable, and then use this variable in a substitution.

in the code below, i extract $match from $content (which is a string holding the entire file contents).

i then extract $varmatch from $match, which i try to use in a substitution thus:
$match=~ s/$varmatch//g;
to remove $varmatch from $match.
i then insert it at another point in $match.

the strange thing is, i can use $varmatch and insert its value into the containing string ($match=~ s/textarea/testinsert$varmatch/g;), but can't use it to remove its value from the containing string.

i'm still pretty new to the wonders of perl, so i may be missing something obvious.

oh, and i've tried using pos() to position the cursor back to the beginning of $match, (pos $match = 0; ) but to no avail.

any monkly wisdom on this received with gracious thanks :)
Z.
while ($content =~ m/(<textarea.*textarea>)/sg) {
    my $match = $&;
    my $varmatch;
    print "match: before ..... $match\n";
    if( $match =~ m/(<%=.*?%>)/sg){
        $varmatch = $1;
        print "varmatch: $varmatch\n";
        pos $match ;
        #cut var tag
        $match=~ s/$varmatch//g;
        print "match: after .... $match\n";
        $match=~ s/textarea/testinsert$varmatch/g;
        print "match: after testinsert .... $match\n";
        ...program continues here

program output:
---------- perl ----------
varmatch: <%=foo::doSomething%>
match: before .... <textarea class="FolderTxtArea" name="blah"><%=foo::doSomething%></textarea>
match: after .... <textarea class="FolderTxtArea" name="blah"><%=foo::doSomething%></textarea>
match: after testinsert .... <testinsert<%=foo::doSomething%> ...

Replies are listed 'Best First'.
Greed is going to get you, was Re: can't get $& to remove value in a substitution
by RMGir (Prior) on Oct 24, 2002 at 12:49 UTC
    Be careful with greedy matches. You're doing
    while ($content =~ m/(<textarea.*textarea>)/sg) {
    If your $content contains more than one textarea, this will match from the beginning of the first textarea to the end of the last one. That makes the /g fairly useless.

    You probably want:

    while ($content =~ m/(<textarea.*?textarea>)/sg) {
    Notice the *? non-greedy quantifier; it makes each match stop at the first textarea> it finds instead of running on to the last one.

    Of course, this might not be the source of the problem you're asking about, but it is one possible problem with your code.
    --
    Mike
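
A minimal sketch of the difference Mike describes, assuming a made-up $content with two textarea elements (the sample string is not from the original post):

```perl
use strict;
use warnings;

# Hypothetical input with two textarea elements, for illustration only.
my $content = '<textarea>one</textarea><p>x</p><textarea>two</textarea>';

# Greedy: .* runs on to the last "textarea>", so everything is swallowed by one match.
my @greedy = $content =~ m/(<textarea.*textarea>)/sg;

# Non-greedy: .*? stops at the first "textarea>" after each opening tag.
my @lazy = $content =~ m/(<textarea.*?textarea>)/sg;

print scalar(@greedy), " greedy match(es)\n";
print scalar(@lazy), " non-greedy match(es)\n";
print "  $_\n" for @lazy;
```

With the greedy pattern the loop body would run once over the whole span; with the non-greedy one it runs once per textarea.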

      thanks for pointing that out...i didn't use the non-greedy quantifier because i know that i will only encounter one "textarea" in my file.
        Now you've confused me :) (Don't worry, it's easy to confuse me)

        Why use while(//g) if there's only ever going to be one match?
        --
        Mike
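
If there really is only one textarea, one possible shape (a sketch, not the poster's actual code; the sample $content is invented) is a single list-context match. It needs no loop and no /g, and it takes the text from the capture group rather than from $&:

```perl
use strict;
use warnings;

# Hypothetical content holding exactly one textarea.
my $content = 'before <textarea name="blah"><%=foo::doSomething%></textarea> after';

# In list context a match without /g returns its captured groups,
# so $match receives $1, or undef if the pattern does not match.
my ($match) = $content =~ m/(<textarea.*?textarea>)/s;

if (defined $match) {
    print "match: $match\n";
}
else {
    print "no textarea found\n";
}
```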

Re: can't get $& to remove value in a substitution
by CubicSpline (Friar) on Oct 24, 2002 at 12:39 UTC
    I don't really see a problem with the code. In fact, copying your code to my machine and running it seems to give the result you are looking for.

    Output:

    match: before ..... <textarea class="FolderTxtArea" name="blah"><%=foo::doSomething%></textarea>
    varmatch: <%=foo::doSomething%>
    match: after .... <textarea class="FolderTxtArea" name="blah"></textarea>
    match: after testinsert .... <testinsert<%=foo::doSomething%> class="FolderTxtArea" name="blah"></testinsert<%=foo::doSomething%>>

    I did notice that the output you're giving in your question doesn't match the order in which the print statements appear in the code. Perhaps you're looking at old output?

    ~CubicSpline
    "No one tosses a Dwarf!"

Re: can't get $& to remove value in a substitution
by roik (Scribe) on Oct 24, 2002 at 12:53 UTC
    Strangely, the following seems to work for me:
    $content = qq(<textarea class="FolderTxtArea" name="blah"><%=foo::doSomething%></textarea>);
    while ($content =~ m/(<textarea.*textarea>)/sg) {
        my $match = $&;
        my $varmatch;
        print "match: before ..... $match\n";
        if( $match =~ m/(<%=.*?%>)/sg){
            $varmatch = $1;
            print "varmatch: $varmatch\n";
            pos $match ;
            #cut var tag
            $match=~ s/$varmatch//g;
            print "match: after .... $match\n";
            $match=~ s/textarea/testinsert$varmatch/g;
            print "match: after testinsert .... $match\n";
        }
    }
    Gives output:
    match: before ..... <textarea class="FolderTxtArea" name="blah"><%=foo::doSomething%></textarea>
    varmatch: <%=foo::doSomething%>
    match: after .... <textarea class="FolderTxtArea" name="blah"></textarea>
    match: after testinsert .... <testinsert<%=foo::doSomething%> class="FolderTxtArea" name="blah"></testinsert<%=foo::doSomething%>>
    Are you sure you are looking at the output from your latest run?
      thanks for your input monks, it's much appreciated :)
      i've discovered the problem...
      the actual match i was trying to replace has parentheses within it, which is buggering up the process.

      if i try and replace:
       $content = qq(<textarea class="FolderTxtArea" name="blah"><%=foo::doSomething %></textarea>);

      it works.
      however, if i try:
       $content = qq(<textarea class="FolderTxtArea" name="blah"><%=foo::doSomething ()%></textarea>);# note the parentheses after doSomething

      it fails.
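
A small sketch of why the second case fails, using shortened stand-in strings (the variable $removed is made up for the demonstration). When $varmatch is interpolated into the pattern as-is, the "()" is read as an empty capture group rather than as two literal characters:

```perl
use strict;
use warnings;

my $match    = '<textarea name="blah"><%=foo::doSomething()%></textarea>';
my $varmatch = '<%=foo::doSomething()%>';

# Interpolated raw, "()" becomes an empty group, so the pattern really
# searches for "<%=foo::doSomething%>" without the parentheses.
my $removed = ($match =~ s/$varmatch//g);

print $removed ? "removed\n" : "nothing removed\n";
print "$match\n";
```

The substitution finds nothing to remove, so $match comes out unchanged, which matches the symptom in the original question.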

      now i have to figure out a good way to escape the parentheses inside a variable.
      i could just go in and temporarily replace them, which seems like a bit of a hack
      if anyone is aware of a more elegant way of doing this i'd love to hear it.
      once again, my thanks for your help

      p.s. one lesson here is to copy and paste code verbatim for others to try. my apologies
      only reason i didn't was for readability- the string i was trying to replace is awful long and ugly looking :)
        This is from the perlre manpage. I've not used it before so this is a first stab:
        \Q quote (disable) pattern metacharacters till \E
        my $pattern = "ABC()DEF";
        my $string = "ABC()DEF";
        if ($string =~ /\Q$pattern\E/) { print "match 1\n" }
        if ($string =~ /$pattern/) { print "match 2\n" }
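        Only "match 1" is printed. Putting your variable between \Q and \E in the regex disables the ( and ) metacharacters in the pattern, whereas in the second test the unescaped () is treated as an empty group, so the pattern looks for "ABCDEF" and fails. (I think!).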
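
Applied to the original problem, the same idea might look like the sketch below. The sample string and the $after_cut variable are assumptions for the demonstration. Only the pattern side of s/// needs \Q...\E; the replacement side is an ordinary interpolated string, which is why inserting $varmatch already worked in the original code.

```perl
use strict;
use warnings;

my $match = '<textarea class="FolderTxtArea" name="blah"><%=foo::doSomething()%></textarea>';

# Pull out the embedded tag, parentheses and all.
my ($varmatch) = $match =~ m/(<%=.*?%>)/s;

# \Q...\E makes the interpolated text match literally.
$match =~ s/\Q$varmatch\E//g;
my $after_cut = $match;
print "match: after .... $after_cut\n";

# The replacement side is not a pattern, so no quoting is needed here.
$match =~ s/textarea/testinsert$varmatch/g;
print "match: after testinsert .... $match\n";
```

The built-in quotemeta function does the same escaping if you would rather prepare the pattern ahead of time, e.g. my $quoted = quotemeta($varmatch);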
Re: can't get $& to remove value in a substitution
by Fletch (Bishop) on Oct 24, 2002 at 17:21 UTC

    Since no one else has yet, I'll just chime in with the obligatory statement that people trying to parse arbitrary (HT|X|SG)ML with just a single regex are just asking for all the trouble they get. Try HTML::TreeBuilder or HTML::TokeParser.

      thanks roik for that snippet on escaping metacharacters. i'll give it a whirl.

      thanks fletch for that pointer also.
      i had heard that a custom library exists for parsing SGML-like text, but am still fairly new to the perl way, and thought i might get away with my approach :)
      also, i'm not dealing with plain html- i'm converting scripts with embedded html, so wasn't sure if those parsers would fit for that task.

      on a side note, i'm very impressed with the level of feedback and support from the monks
      big thanks