Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

I was attempting to come up with a new function for my HTML-shortcuts system today.

I wrote a subroutine that calls on Image::Magick to get the height and width of an image.

This means all I have to do is put i myimage.gif (i for image) into the form box, and use a regex to process that line like this:

$body =~ s/\ni ([^\n]*?)\n/makeImageTag(\1)/eg;

and have the sub make the image tag for me, with the height and width inserted automatically. This is a Good Thing.

But the damn thing didn't work! So I thought "what would a true monk do?" and turned on warnings.

Cue angelic choir -- warnings tell me that I shouldn't use \1, I should use $1.

So I'm not here to ask the monks that dumb question, but can someone explain what the difference is between \1 and $1? Obviously I can't use the former when there's an expression to be evaluated...
--

($_='jjjuuusssttt annootthhrer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;

Re: Warnings Are Good! Plus A Question about $1
by grep (Monsignor) on Apr 07, 2002 at 06:40 UTC

    I would most likely write that regex like this:

    #!/usr/bin/perl -w
    use strict;

    foreach (<DATA>) {
        s/^i ([^\n]+)$/makeImageTag($1)/e;  # see Update
    }

    sub makeImageTag {
        my $foo = shift || 'no_image.png';
        print "$foo\n";
    }

    __DATA__
    i howdy.gif
    i foo.jpg
    i bar.png
    i blah.jpg

    Here you let perl handle the newlines.

    Can someone explain what the difference is between \1 and $1? Obviously I can't use the former when there's an expression to be evaluated...

    from perlre:

    $pattern =~ s/(\W)/\\\1/g;

    This is grandfathered for the RHS of a substitute to avoid shocking the sed addicts, but it's a dirty habit to get into. That's because in PerlThink, the righthand side of a s/// is a double-quoted string. \1 in the usual double-quoted string means a control-A. The customary Unix meaning of \1 is kludged in for s///. However, if you get into the habit of doing that, you get yourself into trouble if you then add an /e modifier.
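A minimal sketch of the trouble perlre describes, assuming a hypothetical describe_arg sub standing in for makeImageTag. With /e the right-hand side is compiled as Perl code, so \1 there is a reference to the constant 1 rather than the first capture:

```perl
use strict;
use warnings;

# Hypothetical stand-in for makeImageTag: reports what kind of argument arrived.
sub describe_arg {
    my ($arg) = @_;
    return ref($arg) ? 'reference (' . ref($arg) . ')' : "string '$arg'";
}

# $1 hands the captured text to the sub.
my $with_dollar = 'i foo.gif';
$with_dollar =~ s/^i (.+)/describe_arg($1)/e;

# \1 under /e is parsed as code, so the sub receives a reference to the number 1.
my $with_backslash = 'i foo.gif';
{
    no warnings 'syntax';    # hide the compile-time warning about \1 for this demo
    $with_backslash =~ s/^i (.+)/describe_arg(\1)/e;
}

print "$with_dollar\n";       # string 'foo.gif'
print "$with_backslash\n";    # reference (SCALAR)
```

Without the "no warnings" line, perl flags the second substitution at compile time, which is the hint the original poster ran into.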

    Also, you can ponder this from perlre: s/(\d+)/\1000/;
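The \1000 example is ambiguous because nothing marks where the capture number ends. perlre notes that ${1}000 resolves it; a short sketch, with the sample string being an assumption for illustration:

```perl
use strict;
use warnings;

my $price = 'cost: 42';

# ${1} delimits the capture name, so the trailing 000 is literal text.
(my $braced = $price) =~ s/(\d+)/${1}000/;

print "$braced\n";    # cost: 42000
```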

    Update: Jeez, I should've rewritten the whole regex. Absolutely no need for the negated char class: ([^\n]+) should be just (.+), since I thoughtfully pointed out that <> takes care of newlines.



    grep
    grep> cd /pub
    grep> more beer
Re: Warnings Are Good! Plus A Question about $1
by graff (Chancellor) on Apr 07, 2002 at 06:53 UTC
    Reading the perlre man page will put hair on your chest (and for those already endowed, closer reading can cause graying). Here's a relevant nugget:
    The bracketing construct "( ... )" creates capture buffers. To refer to the digit'th buffer use \<digit> within the match. Outside the match use "$" instead of "\". (The \<digit> notation works in certain circumstances outside the match. See the warning below about \1 vs $1 for details.) Referring back to another part of the match is called a backreference.
    In other words, use "backslash-digit" to refer to a paren'ed chunk while you're still in the left side of the expression, and use "dollar-digit" to place a chunk in the replacement string.
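A small sketch of that division of labour, using a made-up input word: \1 inside the pattern requires the same character to appear again, and $1 in the replacement puts the captured character back once.

```perl
use strict;
use warnings;

my $text = 'bookkeeper';

# Pattern side: (\w)\1 matches any word character followed by itself.
# Replacement side: $1 keeps a single copy of that character.
(my $collapsed = $text) =~ s/(\w)\1/$1/g;

print "$collapsed\n";    # bokeper
```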
      Reading the perlre man page will put hair on your chest

      I'd imagine that kudra has already read the perlre man page, so can we assume that...

      xoxo,
      Andy
      --
      <megaphone> Throw down the gun and tiara and come out of the float! </megaphone>

Re: Warnings Are Good! Plus A Question about $1
by Dogma (Pilgrim) on Apr 07, 2002 at 12:34 UTC
    I think your question about $1 has already been answered above. However, I can see some other problems with your regex.

    s/\ni ([^\n]*?)\n/makeImageTag(\1)/eg

    You are much better off using "^" to anchor the start of a line. There is no reason to try to handle line delimiting yourself when perl can do this for you. (Even if you did want a multi-line match, you should be using /m.) You can drop the /g as well, because if your regex is anchored to the start of the line, you can't have multiple matches on the same line anyway.

    Cheers,
    -Dogma

Re: Warnings Are Good! Plus A Question about $1
by japhy (Canon) on Apr 07, 2002 at 14:48 UTC
    I'd suggest rewriting the regex as s/^i (.*)/makeImageTag($1)/egm -- the /m modifier allows ^ to match at the beginning of a "line".
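A runnable sketch of that suggestion applied to a whole multi-line $body. The makeImageTag here is a hypothetical stub that only builds a bare tag; the real sub would look up the dimensions, and the sample text is an assumption:

```perl
use strict;
use warnings;

# Hypothetical stub; the real sub would add height and width attributes.
sub makeImageTag {
    my ($file) = @_;
    return qq{<img src="$file">};
}

my $body = "Intro text\ni howdy.gif\nMore text\ni foo.jpg\n";

# /m lets ^ match after each embedded newline, /g repeats for every such line,
# and /e evaluates the replacement as code. The . never matches a newline,
# so the line endings are left in place.
$body =~ s/^i (.*)/makeImageTag($1)/egm;

print $body;
```

Because the newlines are never part of the match, there is nothing to put back afterwards, which the original \n...\n pattern would have needed.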

    You had ([^\n]*?)\n, and that really could have been ([^\n]*)\n, because those match the same way -- there's going to be a certain number of non-newlines before the next newline, and [^\n]*\n and [^\n]*?\n will match the same thing. Also, [^\n] can just be ., which saves a bit of typing. And you really don't need to remove the newline, do you?
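To check that equivalence on one input (the sample string is an assumption), the three spellings can be compared directly:

```perl
use strict;
use warnings;

my $line = "i foo.gif\nnext line\n";

# Non-greedy, greedy, and dot versions of the same capture.
my ($lazy)   = $line =~ /i ([^\n]*?)\n/;
my ($greedy) = $line =~ /i ([^\n]*)\n/;
my ($dot)    = $line =~ /i (.*)\n/;

print "$lazy | $greedy | $dot\n";    # foo.gif | foo.gif | foo.gif
```

All three stop at the first newline, because neither [^\n] nor a plain . (without /s) can consume one.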

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Warnings Are Good! Plus A Question about $1
by trs80 (Priest) on Apr 07, 2002 at 19:33 UTC
      I've been meaning to say, thank you all for your input.

      I quite understand that the regex isn't perfect, and I'll attend to that.

      I can't use Image::Size because it isn't installed, but I read the documentation for Image::Magick and used its quicker "ping" method, which gets basic info about graphics without having to open and read them.
      --

      ($_='jjjuuusssttt annootthhrer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;