Re: Need explanation: what \$_ and ${} do
by Aristotle (Chancellor) on Nov 11, 2005 at 15:05 UTC
${} is a syntactic construct that dereferences a scalar reference. Its inside can be any arbitrarily complex expression. \ is used to make references, so \$_ returns a reference to $_. See perlreftut.
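A minimal sketch of the two constructs on their own, in case it helps. The variable names ($text, $ref, $upper) are made up for illustration and do not come from the original code.

```perl
use strict;
use warnings;

my $text = 'hello';
my $ref  = \$text;          # \ makes a reference to the scalar $text

print "${$ref}\n";          # ${ ... } dereferences it: prints "hello"

${$ref} = 'goodbye';        # assigning through the reference changes $text
print "$text\n";            # prints "goodbye"

# The inside of ${ ... } may be a larger expression, as long as
# its final value is a scalar reference.
my $upper = ${ \ uc $text };
print "$upper\n";           # prints "GOODBYE"
```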
So in your code, $1 is assigned to $_ (because $1 cannot be modified but $_ can), then s|\s*\=\s*|=|gsi removes the spaces around any equals signs in $_, and finally a reference to $_ is returned which immediately gets dereferenced to insert its value.
This is quite obfuscated, particularly due to the massively annoying choice of | as the regex delimiter. I wonder why the original programmer used /e at all. Doing something like ${ $var = do_something(), \$var } is a common trick to embed code in contexts where it would not otherwise be possible – for example, you can use this to put short snippets of code into heredocs. But with /e, this is completely unnecessary; you can simply write the above code like so:
$x =~ s{ < ([^>]*) > }{ $_ = $1; s{ \s* \= \s* }{=}gsix; "<$_>" }gisex;
Note that I threw in an /x in both cases, so that I could use whitespace to make the patterns more readable. See perlretut (particularly, Building a regexp).
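For the heredoc case mentioned above, here is a rough sketch. It uses the shorter ${ \ ... } variant of the trick (take a reference to the result, dereference it on the spot); @items and $report are invented names for the example.

```perl
use strict;
use warnings;

my @items = (3, 1, 2);

# Heredocs interpolate variables but not function calls, so each
# expression is wrapped in ${ \ ... }: the backslash takes a reference
# to the result and ${ } immediately dereferences it.
my $report = <<"END";
Sorted: ${ \ join(', ', sort @items) }
Count:  ${ \ scalar @items }
END

print $report;
```

Outside of string-like contexts such as this one, plain code (or /e in a substitution) does the same job without the detour through a reference.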
Makeshifts last the longest.
Re: Need explanation: what \$_ and ${} do
by polettix (Vicar) on Nov 11, 2005 at 15:27 UTC
The outer pattern is:
<([^>]*)>
which means "the content of every tag", i.e. act aid = "s" in your example. Thanks to the parentheses, whatever matches is captured in $1. Moreover, thanks to the gsi modifiers, the substitution is applied throughout the whole string (g), ignoring case (i) and treating newlines as any other character (s).
The interesting part comes with the "e" modifier. This tells Perl that the "replacement" part of the substitution is an expression, not a plain string. The expression is the following:
'<' . ${($_=$1)=~s|\s*\=\s*|=|gsi,\$_} . '>'
that is the concatenation of three parts: the enclosing angle brackets and the "content". When you use the ${ EXPRESSION } construct, you are de-referencing a reference to a scalar, so:
- you have to take care that EXPRESSION evaluates to a reference to a scalar, and
- you can do whatever you want before this "return value"
EXPRESSION is:
($_ = $1) =~ s|\s*\=\s*|=|gsi , \$_
that is, a comma-separated sequence of sub-expressions. The last one is what gets "returned", and you can see that it is a reference to a scalar (namely \$_, a reference to $_), which is what we want (see the first bullet). The first sub-expression is an assignment followed by yet another substitution, using the | character as a delimiter.
The assignment is necessary because $1 is read-only. So you use the $_ variable as a temporary copy, which you can modify via the substitution.
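To see the comma sequence at work outside of the substitution, a small sketch follows. It uses a lexical $tmp in place of $_ and a literal string in place of $1; both are stand-ins chosen for the example.

```perl
use strict;
use warnings;

my $tmp;

# In scalar context the comma operator evaluates its left operand for
# the side effect only, then yields its right operand.
my $ref = ( ($tmp = 'act aid = "s"') =~ s/\s*=\s*/=/g, \$tmp );

print ref($ref), "\n";   # SCALAR
print "${$ref}\n";       # act aid="s"

# The same sequence inside ${ ... }, as in the original replacement:
my $out = '<' . ${ ($tmp = 'act aid = "s"') =~ s/\s*=\s*/=/g, \$tmp } . '>';
print "$out\n";          # <act aid="s">
```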
The substitution can be rewritten as:
s{\s*\=\s*}{=}gsi
which means that any "sequence of zero or more whitespace characters, followed by an equals sign, followed by a sequence of zero or more whitespace characters" is replaced by "an equals sign". Note, again, the use of the g modifier (which applies the substitution to the entire string) and of the s modifier ("treat newline as any other char"). The i modifier is not needed in this case, but it doesn't hurt.
Why use the ${ ... , \$_ } trick, then? Because it lets you embed code to define the "contents" part without the need to have separate statements (I guess).
Flavio
perl -ple'$_=reverse' <<<ti.xittelop@oivalf
Don't fool yourself.
Re: Need explanation: what \$_ and ${} do
by blazar (Canon) on Nov 11, 2005 at 15:13 UTC
Need explanation
me too!
Could anyone explain what is going on in the following code.
$x = '<act aid = "s">'; $x=~s/<([^>]*)>/'<'.${($_=$1)=~s|\s*\=\s*|=|gsi,\$_}.'>'/egsi; print $x;
The above code is used to remove the spaces before and after equal sign.
I guess it does something more, for in that case
s/\s+=\s+/=/; # would suffice!
Whatever, it is a substitution whose replacement part executes code that performs a further substitution. Hence it's probably not the cleanest way to do what it does (but indeed a concise one). Note that the inner part does essentially what I do above...
I could not understand what that \$_ and ${} does
\$_ takes a reference to the scalar $_. ${} dereferences it so that its value can be interpolated. perldoc perlref will explain all this to you much better than I ever could!
Re: Need explanation: what \$_ and ${} do
by tilly (Archbishop) on Nov 11, 2005 at 18:51 UTC
In case the other responses have not made it clear, this is bad code.
In addition to the convolutions that others have pointed out, note that $_ is assigned to without localizing it first. Unfortunately, $_ is a global variable that is used by default in many places and shared across all packages. This means that if this piece of code is in a function that is called within a for loop or a map, you will wipe out a bunch of data and not know why it happened.
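A sketch of how that failure can show up, and one way around it. The sub names (tidy_clobbering, tidy_safe) and the sample arrays are hypothetical; only the substitution itself is taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper that, like the code in question, assigns to the
# global $_ without localizing it.
sub tidy_clobbering {
    my ($tag) = @_;
    $tag =~ s/<([^>]*)>/'<'.${($_=$1)=~s|\s*=\s*|=|g,\$_}.'>'/eg;
    return $tag;
}

# Same job with a lexical copy, leaving $_ alone.
sub tidy_safe {
    my ($tag) = @_;
    $tag =~ s{<([^>]*)>}{ (my $attrs = $1) =~ s/\s*=\s*/=/g; "<$attrs>" }eg;
    return $tag;
}

my @names = ('first', 'second');
for (@names) {                       # $_ is aliased to each element
    tidy_clobbering('<act aid = "s">');
}
print "@names\n";   # elements overwritten: act aid="s" act aid="s"

my @kept = ('first', 'second');
for (@kept) {
    tidy_safe('<act aid = "s">');
}
print "@kept\n";    # first second
```

Writing local $_ = $1; inside the helper would be the other usual remedy; the lexical copy is shown here because it sidesteps $_ entirely.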
Re: Need explanation: what \$_ and ${} do
by chester (Hermit) on Nov 11, 2005 at 15:21 UTC
Small side note: yes, you shouldn't use regexes to parse XML. However, in a case such as $x = '<act aid = "s">';, where it's not valid XML, you must clean it up before you can use a real XML parser to do anything with it.
No, you can’t, unless all of your tags have only one attribute. The substitution you give will remove the spaces around the first equals sign within the tag, but will skip any other equals signs.
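To illustrate the difference on a tag with two attributes, here is a sketch. The single-pass pattern below is a hypothetical stand-in written for this example, not the code from the post being replied to, and the two-attribute tag is invented as well.

```perl
use strict;
use warnings;

my $tag = '<act aid = "s" bid = "t">';

# Hypothetical single-pass attempt: one match per tag, so only the
# first equals sign inside the tag gets tidied.
(my $one_pass = $tag) =~ s/<([^>=]*?)\s*=\s*([^>]*)>/<$1=$2>/g;
print "$one_pass\n";   # <act aid="s" bid = "t">

# Nested substitution: the inner s///g visits every equals sign
# inside the captured tag content.
(my $nested = $tag) =~ s{<([^>]*)>}{ (my $attrs = $1) =~ s/\s*=\s*/=/g; "<$attrs>" }eg;
print "$nested\n";     # <act aid="s" bid="t">
```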
Makeshifts last the longest.
Why doesn't the "s" modifier do anything?
Flavio
perl -ple'$_=reverse' <<<ti.xittelop@oivalf
Don't fool yourself.