reasonablekeith has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I'm looking for a better way to do the following...

my $name = "Rob J Anderson";
my $squished_name = $name; # this is the line i don't like.
$squished_name =~ s/\W//g;

Now this is a trivial example, and the actual regex doesn't matter. The point here is that I often find myself doing a substitution like this with a regex, and I don't like having to create a copy of the variable first, because I'm forced to initially assign it a value that clearly isn't squished. In fact, 'don't like' isn't close: I really hate bad variable names.

So what I'm looking for is an idiom for performing a translation on a variable, leaving the original variable intact, and returning the translated value.

Thanks, Rob

Replies are listed 'Best First'.
Re: Idiom for a regex translation
by borisz (Canon) on Apr 22, 2005 at 09:18 UTC
    ( my $squished_name = $name ) =~ s/\W//g;
    Boris
      A slight variation could be:
      my $name = my $squished_name = 'Rob J Anderson';
      $squished_name =~ s/\W//g;
Re: Idiom for a regex translation
by hv (Prior) on Apr 22, 2005 at 09:36 UTC

    I think there are two main options: squish the assignment and modification together, or make a subroutine. For clarity I'd usually recommend the latter:

    # either: copy and modify in one statement
    (my $squished_name = $name) =~ s/\W//g;

    # or: hide the copy in a subroutine
    my $squished_name = squish($name);

    Hugo

      But the subroutine just moves the problem somewhere else. Also, I think a single regex is too simple to warrant a subroutine of its own.

      sub squish {
          my $unsquished = shift;
          # .. what do you put here?
          return $squished;
      }

        sub squish { my $copy = shift; $copy =~ s/\W//g; return $copy; }

        It is not the action (the substitution) that warrants the subroutine, but the concept ("squish"). It is never the simplicity of the action that determines whether the concept is "too simple to warrant a subroutine of its own" - for example, it would be useful to do this if you were using squish() in several places in the code but might want to change it in the future, or add diagnostics etc.

        But even if it will only ever be used once, putting a block of code into a named subroutine is a very useful technique for writing self-documenting code, by associating the relevant concept with the code through the name of the subroutine.
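
        A minimal sketch of that point, assuming squish() ends up being called from more than one place (the second call site and its sample value are hypothetical, added only for illustration):

```perl
use strict;
use warnings;

# The name documents the concept; the body can change later
# (or gain diagnostics) without touching any caller.
sub squish {
    my $copy = shift;    # work on a copy so the caller's value is untouched
    $copy =~ s/\W//g;    # remove every non-word character
    return $copy;
}

my $name          = "Rob J Anderson";
my $squished_name = squish($name);

# Hypothetical second call site, for illustration only.
my $squished_city = squish("St. John's");

print "$name -> $squished_name\n";
print "$squished_city\n";
```

        Each caller reads as "squish this", and $name still holds its original value afterwards.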

        Hugo

        sub squish { ( my $squished = shift ) =~ s/\W+//g; return $squished; }
        But I also would not use a separate sub for this one.

        the lowliest monk

Re: Idiom for a regex translation
by grinder (Bishop) on Apr 22, 2005 at 10:07 UTC

    Other replies have shown you how to do what you want; I'll just (try to attempt to) explain why it works the way it does.

    You can't really say "Show me the hypothetical modification of this string if I substituted foo for bar". It doesn't work that way. s/// needs to operate on a string. In the process of carrying out what you ask it to do, it destroys the original contents.

    If you want to keep the original contents around, you must take a copy of it, and modify the copy. You can shorten the lifetime of the copy down to a very small point in your code by enclosing its creation and use in a { ... } lexical scope.
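
    As a rough sketch of that scoping idea, a do block is one way to keep the copy alive only for the duration of the substitution (the variable names follow the original question; $copy is a hypothetical name):

```perl
use strict;
use warnings;

my $name = "Rob J Anderson";

# $copy exists only inside the do block; the value of the block's
# last expression is what gets assigned to $squished_name.
my $squished_name = do {
    my $copy = $name;    # take the copy
    $copy =~ s/\W//g;    # modify the copy, not $name
    $copy;               # hand the result out of the block
};

print "$name\n";             # original is unchanged
print "$squished_name\n";
```

    On Perl 5.14 or later, the /r modifier does this directly: my $squished_name = $name =~ s/\W//gr; returns the modified copy and leaves $name untouched. That modifier did not exist when this thread was written.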

    - another intruder with the mooring in the heart of the Perl

Re: Idiom for a regex translation
by polettix (Vicar) on Apr 22, 2005 at 10:29 UTC
    I believed I was alone in the world of paranoid thoughts (1), but you seem to surpass me.
    ... I'm forced to initially assign it with a value that clearly isn't squished. In fact, 'don't like' isn't close, I really hate bad variable names.
    I agree with you on the general idea (calling the variable $another_form_for_name would be bad), but not on its application in this particular case: you're adjusting the value on the very next line, so I don't think your assignment makes the code any harder to understand. In the end, using good variable names is all about readability and understandability, isn't it?

    You are probably happy with the previous answers, but perhaps you are still uneasy simply because your variable spends a few instants holding data that do not match the semantics you assign to it. If this is the case (skip the following if it's not, of course), the underlying problem seems to be that you want to assign a static, fixed meaning to the variable, declared by its name. That comes close to a contradiction in terms: it is a variable, and variables are meant to evolve during their life. You can stick with this:

    my $name = "Rob J Anderson";
    (my $tmp_for_squishing = $name) =~ s/\W//g;
    my $squished_name = $tmp_for_squishing;

    just to be sure that $squished_name won't ever contain a non-squished value, but does it add value to the code or does it make the code harder to read?

    The general answer I received to my paranoid post can be boiled down to this: as long as the program is readable, maintainable, correct and does its job in the correct time... don't waste time on these time-consuming issues!

    (1) See "Testing at the right granularity" and "Writing general code: real world example - and doubts!" for a few examples of my level of paranoia.

    Flavio (perl -e 'print(scalar(reverse("\nti.xittelop\@oivalf")))')

    Don't fool yourself.
Re: Idiom for a regex translation
by Anonymous Monk on Apr 22, 2005 at 10:45 UTC
    use Regexp::Common 'pattern';
    pattern name => ['squish'], create => '\W+';

    my $name = "Rob J Anderson";
    my $squished_name;
    $squished_name = $RE{squish}->subs($name);
    # Following only needed because of /g:
    $squished_name = $RE{squish}->subs($squished_name)
        while $RE{squish}->matches($squished_name);

      I tried to run this, but it didn't do anything (the string was unmodified). A quick read later, and I came up with this...

      use Regexp::Common 'pattern';
      pattern name => ['squish'],
              create => '.',
              subs => sub { $_[1] =~ s/\W+//g; };

      my $name = "Rob J Anderson";
      my $squished_name = $RE{squish}->subs($name);
      print "name :$name\n";
      print "squished name :$squished_name\n";

      __OUTPUT__
      name :Rob J Anderson
      squished name :RobJAnderson

      Which fulfils all my requirements. I'm not sure about the 'create' parameter though. It seemed mandatory, but I can't see that it gets used in this context.

      Anyway, thanks for all the responses.

Re: Idiom for a regex translation
by ambrus (Abbot) on Apr 22, 2005 at 18:22 UTC

    If you find it ugly, hide it in a subroutine.

    In this case, do something like

    sub replace (&$) { local $_ = $_[1]; &{$_[0]}(); $_; }
    which you can later use as
    $newstr = replace { s/k/w/ } $oldstr;
    which does not change $oldstr.
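
    Put together as a complete script, a sketch of that usage might look like this (the sample string is made up; note that the sub has to be declared before the call so that Perl has seen the (&$) prototype when it parses the block):

```perl
use strict;
use warnings;

# Prototype (&$): the first argument is a block, the second a scalar.
sub replace (&$) {
    local $_ = $_[1];    # copy the string into a localized $_
    $_[0]->();           # run the block, which edits $_
    return $_;           # return the edited copy
}

my $oldstr = "kick";
my $newstr = replace { s/k/w/ } $oldstr;

print "$oldstr\n";    # kick (unchanged)
print "$newstr\n";    # wick
```

    Because the block only ever sees the localized copy in $_, any in-place operation (s///, tr///, chomp) can be used this way without touching the original.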

    Also, you can find similar functionality in the Sed CPAN module.