cghost23 has asked for the wisdom of the Perl Monks concerning the following question:

I am writing code that goes through a source file and replaces all variable names with the output of a subroutine called rvg(). rvg() returns random output, and everything works fine, but I would like the code to replace every instance of the same variable name with the same rvg() output, and that doesn't seem to work. The relevant code is:
$/ = "";
while (<FILE>) {
    s/(?<!\\)(\$\w+(?![\[{\w]))/rvg()/ge;
    print;
}
This works great on something like "$v1 $v2", but if FILE contains something like "$v1 $v1", it gets replaced with something like "$ghFjgjf $ldkSmd" instead of "$ghFjgjf $ghFjgjf", which is what I had hoped for. ($/ is set to "" so it only takes one pass to get through FILE, and I used /g on the regex so all instances of the matched pattern are substituted, to no avail.)

Re: loop/substitution problem
by japhy (Canon) on Apr 29, 2001 at 01:13 UTC
    Use a cache-hash.
    s/(?<!\\)(\$\w+)(?![[{\w])/$rep{$1} ||= rvg()/ge;
    That way, "$foo" always gets replaced by the same text.
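
    A minimal runnable sketch of the cache-hash idea, for anyone who wants to try it in isolation. The rvg() below is a hypothetical stand-in (the original poster's version was not shown), and rename_vars() is just an illustrative wrapper name. Because %rep lives outside the substitution, the mapping also survives across separate reads from the filehandle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the poster's rvg(): returns a random
# eight-letter scalar name, sigil included.
sub rvg {
    my @chars = ( 'a' .. 'z', 'A' .. 'Z' );
    return '$' . join '', map { $chars[ rand @chars ] } 1 .. 8;
}

my %rep;    # original name => replacement, shared by every call below

# Rename each scalar in $text. rvg() is only called the first time a
# given name is seen; later occurrences reuse the cached replacement.
sub rename_vars {
    my ($text) = @_;
    $text =~ s/(?<!\\)(\$\w+)(?![\[{\w])/$rep{$1} ||= rvg()/ge;
    return $text;
}

print rename_vars('$v1 $v2 $v1'), "\n";
```

    The first and third names printed should be identical. Note that ||= relies on rvg() never returning a false value such as "" or "0".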

    japhy -- Perl and Regex Hacker
Re: loop/substitution problem
by Corion (Patriarch) on Apr 29, 2001 at 01:12 UTC

    Please just note that your attempt to obfuscate your source code is mostly futile and will most likely just annoy anybody wishing to modify your software. If you want to prevent people from modifying it, say so in your contract. The Perl debugger and/or the B::Deparse module will make short work of your obfuscated source code.

    Having said that, the academic challenge of your question is still nice. Let me restate your problem: you want to filter out the names of all used variables (within some limits, since you don't want to stomp on variables exported by other modules or on Perl internal variables like $/), and change their names into something else. Note that if you use modules, this problem can be very hard to solve short of writing a complete Perl parser.

    As you are using strict (and if not, you should start using strict), all you have to do is look for lines containing the my keyword, take note of the variable name, come up with a better name for it, and then substitute the new name everywhere.

    You already have a regular expression, which is not perfect but let's assume that we want to solve the first step, collecting all our own variable names, with a regular expression.

    while ( /\G.*?my\s*([\$\@%]\w+(?:\s*,\s*[\$\@%]\w+)*)/g ) { print $1, "\n"; }

    This regular expression is nowhere near perfect, as it only matches variables declared like my $a,$b,@c, but not variables declared like my ($a,$b,@c) = @_. I leave the generalisation to you; it's not really hard, given what you have already. In fact, even then the method has its limits, as it will be hard to detect text that looks like a variable name but sits inside a single-quoted string, where no variable interpolation takes place.

    After you have collected the names of all variables used, you come up with new names and keep the correspondence in a hash.

    Now, all there is left to do is to go over your source code once again and substitute every occurrence of any original variable name with the new variable name.
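
    The two passes described above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: it handles scalars only, the sub names declared_scalars() and rename_declared() and the $x000-style naming scheme are made up for the example (you would plug in rvg() instead), and it shares the limits already mentioned, such as names inside single-quoted strings. It also does not check whether a generated name collides with a name already in the source.

```perl
use strict;
use warnings;

# Pass 1: collect scalars declared as "my $x", "my $x, $y" or "my ($x, $y)".
sub declared_scalars {
    my ($src) = @_;
    my %seen;
    while ( $src =~ /(?<![\$\@%\w])my\b\s*\(?\s*([\$\@%]\w+(?:\s*,\s*[\$\@%]\w+)*)/g ) {
        # Keep only the scalars from the comma-separated declaration list.
        $seen{$_} = 1 for grep { /^\$/ } split /\s*,\s*/, $1;
    }
    return sort keys %seen;
}

# Pass 2: build the old => new hash, then substitute every occurrence.
# Names that were never declared with my (such as $/) are left alone.
sub rename_declared {
    my ($src) = @_;
    my %map;
    my $n = 0;
    $map{$_} = sprintf( '$x%03d', $n++ ) for declared_scalars($src);
    $src =~ s/(?<!\\)(\$\w+)(?![\[{\w])/exists $map{$1} ? $map{$1} : $1/ge;
    return $src;
}

my $source = 'my ($count, $name) = @_; my $total = $count + 1; print "$name $total $/";';
print rename_declared($source), "\n";
```

    Keeping the collection step separate from the substitution step is what lets undeclared and built-in variables pass through untouched.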

Re: loop/substitution problem
by dvergin (Monsignor) on Apr 29, 2001 at 01:03 UTC
    Put the call to your subroutine outside the loop:
    $/ = "";
    my $randstr = rvg();
    while (<FILE>) {
        s/(?<!\\)(\$\w+(?![\[{\w]))/$randstr/g;
        print;
    }
Re: loop/substitution problem
by mirod (Canon) on Apr 29, 2001 at 11:02 UTC

    I agree with Corion and with a lot of others here: trying to obfuscate your source code is futile. There are laws to protect your IP, use them instead of resorting to that kind of silly trick:

    • How can I hide the source for my Perl program? (from the FAQ)
    • Protecting Perl Code
    • Obfuscate my perl code

    Anyway, I thought your problem was a good fit for Dominus' Memoize, so here it is:

      #!/bin/perl -w
      use strict;
      use Memoize;

      memoize('rvg');    # now all calls to rvg are memoized

      my @rnd_names = qw( foo bar baz );

      $/ = "";
      while (<DATA>) {
          # need to add $1 for memoize to know which calls to memoize
          s/(?<!\\)(\$\w+(?![\[{\w]))/rvg($1)/ge;
          print;
      }

      sub rvg { return shift @rnd_names; }

      __DATA__
      $v1= $v2;
      $v2++;
      $v3{$v1}= $v2;

      Note that the underlying mechanism is essentially the same as in japhy's solution, Memoize just handles it for you.
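
      To make that equivalence concrete, here is a hand-rolled sketch of the kind of wrapper Memoize provides. simple_memoize() is a made-up name for illustration, not part of the module, and the real Memoize does considerably more (list context, custom normalizers, and so on); this only shows the hash-keyed-on-arguments idea for scalar-returning functions.

```perl
use strict;
use warnings;

# Wrap a function so that repeated calls with the same arguments
# return the value computed the first time.
sub simple_memoize {
    my ($func) = @_;
    my %cache;
    return sub {
        my $key = join chr(28), @_;    # one cache key per argument list
        $cache{$key} = $func->(@_) unless exists $cache{$key};
        return $cache{$key};
    };
}

my @rnd_names = qw( foo bar baz );
my $rvg = simple_memoize( sub { return shift @rnd_names } );

my $text = '$v1= $v2; $v2++; $v3{$v1}= $v2;';
$text =~ s/(?<!\\)(\$\w+(?![\[{\w]))/$rvg->($1)/ge;
print "$text\n";
```

      The %cache hash inside the closure plays the same role as %rep in japhy's one-liner: the wrapped function only runs the first time a given name is seen.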