gryphon has asked for the wisdom of the Perl Monks concerning the following question:

Greeting fellow monks,

I feel rather embarrassed asking this question, but here goes anyway. (It's an optimization plus question.) I've recently written a little thingy that essentially lets the novice/newbie change strings within several files in a directory tree. It's written so that it'll run on both UNIX (via command line) and Win32 (via icon double click) systems.

It works, mostly. Biggest problem is that I tried to use the thing on UNIX to remove those pesky "\r"s that Windows puts in text files. Well, it ended up dumping all "\n"s as well. :(

My question/plea for assistance is twofold: First, can someone help me figure out why I can't substitute "\n" for "\r\n"? Second, and the bigger issue, several parts of this code are really newbie and inefficient (both in run speed and number of lines). Can anyone suggest how I might reduce some of the lines here? Thanks.

#!/usr/bin/perl
use strict;
use Cwd;
use File::Find;

my $changefrom;
while ($changefrom eq '') {
    print "Change (Pattern): ";
    $changefrom = <>;
    chomp($changefrom);
}
print "To... (Pattern): ";
my $changeto = <>;
chomp($changeto);
print "Match files [.]: ";
my $match_files = <>;
chomp($match_files);
$match_files = '.' if ($match_files eq '');
my $cwd = cwd();
print "Directory [$cwd]: ";
my $startdir = <>;
chomp($startdir);
$startdir = $cwd if ($startdir eq '');
print "Case Sens (y/n) [n]: ";
my $case = <>;
chomp($case);

my $change_files_count = 0;
my $change_times_count = 0;
print "Starting search & changes...\n";
&find(\&wanted, $cwd);
if ($change_files_count == 0) {
    print "Done! Didn't find any files to change.\n";
} else {
    print "Done! Made $change_times_count changes ",
          "in $change_files_count files.\n";
}
if (($^O =~ /MS/i) && ($^O =~ /Win/i)) {
    print "\nPress return to exit. ";
    my $ending = <>;
}
exit;

sub wanted {
    if ((-f) && ($_ =~ /$match_files/)) {
        open(DEFILE, $_) || print "Can't open file: $!\n";
        my @file_contents = <DEFILE>;
        close(DEFILE);
        my $happens;
        if (($case eq "y") || ($case eq "Y")) {
            $happens = grep(/$changefrom/, @file_contents);
        } else {
            $happens = grep(/$changefrom/i, @file_contents);
        }
        if ($happens > 0) {
            print "Changing \"$File::Find::name\" file...\n";
            my $file_contents = join('', @file_contents);
            if (($case eq "y") || ($case eq "Y")) {
                $file_contents =~ s/$changefrom/$changeto/g;
            } else {
                $file_contents =~ s/$changefrom/$changeto/gi;
            }
            open(DAFILE, "> $_") || print "Can't write to file: $!\n";
            print DAFILE $file_contents;
            close(DAFILE);
            $change_times_count += $happens;
            $change_files_count++;
        }
    }
}

Thanks in advance. :) Keep in mind that I need this to run on both UNIX and Win32, and it needs to be very user friendly. As in, the most basic novice could figure it out.

-Gryphon.

Re: Change utility; code optimization
by zigster (Hermit) on Feb 09, 2001 at 23:10 UTC
    perl -n -e 's/\r$//;print "$_";' flange.txt > op

    works just fine for removing the ^M from flange.txt. I could not spot where in your code you were trying to do the ^M translation. As for how to improve it generally, I will get back to you on Monday; sorry, but it is my home time now.
    HTH (a bit)
    --

    Zigster
      How about:
      perl -ne 'tr/\r//d;print;' flange.txt > op

      --
      Me spell chucker work grate. Need grandma chicken.

        perl -pi.bak -e 'y/\r//d' flange.txt

        Generally, when you see -n and ;print' in a one liner, think -p and hug a camel. =)
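
        For anyone who wants the same effect inside a script rather than on the command line, here is a minimal sketch. The name strip_cr is a hypothetical helper, not something from the posts above; it just applies the same tr/\r//d deletion to a copy of a string.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every carriage return from a copy of the text,
# the same operation the tr/\r//d and y/\r//d one-liners perform per line.
sub strip_cr {
    my ($text) = @_;
    (my $copy = $text) =~ tr/\r//d;
    return $copy;
}

# Windows-style line endings in, UNIX-style line endings out.
print strip_cr("line one\r\nline two\r\n");
```

        Working on a copy keeps the caller's string untouched, which is a design choice of this sketch rather than a requirement.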

        --
        $you = new YOU;
        honk() if $you->love(perl)

Re: Change utility; code optimization
by coolmichael (Deacon) on Feb 10, 2001 at 11:18 UTC
    I can't offer any explanation of why it isn't working for \r\n, but I can offer some suggestions:
    undef $/; $file_contents = <DEFILE>;
    will slurp the whole file into $file_contents, so you don't need to worry about using join.
    print "Changing \"$File::Find::name\" file...\n" if ($happens = ($file_contents =~ s/$changefrom/$changeto/g));
    should make the substitution and return the number of times it happened, all at once. The message only prints if $happens isn't zero. It's a little hard to read, but you get used to it quickly. That way you run only one regex over the data rather than two. (Add the /i modifier for case-insensitive matching.)
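
    Putting the slurp and the counting substitution together might look like the sketch below. The name replace_in_file and its arguments are hypothetical, and it uses local $/ inside a do block instead of a bare undef $/ (my assumption about what is safer inside a File::Find callback, since the old value of $/ comes back automatically).

```perl
use strict;
use warnings;

# Hypothetical helper: slurp a file, run one substitution over it, and
# rewrite the file only if something changed. Returns the number of changes.
sub replace_in_file {
    my ($path, $from, $to) = @_;

    open(my $in, '<', $path) or die "Can't open $path: $!";
    my $contents = do { local $/; <$in> };    # slurp; $/ is restored afterwards
    close($in);
    $contents = '' unless defined $contents;

    # s///g returns the number of substitutions made, or false if none.
    my $count = ($contents =~ s/$from/$to/g);
    return 0 unless $count;

    open(my $out, '>', $path) or die "Can't write $path: $!";
    print {$out} $contents;
    close($out) or die "Can't close $path: $!";
    return $count;
}
```

    Note that this counts individual substitutions, whereas the grep in the original script counts matching lines, so the totals can differ when a line matches more than once.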

    I don't know whether if ($case =~ m/^[yY]$/) would be faster than the two equality tests, but it is easier to read (at least for me, anyway. YMMV).
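
    One way to avoid repeating the case test at every match is to decide once and compile the pattern with qr//. This is only a sketch: compile_pattern is a hypothetical name, and accepting anything that starts with "y" (so "yes" works too) is my assumption, not the original script's behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: compile the user's pattern once, honoring the
# answer to the "Case Sens (y/n)" prompt.
sub compile_pattern {
    my ($pattern, $case_answer) = @_;
    my $case_sensitive = ($case_answer =~ /^y/i);    # "y", "Y", "yes", ...
    return $case_sensitive ? qr/$pattern/ : qr/$pattern/i;
}

my $re = compile_pattern('hello', 'n');
print "matched\n" if "Say HELLO" =~ $re;
```

    The compiled $re can then be used in both the match and the substitution, so the if/else on $case appears only once.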

Re: Change utility; code optimization
by gryphon (Abbot) on Feb 12, 2001 at 22:25 UTC

    Greetings all,

    I've discovered something interesting. When running the above, I can enter "\n" in the change from field and the script will correctly interpret it to mean a newline character. When I enter "\n" in the change to field, the script incorrectly interprets it to mean literally "\n" rather than a newline.

    Looking at my code, I have a line near the end of the script:
    $file_contents =~ s/$changefrom/$changeto/gi;

    Can anyone explain why $changefrom is interpreted, but $changeto is not? Thanks.

    -Gryphon.

      This is a good question. The reason is that, on the regex side, the two character sequence "\n" is a metacharacter that means "match a literal newline".

      After the value of $changefrom is interpolated, the regular expression engine compiles the regex, sees "\n", and compiles it to match a newline. (If $changefrom instead contained a literal newline, the result would be the same, because a literal newline in a regex matches a literal newline.)

      On the replacement side, however, once the value of $changeto is interpolated, there is no second pass over the string to turn "\n" into a newline.
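
      A small test makes the asymmetry visible. The values below are assumptions standing in for what a user would type at the two prompts; single quotes keep the backslashes literal, just as input read from <> would.

```perl
use strict;
use warnings;

my $changefrom = '\r\n';    # four characters, as typed at the prompt
my $changeto   = '\n';      # two characters: a backslash and an "n"

my $text = "one\r\ntwo\r\n";
(my $result = $text) =~ s/$changefrom/$changeto/g;

# The pattern matched the real CR LF pairs, but the replacement inserted
# a literal backslash followed by "n" instead of a newline.
print $result eq 'one\ntwo\n'
    ? "replacement was taken literally\n"
    : "unexpected result\n";
```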

      One solution is to code the extra pass over $changeto yourself, as in:

      $changeto =~ s/\\n/\n/g; $file_contents =~ s/$changefrom/$changeto/gi;
      Please note that that simple example will change \\n to \<newline>.

      Of course, you then have to consider which escape sequences you want to allow. \t and \r? How about \040, \x20, or \cD? That's up to you. :)
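
      If you decide to support several escapes, one option is a lookup table applied in a single pass. This is a sketch under my own assumptions: expand_escapes and %escape are hypothetical names, and the table covers only \n, \r, \t and \\. Because each backslash is consumed together with the character after it, \\n comes out as a backslash followed by "n" rather than a backslash followed by a newline.

```perl
use strict;
use warnings;

# Escapes this sketch chooses to support; extend the table as needed.
my %escape = (
    n    => "\n",
    r    => "\r",
    t    => "\t",
    '\\' => '\\',
);

# Hypothetical helper: expand backslash escapes in user input in one pass.
# Unknown escapes such as \q are left exactly as typed.
sub expand_escapes {
    my ($input) = @_;
    $input =~ s/\\(.)/exists $escape{$1} ? $escape{$1} : "\\$1"/ges;
    return $input;
}

my $changeto = expand_escapes('\t');    # as if the user had typed \t
print "got a real tab\n" if $changeto eq "\t";
```

      Only the replacement string needs this treatment; the pattern side is already handled by the regex engine.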

        Greetings chipmunk,

        Thanks for your help. Very good information. I have one follow-up question: Is there a way to allow all escape sequences by default? In other words, to have $changefrom and $changeto interpolated in exactly the same way? That way, a user could run the script and change "\n" to "\t" literally.

        Gryphon.