in reply to regexp for directory

When I run this code, all I get is this warning:
> 674562.pl foo foo in /tmp/*.pl: Use of uninitialized value in pattern match (m//) at 674562.pl line 26 +.

Line 26 is:

if ($this =~ /$bit/) {

Since your documentation is a little unclear, I am not sure what this code should do. My guess is that you want to imitate the Unix grep command, except that you want a count of all occurrences of your regexp rather than a count of the lines on which the regexp occurs. If that is the case, I think $this = do {local $/; <PL>}; if ($this =~ /$bit/) should not be inside the while loop. When you unset the input record separator, $/, a single read slurps the entire file's contents into $this, so there is nothing left for a loop to iterate over.
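If my guess is right, the counting could be sketched roughly like this. The sub name count_occurrences and its two arguments are my own invention for illustration, not something taken from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: count every match of $pattern in one file,
# rather than the number of lines that contain a match.
sub count_occurrences {
    my ($path, $pattern) = @_;

    open my $fh, '<', $path or die "Cannot open $path: $!";

    # Unset the input record separator inside this block only, so that
    # one read returns the whole file. No while loop is involved.
    my $content = do { local $/; <$fh> };
    close $fh;

    # A global match in list context returns all matches; assigning that
    # list to () and then to a scalar yields the number of matches.
    my $count = () = $content =~ /$pattern/g;
    return $count;
}
```

One caveat with this sketch: if the pattern contains capturing parentheses, the list holds the captures rather than the whole matches, so the count can differ from the number of occurrences.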

Update: copy'n'pasted the wrong code. Thanks my_nihilist.

Did you test this code yourself?

I have some other critiques:

Many of these guidelines can be found in the book Perl Best Practices. It is a good investment.

Re^2: regexp for directory
by my_nihilist (Sexton) on Mar 17, 2008 at 16:36 UTC
    I used this on linux and it works exactly as indicated, eg. "./test.p sort" produced:
    sort in /home/me/perl/*.pl:
    6 -- big.II.pl
    4 -- big.pl
    1 -- sortoccurances.pl
    1 -- example3.pl
    1 -- test.pl
    N.B. toolic: "if ($this =~ /$bit/)" never was inside the while loop! I am sure that would be a problem. I am also sure from reading halfcountplus's other posts (ranking by number of occurrences) that slurping the entire file is intentional (how else could this work?). However, if I got an error, I would be suspicious too.

    Perhaps if you replace the "/" with a "\" (line 20)?
Re^2: regexp for directory
by halfcountplus (Hermit) on Mar 17, 2008 at 19:55 UTC
    thanx for your feedback! i still don't understand why it didn't work tho!

    the useful parts

  • leave nothing inside while loop = no while loop!
  • use "die" with open/opendir
  • no need to quote "$dir" in open
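
For the record, a rough sketch of how the last two points might look together. The sub name perl_files_in and the choice to filter on a .pl suffix are assumptions made for this example only:

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names of the .pl files in $dir.
sub perl_files_in {
    my ($dir) = @_;

    # $dir is passed as-is, without quotes; if the directory cannot be
    # opened, die reports the reason through $!.
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @files = grep { /\.pl\z/ } readdir $dh;
    closedir $dh;

    return sort @files;
}
```

Called on a missing directory, this dies with the system error message instead of silently returning nothing.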

    the ignored parts

  • i think my variable names are more meaningful than most, actually. They are distinctly different from one another and they are short. What would you call "%hash", "%associativearraywithcountforeachfile->key"? There is only one hash, and %hash is it. "$this" appears 3 times across 6 lines...it could be "$filecontent" i guess but i use "this" and "that" like tweedledee uses tweedledum. Kind of. "this" and "that"; it's cute ;P

  • whitespace, smightspace. what about the aesthetic value of having the last two lines the same length? Surely that contributes to readability, albeit "in a different sense".

    thanks again -- take care

      You can get meaningful without hyperextraneoverbositude. %matches_in_file or %count_for_file are extremely descriptive without requiring me to read the entire goram piece of code to figure out what exactly is going in %hash. Any decent editor will also let you autocomplete the name after the first one or two times anyhow so the overall length of the name isn't an excuse. And if you're going to be lazy-cutesy using the default subject variable $_ at least has the virtue of possibly shortening your code.

      Absolutely context free names like "this" and "that" just mean the maintenance programmer that follows n months hence is going to curse your crappy style, not praise your brevity and wit.

      Addendum: As to the lack of whitespace in the penultimate line, I'd just say it's people who write stuff like that in production code that give Perl the (somewhat deserved :) reputation for being executable line noise. Without reasonable whitespace you've got to scan back and forth to see where the breaks are (of course Mr. Maintenance programmer probably just learns to run anything you ever wrote through perltidy and tosses the originals away day one . . . ).
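
      To make that concrete, here is a small sketch of the kind of naming and spacing I mean. The sub name format_report and the "count -- file" output format are assumptions for illustration, loosely modelled on the output quoted earlier in the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: turn per-file match counts into report lines,
# highest count first, with ties broken alphabetically by file name.
sub format_report {
    my (%count_for_file) = @_;

    my @files_by_count = sort {
        $count_for_file{$b} <=> $count_for_file{$a}
            or $a cmp $b
    } keys %count_for_file;

    return map { "$count_for_file{$_} -- $_" } @files_by_count;
}
```

      The hash name says what the keys and values are, and the line breaks show where each step starts without any scanning back and forth.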

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        Since no one came to my defense on this and this is a public forum, i've bowed to the public pressure regarding my variable names and use of whitespace. ie,

        No more thisandthat! No more thisandthat! No more thisandthat!