Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to fetch every file that has a "%ff" or "%jj" or "%bb". My regular expression part is not working.
sub rout {
    local *FILE;
    if( $_ =~ /\.html?$/) {
        open ( FILE, $name );
        while($hit = <FILE>) {
            if($hit =~ /(\%?:ff|jj|bb.+)/) {
                print "Hit = $1\n";
            }
        }
        close FILE;
    }
}
find( \&rout, "/disk1/disk2" );

Re: File searching
by broquaint (Abbot) on Jun 20, 2003 at 11:56 UTC
    Your regex seems a little confused; it should probably look like this:
    ## is the .+ necessary after bb?
    if( $hit =~ /%(?:ff|jj|bb)/ )
    ## alternate version
    if( $hit =~ /%([fjb])\1\b/ )
    See perlre for more info on the regex above.
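
    To see why the pattern in the question misses "%ff", it may help to run both patterns side by side. The sketch below is only an illustration: the sample strings are made up, and the variable names $original and $corrected are not from the original post. Without a group around it, the alternation splits the whole pattern into three branches (an optional % followed by ":ff", a bare "jj", or "bb" plus at least one more character), so "%ff" never matches while ":ff" and a bare "jj" do.

```perl
use strict;
use warnings;

# Sample lines are invented for illustration only.
my @lines = ('a %ff b', 'x :ff y', 'plain jj text', 'nothing here');

my $original  = qr/(\%?:ff|jj|bb.+)/;   # the pattern from the question
my $corrected = qr/%(?:ff|jj|bb)/;      # literal %, then a grouped alternation

for my $line (@lines) {
    my $old = $line =~ $original  ? 'match' : 'no match';
    my $new = $line =~ $corrected ? 'match' : 'no match';
    printf "%-16s original: %-9s corrected: %s\n", "'$line'", $old, $new;
}
```

    The (?: ... ) group keeps the three alternatives together after the %, and because it is non-capturing it does not set $1.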

    And being the local File::Find::Rule pimp, I feel obliged to provide a code example:

    use File::Find::Rule;
    my @files = find(
        file => grep => qr/%([fjb])\1\b/,
        in   => '/disk1/disk2'
    );
    That should recursively find all files under /disk1/disk2 whose contents match the regex. See File::Find::Rule for more info.
    HTH

    _________
    broquaint

    update: fixed dodgy regex, thanks go to diotalevi
    update 2: added a corrected version of the OP's regex

      That matches incorrectly. The regex should probably read /%([fjb])\1\b/ so that it captures one of the characters which has to repeat once.
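
      A quick check of the backreference version, as a sketch with made-up test strings (the name $doubled is illustrative). The parentheses capture one of f, j or b, \1 requires that same character again, and \b requires a word boundary right after it, so a string like "%ffx" is rejected even though the plain alternation /%(?:ff|jj|bb)/ would accept it.

```perl
use strict;
use warnings;

# \1 refers back to whatever ([fjb]) captured, so the letter must repeat.
my $doubled = qr/%([fjb])\1\b/;

for my $s ('%ff', '%jj', '%bb', '%fj', '%ffx') {
    printf "%-5s %s\n", $s, ($s =~ $doubled ? 'match' : 'no match');
}
```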

        Thanks for all the help. It works and captures everything, EXCEPT that if I have multiple hits on one line it only captures the first one, and that's it for that line. I need it to capture all hits on the line. For example, the lines below should give me 5 hits after I run the script:

        %ff other%ff
        other words%ff
        %jj %bb otherstuff here


        But my current script only gives me 3 hits because it counts one hit per line. I need it to count every match on every line. Please advise how I can get this to work now?
        sub rout {
            local *FILE;
            if( $_ =~ /\.html?$/) {
                open ( FILE, $name );
                while($hit = <FILE>) {
                    if( $hit =~ /%(?:ff|jj|bb)/ ) {
                        print "Hit = $1\n";
                    }
                }
                close FILE;
            }
        }
        find( \&rout, "/disk1/disk2" );
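
        One possible way to get every hit on a line is the /g modifier, which in list context returns all matches instead of stopping at the first. The sketch below is an assumption about what is wanted, not code from this thread: hits_in_line is a hypothetical helper, and the sample data is the three lines from the example above. It also adds capturing parentheses, since (?: ... ) alone leaves $1 unset.

```perl
use strict;
use warnings;

# Hypothetical helper: return every %ff, %jj or %bb found in one line, in order.
# With /g in list context, the match returns all captured substrings.
sub hits_in_line {
    my ($line) = @_;
    return $line =~ /(%(?:ff|jj|bb))/g;
}

# The three sample lines from the post above.
my @sample = (
    "%ff other%ff\n",
    "other words%ff\n",
    "%jj %bb otherstuff here\n",
);

my $total = 0;
for my $line (@sample) {
    for my $hit (hits_in_line($line)) {
        print "Hit = $hit\n";
        $total++;
    }
}
print "Total hits: $total\n";
```

        Inside the file-reading loop, the same idea can be written as while ( $hit =~ /(%(?:ff|jj|bb))/g ) { print "Hit = $1\n"; } in place of the if.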
      Why introduce new syntax instead of just fixing his regex? He probably only needs to change the regex to /%(?:ff|jj|bb)/. It looks like a simple mechanical mistake.
Re: File searching
by gjb (Vicar) on Jun 20, 2003 at 12:42 UTC

    Maybe it's not relevant, but I can't help noticing that you test $_ to check whether it ends with '.htm' or '.html', so I assume $_ contains the name of the file you want to open. The open statement uses the variable $name though.

    Although it may not be the root of the problem, I'm not sure that's what you want.
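
    For reference, a minimal sketch of how the two variables are usually used with File::Find. By default, find() changes into each directory and sets $_ to the bare file name, while $File::Find::name holds the full path. The sub name scan_tree and the choice to take the directory from the command line are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect "path: hit" strings for every %ff, %jj or %bb
# found in .htm/.html files below $dir.
sub scan_tree {
    my ($dir) = @_;
    my @hits;
    find(sub {
        return unless -f $_ && /\.html?$/;   # $_ is the bare file name
        my $path = $File::Find::name;        # full path, handy for reporting
        # find() has chdir'ed into the file's directory, so opening $_ works.
        open my $fh, '<', $_ or do {
            warn "Cannot open $path: $!\n";
            return;
        };
        while (my $line = <$fh>) {
            push @hits, "$path: $1" while $line =~ /(%(?:ff|jj|bb))/g;
        }
        close $fh;
    }, $dir);
    return @hits;
}

# Run only when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for scan_tree($ARGV[0]);
}
```

    Using a lexical filehandle and the three-argument open also removes the need for local *FILE.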

    Just my 2 cents, -gjb-