Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I would like to pass regex patterns to a subroutine. Obviously, I am doing it incorrectly. I think the literals are being passed to the subroutine instead of the regex. Any suggestions?
#!/usr/bin/perl
use strict;

my @ids = qw( john james );
my $permission = get_data( q{^$_}, @ids );
print $permission;

my $permission = get_data( q{"\\b$_\\b"}, @ids );
print $permission;

sub get_data {

    my ($pattern, @ids) = @_;
    my $permission;

    for (@ids) {
        $permission = qx( grep $pattern /etc/passwd );
    }

    return $permission;
}

Replies are listed 'Best First'.
Re: Passing Regex Pattern in Subroutine
by ikegami (Patriarch) on Aug 16, 2011 at 19:15 UTC

    I think the literals are being passed to the subroutine instead of the regex.

    Say what?

    A literal is a piece of code. It cannot be passed to a subroutine. It can only be parsed and evaluated.

    The string literal «q{^$_}» produces the three-character string «^$_».

    This string is placed in variable $pattern, to be interpolated by another string literal, «qx( grep $pattern /etc/passwd )». When $pattern contains «^$_», this literal produces the string « grep ^$_ /etc/passwd» which is then executed as a shell command.

    Upon receipt of the command « grep ^$_ /etc/passwd», the shell will proceed to interpolate the value of its own $_ variable (not Perl's $_) into the command, producing, in this case, the string « grep ^/bin/sh /etc/passwd». It then proceeds to execute grep with arguments «^/bin/sh» and «/etc/passwd».
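
    To see what actually reaches the shell, it can help to print the command string instead of running it. A minimal sketch, in which nothing is executed and the helper name build_command is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the command string that qx() would hand to the shell.
sub build_command {
    my ($pattern) = @_;
    return "grep $pattern /etc/passwd";
}

my $single = q{^$_};               # single-quoted: no interpolation by Perl
my $cmd    = build_command($single);

print length($single), "\n";       # 3
print "$cmd\n";                    # grep ^$_ /etc/passwd

# Interpolating in Perl before the call gives the shell a finished pattern.
for my $id (qw( john james )) {
    print build_command("^$id"), "\n";
}
```

    The first two lines of output show that Perl never touched the $_ inside q{}; only the loop at the end produces patterns containing the ids.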

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Quote a string so the shell treats it as a single literal word.
    sub text_to_shell_lit(_) {
        return $_[0] if $_[0] =~ /^[a-zA-Z0-9_\-]+\z/;
        my $s = $_[0];
        $s =~ s/'/'\\''/g;
        return "'$s'";
    }

    sub get_data {
        my ($pattern) = @_;
        my $pattern_lit = text_to_shell_lit($pattern);
        return qx( grep $pattern_lit /etc/passwd );
    }

    my @ids = qw( john james );

    {
        # Assumes each of @ids are "safe".
        my $pattern = '^\\(' . join('\\|', @ids) . '\\)';
        my $permission = get_data($pattern);
        print($permission);
    }

    {
        # Assumes each of @ids are "safe" and match /^\w/ and /\w\z/.
        my $pattern = '\b\\(' . join('\\|', @ids) . '\\)\b';
        my $permission = get_data($pattern);
        print($permission);
    }

    Better yet, avoid the shell and add error checking with the use of IPC::System::Simple.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use IPC::System::Simple qw( capturex );

    sub get_data {
        my ($pattern) = @_;
        # capturex bypasses the shell, so the pattern needs no shell quoting.
        return capturex('grep', $pattern, '/etc/passwd');
    }

    my @ids = qw( john james );

    {
        # Assumes each of @ids are "safe".
        my $pattern = '^\\(' . join('\\|', @ids) . '\\)';
        my $permission = get_data($pattern);
        print($permission);
    }

    {
        # Assumes each of @ids are "safe" and match /^\w/ and /\w\z/.
        my $pattern = '\b\\(' . join('\\|', @ids) . '\\)\b';
        my $permission = get_data($pattern);
        print($permission);
    }

    Better still, avoid creating another process entirely.

    #!/usr/bin/perl
    use strict;
    use warnings;

    sub get_etc_passwd {
        open(my $fh, '<', '/etc/passwd')
            or die $!;
        return <$fh>;
    }

    my @ids = qw( john james );
    my @etc_passwd = get_etc_passwd();

    {
        my $pattern = join('|', map quotemeta, @ids);
        my $re = qr/^(?:$pattern)/;
        print(grep /$re/, @etc_passwd);
    }

    {
        my $pattern = join('|', map quotemeta, @ids);
        # Assumes each of @ids match /^\w/ and /\w\z/.
        my $re = qr/\b(?:$pattern)\b/;
        print(grep /$re/, @etc_passwd);
    }

Re: Passing Regex Pattern in Subroutine
by toolic (Bishop) on Aug 16, 2011 at 17:28 UTC
    I think you're looking for something more along these lines:
    use warnings;
    use strict;

    my @ids = qw( john james );

    my $permission = get_data( map { "^$_" } @ids );
    print "$permission\n";

    $permission = get_data( map { q{"\b} . $_ . q{\b"} } @ids );
    print "$permission\n";

    sub get_data {
        my (@patterns) = @_;
        my $permission;
        for (@patterns) {
            $permission .= qx( grep $_ /etc/passwd );
        }
        return $permission;
    }

      Yes, this is what I am looking for. But I don't understand why you had to use 'map' to pass the regular expression.

      Is it because when you run get_data( map { "^$_" } @ids ), it is like doing get_data( "^john" ) and that is being passed to the subroutine?

        I don't understand why you had to use 'map' to pass the regular expression
        You could use map inside your sub, if you prefer.
        Is it because when you run get_data( map { "^$_" } @ids ), it is like doing get_data( "^john" ) and that is being passed to the subroutine?
        Not exactly. The map is like doing:
        get_data( '^john' , '^james' );
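
        One way to check this is to capture what map returns before handing it to the sub. A small sketch using the same @ids (the array names @anchored and @bounded are only for illustration):

```perl
use strict;
use warnings;

my @ids = qw( john james );

# map evaluates the block once per element, with $_ set to that element,
# and returns the list of results.
my @anchored = map { "^$_" } @ids;
my @bounded  = map { q{"\b} . $_ . q{\b"} } @ids;

print "$_\n" for @anchored;   # ^john, then ^james
print "$_\n" for @bounded;    # "\bjohn\b", then "\bjames\b"
```

        So get_data receives two ready-made pattern strings in @_, one per id.
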
Re: Passing Regex Pattern in Subroutine
by Perlbotics (Archbishop) on Aug 16, 2011 at 18:41 UTC

    Bug: Your get_data() returns only the last hit (e.g. for james).
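
    The difference between = and .= inside the loop can be shown without touching the shell. In this sketch, fake_lookup is a made-up stand-in for the qx( grep ... ) call and the passwd lines are invented:

```perl
use strict;
use warnings;

# Hypothetical stand-in for qx( grep ... ): returns one fake passwd line per id.
sub fake_lookup {
    my ($id) = @_;
    return "${id}:x:1000:1000::/home/${id}:/bin/sh\n";
}

my @ids = qw( john james );

my ($last_only, $all) = ('', '');
for my $id (@ids) {
    $last_only  = fake_lookup($id);   # overwritten on every pass
    $all       .= fake_lookup($id);   # appended on every pass
}

print "last only:\n$last_only";
print "all hits:\n$all";
```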

    FWIW, see also:

    • qr{} operator in Regexp-Quote-Like-Operators in perlop to pass a regexp instead of a pattern-string...
    • getpwent - to get rid of grep calls (access /etc/passwd and /etc/shadow directly) - the history of Perl as a sysadmin language provides us with built-in support...
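
    As a rough illustration of the qr{} suggestion, a compiled regex can be passed to a sub like any other scalar. The sketch below filters an in-memory list so that it runs anywhere; the sample lines and the name matching_lines are made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line that matches the compiled regex.
sub matching_lines {
    my ($re, @lines) = @_;
    return grep { $_ =~ $re } @lines;
}

# Made-up sample data standing in for the contents of /etc/passwd.
my @lines = (
    "john:x:1000:1000::/home/john:/bin/bash\n",
    "james:x:1001:1001::/home/james:/bin/sh\n",
    "root:x:0:0:root:/root:/bin/bash\n",
);

for my $id (qw( john james )) {
    my $re   = qr/^\Q$id\E:/;   # anchored at line start, id matched literally
    my @hits = matching_lines($re, @lines);
    print @hits;
}
```

    Because the sub returns a list, every hit is kept rather than only the last one.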

    Maybe it is easier to read /etc/passwd (root: /etc/shadow too) into a hash once and then access the hash by user-id?

    use strict;
    use warnings;
    use Data::Dumper;

    my @items = qw(name passwd uid gid quota comment gcos dir shell expire);
    my %user_data;

    # Slurp /etc/passwd once and create hash.
    # When executed as root, entry 'passwd' will be visible (not 'x').
    while ( my @pwdata = getpwent ) {
        my $name = $pwdata[0];
        $user_data{$name} = { map{ $_ => shift @pwdata } @items };
    }
    # DONE! The rest is example code.

    # see full datastructure
    # print "Dump: \n", Dumper(\%user_data), "\n";

    # sample output
    my $name = 'root';
    print "=== $name ===\n";
    print Dumper( $user_data{ $name } ),"\n";
    print "shell: $user_data{$name}{shell}\n";

    __END__
    === root ===
    $VAR1 = {
              'quota' => '',
              'uid' => 0,
              'name' => 'root',
              'dir' => '/root',
              'passwd' => 'x',
              'comment' => '',
              'shell' => '/bin/bash',
              'expire' => undef,
              'gid' => 0,
              'gcos' => 'root'
            };

    shell: /bin/bash

    ...or find a CPAN module that suits your requirements.

    Update: This example covers the get_data() part of your question.

Re: Passing Regex Pattern in Subroutine
by ww (Archbishop) on Aug 16, 2011 at 18:02 UTC

    Line 17 looks like part of your problem. If -- as appears from your use of qx() -- you're trying to use the system grep, note its usual usage:

    Use grep to search file
    
    Search /etc/passwd for boo user:
    $ grep boo /etc/passwd
    
    You can force grep to ignore word case i.e match boo, Boo, BOO
    and all other combination with -i option:
    $ grep -i "boo" /etc/passwd
    

    But, as is in your code, methinks the system will eat the $pattern.

    Once you solve passing the var, this may be helpful: another *nix grep discussion points out that...

    -e PATTERN, --regexp=PATTERN    Use PATTERN as the pattern....

    OTOH, you may find it useful (and more perl-ish) to use Perl's own grep, which uses this syntax (from perldoc -f grep):

    @foo = grep(!/^#/, @bar);    # weed out comments

    In that case, I think you would do well to use qr at line 5.

    Update: nobody home when the now-stricken sentence was written. Echhh!