Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a problem specifying a pattern match where the string can include anything. I'm trying to hide passwords in text. The basic formats of the text are as follows.
Format 1
logger abcdef123,"$Ł*&GHi^
Note that both of the elements delimited by the comma are passwords. If a password contains non-alphanumeric characters, it has to be in quotes.
Format 2
logger "765)(?>jh",hhhhhh,joebloggs
In this example I would only want to hide elements 1 & 2; the third is a username. The code I was using, which worked until the passwords were changed to include non-alphanumeric characters, was:
$string =~ s/(logger\s+)(\w+)(\W+)(\w+)/$1 . ("X" x length($2)) . $3 . ("X" x length($4))/e;
A caveat is that if a password is inside quotes, it may actually contain a comma as part of the password! How do I change this to handle any characters in the password?
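
To make the failure concrete, here is a minimal sketch of how that substitution behaves. It uses the substitution from the question with its parentheses balanced so that it compiles; `mask_original` is a hypothetical wrapper name, and the first sample line (two plain alphanumeric passwords) is made up for illustration. Because `\w+` must match immediately after `logger\s+`, a line whose first field starts with a quote does not match at all and is left unmasked.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution from the question
# (parentheses balanced so that the /e replacement compiles).
sub mask_original {
    my ($string) = @_;
    $string =~ s/(logger\s+)(\w+)(\W+)(\w+)/$1 . ("X" x length($2)) . $3 . ("X" x length($4))/e;
    return $string;
}

# Made-up line with two alphanumeric passwords: both are masked.
print mask_original('logger abcdef123,hhhhhh'), "\n";

# Format 2 line from the question: the leading quote is not a \w character,
# so the pattern fails and the line comes back unchanged.
print mask_original('logger "765)(?>jh",hhhhhh,joebloggs'), "\n";
```

The first call prints `logger XXXXXXXXX,XXXXXX`; the second prints the input line untouched, passwords included.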

Re: Match any characters
by GrandFather (Saint) on Feb 10, 2006 at 09:21 UTC

    Is this what you are after:

    use warnings;
    use strict;

    while (<DATA>) {
        chomp;
        s/ (logger\s+) ("[^"]*"|\w+)(,)([^,]+)(.*) /$1 . ("X" x length($2)) . $3 . ("X" x length($4) . $5)/ex;
        print "$_\n";
    }

    __DATA__
    logger abcdef123,"$Ł*&GHi^
    logger "765)(?>jh",hhhhhh,joebloggs
    logger "7,)(?>jh",yyyyyyy,fredbloggs

    Prints:

    logger XXXXXXXXX,XXXXXXXXX
    logger XXXXXXXXXXX,XXXXXX,joebloggs
    logger XXXXXXXXXX,XXXXXXX,fredbloggs

    DWIM is Perl's answer to Gödel
      Thanks. I've just discovered a slight variation on the theme in the log I'm interrogating, which will need handling, however.
      It seems that the first element after logger is not always a password. This field may be alphanumeric and may include the @ # Ł $ and _ characters (but no others), and may not be surrounded by quotes, e.g.
      logger abc@def,"werty^%$&"
      logger ab$Łef,12trsgh
      logger "765)(?>jh",hhhhhh,joebloggs
      logger "7,)(?>jh",yyyyyyy,fredbloggs
      Where the output should be
      logger abc@def,XXXXXXXXX
      logger ab$Łef,XXXXXXX
      logger XXXXXXXXXXX,XXXXXX,joebloggs
      logger XXXXXXXXXX,XXXXXXX,fredbloggs
      How do I adapt the code to handle that?

        Cleaner to do that with a number of matches:

        use warnings;
        use strict;
        use Data::Dumper;

        while (<DATA>) {
            # Quoted first field followed by two more fields: mask fields 1 and 2.
            if (/logger\s+"([^"]*)",(\w+),(.*)/s) {
                print "logger " . ('X' x length $1) . ',' . ('X' x length $2) . ",$3";
            # Unmasked first field, quoted second field: mask field 2 only.
            } elsif (/logger\s+([^,]*),"([^"]*)"(,?.*)/s) {
                print "logger $1," . ('X' x length $2) . "$3";
            # Unmasked first field, plain second field: mask field 2 only.
            } elsif (/logger\s+([^,]*),(\w+)(,?.*)/s) {
                print "logger $1," . ('X' x length $2) . "$3";
            }
        }

        __DATA__
        logger abc@def,"werty^%$&"
        logger ab$Łef,12trsgh
        logger "765)(?>jh",hhhhhh,joebloggs
        logger "7,)(?>jh",yyyyyyy,fredbloggs

        Prints:

        logger abc@def,XXXXXXXXX
        logger ab$Łef,XXXXXXX
        logger XXXXXXXXX,XXXXXX,joebloggs
        logger XXXXXXXX,XXXXXXX,fredbloggs

        DWIM is Perl's answer to Gödel
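
Both replies above lean on the same idea: treat a double-quoted run as one field, so a comma inside the quotes is not a separator. A minimal sketch of that idea as a reusable splitter follows. The names `split_fields` and `mask_fields` are hypothetical, the caller decides which field positions are passwords, and the code assumes the line is already chomped, that quotes are never escaped inside a field, and that fields are never empty.

```perl
use strict;
use warnings;

# Split comma-separated text into fields, keeping a double-quoted run
# (commas included) together as one field.
# Assumption: no escaped quotes inside a field, and no empty fields.
sub split_fields {
    my ($text) = @_;
    return $text =~ /("[^"]*"|[^,]+)/g;
}

# Replace the fields at the given zero-based positions with X's of the
# same length (quotes included); all other fields pass through unchanged.
sub mask_fields {
    my ($line, @positions) = @_;
    my ($prefix, $rest) = $line =~ /^(logger\s+)(.*)$/
        or return $line;    # not a logger line: leave it alone
    my @fields = split_fields($rest);
    for my $i (@positions) {
        next if $i > $#fields;
        $fields[$i] = 'X' x length $fields[$i];
    }
    return $prefix . join(',', @fields);
}

# Three-field line with a comma inside the quoted password.
print mask_fields('logger "7,)(?>jh",yyyyyyy,fredbloggs', 0, 1), "\n";
# prints: logger XXXXXXXXXX,XXXXXXX,fredbloggs

# Two-field line where only the second field is a password.
print mask_fields('logger abc@def,"werty^%$&"', 1), "\n";
# prints: logger abc@def,XXXXXXXXXXX
```

Unlike the multi-match version, this sketch masks the surrounding quotes too, so the X count for a quoted field is two longer than the password itself.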
Re: Match any characters
by smokemachine (Hermit) on Feb 11, 2006 at 05:04 UTC
    ($passwd1, $passwd2, $user)=/^logger\s(\".*?\"|\w+)\s*,\s*(\".*?\"|\w+)(\s*,\s*\w+)?$/
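
That pattern only captures the fields, so something still has to rebuild the line. One possible way to do it is sketched below, with the escaped quotes written plainly (equivalent in a regex). `mask_line` is a hypothetical helper name, and the sketch assumes the line is already chomped. Note that the third capture includes its leading comma, and that a line whose first field contains characters such as @ does not match and is returned unchanged.

```perl
use strict;
use warnings;

# Mask the first two fields of a logger line using the capture pattern;
# lines that do not match the pattern are returned as they are.
sub mask_line {
    my ($line) = @_;
    my ($passwd1, $passwd2, $user) =
        $line =~ /^logger\s(".*?"|\w+)\s*,\s*(".*?"|\w+)(\s*,\s*\w+)?$/
        or return $line;
    $user = '' unless defined $user;    # the username group is optional
    return 'logger '
        . ('X' x length $passwd1) . ','
        . ('X' x length $passwd2)
        . $user;
}

print mask_line('logger "765)(?>jh",hhhhhh,joebloggs'), "\n";
# prints: logger XXXXXXXXXXX,XXXXXX,joebloggs

print mask_line('logger abc@def,"werty^%$&"'), "\n";
# prints the line unchanged, because abc@def is neither quoted nor pure \w
```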