Hello friends, I seek help with what I realize may be an XY problem (so I will describe it generally below).

I've inherited some code that is intended to mask sensitive data in logs. The application sends data to be formatted (flattened) for logging. The data may include JSON strings. The current code uses Data::Dumper to flatten the data. The resulting string may have "sensitive" keys quoted with single quotes (from Data::Dumper) or with double quotes (from JSON). Presumably, it could also contain embedded escaped quotes.

A regular expression does the masking, and the current implementation is broken. I'm working on a replacement that first finds the quotation mark used to quote the key and the value, with a negative lookbehind to skip escaped quotes. This works for simple matches. Where I am having trouble is reusing the captured quotation mark (together with the negative lookbehind that skips escaped quotes) in a character class or in a negative lookahead.

    my $param = 'password';

    for ( q~{'password' => 'secret'}~, q~{"password" => "sec\"ret"}~ ) {
        $_ =~ s/
            (                   # capture everything up to the start of the value
                (               # capture the quotation mark we are using
                    (?<!\\\\)   # not escaped
                    [ ' " ]     # either kind of quote
                )               # end capture quotation mark
                $param          # the key
                \2              # the same quotation mark
                \s*             # any amount of space
                (?: => | : )    # perl or JSON key-value "connector"
                \s*             # any amount of space
                \2              # the same quotation mark
            )                   # end capture everything up to start of the value
            (?:                 # group but do not capture the value
                (?!\2) .        # defined as any character except the same quote
            )*                  # any number of times
        /$1***/smxg;            # the closing quotation mark will remain in place
        say $_;
    }

This outputs:

    {'password' => '***'}
    {"password" => "***"ret"}
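
A minimal sketch of one possible fix, not a tested replacement for the production code. A backreference such as \2 replays only the captured text (the quote character), not the lookbehind that sat next to it, so (?!\2) . stops at an escaped quote. The sketch instead lets the value consume backslash-escaped characters as pairs. It assumes backslash is the only escape mechanism in the flattened string, and the helper name mask_value is hypothetical. It also writes the lookbehind as (?<!\\), since inside a regex literal \\\\ matches two backslashes, and writes the quote class without spaces, since /x does not ignore whitespace inside a bracketed character class.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the inherited code): mask the value that
# follows a quoted key, treating a backslash plus the next character as a unit.
sub mask_value {
    my ($text, $param) = @_;
    $text =~ s/
        (                       # capture everything up to the start of the value
            (?<!\\)             # the key's opening quote is not escaped
            ( ['"] )            # capture the quotation mark in use
            \Q$param\E          # the key, taken literally
            \2                  # the same quotation mark
            \s* (?: => | : ) \s*
            \2                  # opening quote of the value
        )
        (?:
            \\ .                # a backslash followed by any one character
          |
            (?!\2) [^\\]        # or a character that is neither that quote nor a backslash
        )*
    /$1***/sxg;
    return $text;
}

print mask_value($_, 'password'), "\n"
    for q~{'password' => 'secret'}~, q~{"password" => "sec\"ret"}~;
```

With the two test strings from above, this prints {'password' => '***'} and {"password" => "***"}. Putting the escaped-pair alternative first means a backslash can never be consumed on its own, so the loop can only stop at an unescaped closing quote.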

All suggestions welcome.


The way forward always starts with a minimal test.

In reply to Matching backslash in regexp negative lookbehind by 1nickt