jeanluca has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

Although I'm using regular expressions for years, thinking that I could do almost everything with them, I realized today its not true.
The problem that I encountered had todo with a string that needs to be inserted into a MySQL database, for example:
$values = "'abc', 'dec'f', ''ghc''" ; $sql = "INSERT INTO SOME_TABLE ($columns) VALUES($values) ;
So I need todo something with this string that would result into:
'abc' 'dec\'f' '\'ghc\''
Normally I would be happy if someone gave me the answer, but this time I would really like to understand the regular expression!
So I will show below what I tried!
The first 2 thing I did were:
$insert =~ s/(?<!^)'/\\'/g ; # gives: 'abc\' \'dec\'f\' \'... # and I tried: $insert =~ /s(?<!\s)'/\\'/g ; # gives \'abc\' 'dec\'f\'...
Next step is to combine:
$insert =~ s/(?<!\s)'|(?<!^)'/\\'/g ; # wrong result :( $insert =~ s/(?<![^\s]/\\'/g ; # nope again
At this point I could really use some help on combining the 2 ?

Thans
LuCa

Replies are listed 'Best First'.
Re: string manipulations with regular expressions
by davorg (Chancellor) on Oct 17, 2006 at 08:33 UTC

    I think that you're taking the wrong approach here. I think that you should try to address this as your $values string is constructed, rather than trying to fix it right at the last minute.

    Two approaches that I can think of:

    • Use the quote method from DBI. This will make your code more portable as it will use whatever quote escaping method your current database uses - rather than hardcoding one method which won't be guaranteed to work if you ever use this code against a different database system.
    • Instead of putting $values into your SQL string, instead the relevant number of placeholders (three in this example) and pass the required strings (in a list) to that execute call. That way, DBI and your DSD handle all the quoting and escaping for you - so you don't need to worry about it.
    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: string manipulations with regular expressions
by monarch (Priest) on Oct 17, 2006 at 08:22 UTC
    I tried the following:
    $values = "'abc', 'dec'f', ''ghc''" ; my @quotedvalues = (); while ( $values =~ m/\'(.*?)\'(?:,|\z)/g ) { my $value = $1; $value =~ s/\'/\\\'/g; push( @quotedvalues, $value ); } print( join( "\n", @quotedvalues ) );
    and got as output:
    abc dec\'f \'ghc\'

    Note that your insert statement will have to run as a loop, once for each value that you want to insert. Thus, ultimately, you want an array of values out of the provided string.

    Also note that, once you've extracted each value from the outside single quotes surrounding that value, you can use the database function ->quote() from the DBI module instead of running a regular expression like I did in the above example.

Re: string manipulations with regular expressions
by johngg (Canon) on Oct 17, 2006 at 09:46 UTC
    This seems to do what you want.

    use strict; use warnings; my $str = "'abc', 'dec'f', ''ghc''"; print qq{$str\n}; $str = join q{ }, map {s{(?<!\A)'(?!\z)}{\\'}g; $_} split m{\s*,\s*}, $str; print qq{$str\n};

    Output is

    'abc', 'dec'f', ''ghc'' 'abc' 'dec\'f' '\'ghc\''

    Cheers,

    JohnGG

Re: string manipulations with regular expressions
by jeanluca (Deacon) on Oct 17, 2006 at 09:03 UTC
    thnx a lot for the replies!

    I realize that indeed the quote() function is what I should use!

    But still, I'm very interested in my regular expression problem!
    I see that monarch solution works great.!However I remember that using variables like $1, $2 will not really improve performance (I'm talking about many of thousands of insertes...... and I like one-liners :)
    Is there no way we can fix what I've tried so far ?

    LuCa
      ...using variables like $1, $2 will not really improve performance...
      No the capture vars ($1, $2 etc.) aren't the culprits.

      From perlretut:

      If you use them, perl will set $` to the part of the string before the match, will set $& to the part of the string that matched, and will set $' to the part of the string after the match.
      ...
      It is important to note that using $` and $' slows down regexp matching quite a bit, and $& slows it down to a lesser extent...
      So $1 and friends won't effect performance.
      Try this:
      $values = "'abc', 'dec'f', ''ghc''" ; $values =~ s/ (?<!\A) # no beginning-of-string in front (?<!,\s) # no comma and space in front \' # quote (?!,\s|\z) # no command and space or end-of-string ahead / \\\' # replace with \' /xg;
      It results in:
      'abc', 'dec\'f', '\'ghc\''
      I don't believe it is possible to remove the commas between words at the same time as quoting the strings.