puddleglum has asked for the wisdom of the Perl Monks concerning the following question:

I have a large tab-delimited data file:

e.g.

Town\tCounty
Westminster\tGreater London
King's Cross\tGreater London
Mayfair (north)\tGreater London
Mayfair (south)\tGreater London
...

What I am doing is counting some things, then adding a value to each list item:

Town\tCounty\tVALUE

I love Perl for how easy this (generally) is:

$tmp=(get the file contents);
$tmp=~s/\r//g;
$tmp="\n$tmp\n";
($n,$v)=[got from somewhere, same as list items]
($ctr)=[got from somewhere else]
($rctr)=[count of replacements made]

## attempt 1
if ($tmp=~/^$n\t$v$/) {
    $tmp=~s/^$n\t$v$/$n\t$v\t$ctr/;
    $rctr++;
} else { ....

THE PROBLEM is that if $n contains brackets () or a single quote ', then it doesn't match. Now, the following fixes the brackets, it matches fine, BUT NOT THE QUOTE. I cannot match or replace the list items with single quotes (apostrophes):

# attempt 2
$n=~s/\(/\\(/g; ## << this works ok!
$n=~s/\)/\\)/g;
$n=~s/\'/\\'/g; ## << same method, NO WORK!
if ($tmp=~/^$n\t$v$/) {
    $tmp=~s/^$n\t$v$/$n\t$v\t$ctr/;
    $gctr++;
}

How do I fix this? Anyone? Help gratefully appreciated!

Replies are listed 'Best First'.
Re: Not matching a single quote (apostrophe!)
by moritz (Cardinal) on Nov 17, 2009 at 13:37 UTC
    Using \Q$n\E in the regex instead of just $n should solve your problem. See perlre and quotemeta.

    If it doesn't, please supply a short example by giving us the contents of the variables $tmp and $n at the point where it fails.
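A minimal sketch of what \Q...\E does here, assuming sample values modelled on the list items in the question (the variable contents are made up for illustration, not taken from the real data file):

```perl
use strict;
use warnings;

# Hypothetical sample values, modelled on the list items in the question.
my $n    = "Mayfair (north)";
my $v    = "Greater London";
my $line = "Mayfair (north)\tGreater London";

# Unquoted: "(north)" is parsed as a capture group, so the literal
# parentheses in $line are never matched.
my $plain = ($line =~ /^$n\t$v$/) ? 1 : 0;

# \Q...\E (quotemeta) backslash-escapes every ASCII non-word character,
# so the interpolated text is matched literally.
my $quoted = ($line =~ /^\Q$n\E\t\Q$v\E$/) ? 1 : 0;

print "plain=$plain quoted=$quoted\n";

# The apostrophe gets a backslash too, which is harmless but not required.
print quotemeta("King's Cross"), "\n";
```

This only covers the metacharacter side of the problem; it does nothing about where ^ and $ are allowed to match.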

    Perl 6 - links to (nearly) everything that is Perl 6.

      Yes, I tried \Q$n\E but it still fails

      $tmp contains some 27,000 lines, the entries with quotes fail

      e.g. King's Cross\tGreater London

      $n is exactly "King's Cross", for attempt 1, "King\'s Cross" for attempt 2.

      Note the brackets "Mayfair (north)" won't work in attempt 1 but will work in attempt 2 if I match "Mayfair \(north\)"

      so why not the quote????

        >> $tmp contains some 27,000 lines, the entries with quotes fail

        Then extract a line that should match, and doesn't, and show us that line.

Re: Not matching a single quote (apostrophe!)
by JavaFan (Canon) on Nov 17, 2009 at 14:06 UTC
    BUT NOT THE QUOTE
    Since a quote is not a special regexp character, it doesn't need escaping. If a quote gives you a problem, you have to look elsewhere than trying to escape it in the regexp: that's not the problem.

    From the snippet you give us, I cannot determine where a quote makes a difference. Or is the given quote a different quote than you are matching against? (Someone gives you a "smart" quote for instance, which you're trying to match against an ASCII quote?)
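A small sketch of both points, assuming made-up sample strings; `non_ascii` is a hypothetical helper written for this illustration, not something from the original script:

```perl
use strict;
use warnings;

my $ascii = "King's Cross";          # U+0027 APOSTROPHE
my $smart = "King\x{2019}s Cross";   # U+2019 RIGHT SINGLE QUOTATION MARK

# An apostrophe is an ordinary character in a pattern; no backslash needed.
my $same = ($ascii =~ /^King's Cross$/) ? 1 : 0;

# The typographic quote looks similar but is a different character.
my $mixed = ($smart =~ /^King's Cross$/) ? 1 : 0;

# Hypothetical helper: list the code points of any non-ASCII characters.
sub non_ascii {
    my ($text) = @_;
    return map { sprintf("U+%04X", ord($_)) } $text =~ /([^\x00-\x7F])/g;
}

print "same=$same mixed=$mixed\n";
print "non-ASCII in smart string: ", join(",", non_ascii($smart)), "\n";
```

Running something like `non_ascii` over a failing line is one way to rule a look-alike quote in or out before looking elsewhere.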

      >> Since a quote is not a special regexp character, it doesn't need escaping.

      agreed!!

      >> Or is the given quote a different quote than you are matching against? (Someone gives you a "smart" quote for instance ... )

      yeah! hate smart quotes! the data file has been 'cleaned' to ensure there are no such entities. The quote character used in the replace was cut-and-pasted and globally-replaced in the data file and my perl, so it is definitely, 100%, exactly the same!

Re: Not matching a single quote (apostrophe!)
by przemo (Scribe) on Nov 17, 2009 at 13:38 UTC
Re: Not matching a single quote (apostrophe!)
by puddleglum (Initiate) on Nov 17, 2009 at 14:43 UTC

    Ok ppl, thanks for the inspiration

    Have now fixed it. For some reason:

    $tmp=~s/\n$n\t$v/\n$n\t$v\t$ctr/

    will work on non-escaped quote '

    as opposed to;

    $tmp=~s/^$n\t$v/$n\t$v\t$ctr/

    or even;

    $tmp=~s/$n\t$v/$n\t$v\t$ctr/

    Which both failed.

    I am sure I tried this before my jeckling with \' and \Q and \E, but maybe not.

    Thanks ppl!

      "^" matches the start of the *string*. Sounds like you want to match the start of the *line*. Use the /m modifier to make "^" match the start of the line. Your question had nothing to do with single quotes.

      perlre
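A short sketch of the difference /m makes, assuming a made-up three-line sample in the question's Town\tCounty layout and a made-up counter value. \Q...\E is included so that rows with brackets would also match literally:

```perl
use strict;
use warnings;

# Hypothetical three-line sample in the question's Town\tCounty layout.
my $tmp = "Westminster\tGreater London\n"
        . "King's Cross\tGreater London\n"
        . "Mayfair (north)\tGreater London\n";

my ($n, $v, $ctr) = ("King's Cross", "Greater London", 7);  # made-up values

# Without /m, ^ anchors only to the start of the whole string, so a row
# in the middle of $tmp cannot match.
my $without_m = ($tmp =~ /^\Q$n\E\t\Q$v\E$/) ? 1 : 0;

# With /m, ^ and $ match at the start and end of every line.
(my $out = $tmp) =~ s/^(\Q$n\E\t\Q$v\E)$/$1\t$ctr/m;

print "without_m=$without_m\n";
print $out;
```

With /m in place there is no need to pad the data with a leading "\n" just to give the pattern something to anchor on.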

        quite right! obvious when you look at it..