Re: \q quote-matching operator
by PodMaster (Abbot) on Aug 26, 2004 at 12:46 UTC
In the meantime, use this as perl -MRegexp::Q -ne "print if /\pQ.+\pQ/" :)
package Regexp::Q;
use overload;

sub import {
    shift;
    die "No argument to ${\__PACKAGE__} allowed" if @_;
    overload::constant 'qr' => \&convert;
}

sub invalid { die "/$_[0]/: invalid escape '\\$_[1]'" }

# vars::i declares and initialises %rules at compile time
use vars::i '%rules' => (
    '\\' => '\\\\',          # leave an escaped backslash untouched
    'pQ' => qr/['"`]/,       # any quote character
    'PQ' => qr/[^'"`]/,      # any non-quote character
);

sub convert {
    my $re = shift;
    warn "re is $re";        # debugging aid
    $re =~ s' \\ ( \\ | [pP]Q ) ' $rules{$1} or invalid($re,$1) 'sgex;
    return $re;
}

package main;
unless (caller) {
    BEGIN { import Regexp::Q }
    print "YAY!$/"
        if q~PodMaster asked me "Do you like parachute pants?"~ =~ /\pQ/;
}
MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!" | I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README). | ** The third rule of perl club is a statement of fact: pod is sexy. |
Re: \q quote-matching operator
by diotalevi (Canon) on Aug 26, 2004 at 13:11 UTC
package Regexp::Quotes;
use overload;

sub import {
    shift;
    die "No argument allowed to Regexp::Quotes::import" if @_;
    overload::constant qr => \&convert;
}

sub invalid {
    die "/$_[0]/: invalid escape '\\$_[1]'";
}

my %rules = (
    '\\' => '\\\\',                 # leave an escaped backslash untouched
    'q'  => qr/[\'\" .... ]/,       # Extend this
);

sub convert {
    my $re = shift;
    $re =~ s( \\ ( \\ | q ) )
            { $rules{$1} or invalid( $re, $1 ) }gex;
    return $re;
}

1;
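For anyone who wants to try the overload::constant technique without installing anything, here is a minimal single-file sketch. The package name My::QuoteEscape and the quote set (', " and backtick) are my own assumptions, not part of either module above. Note that %rules is filled in a BEGIN block: in a single file the handler runs while the later regex literals are still being compiled, so an ordinary run-time assignment would come too late.

```perl
use strict;
use warnings;

package My::QuoteEscape;    # hypothetical name, used only in this sketch
use overload ();

my %rules;
BEGIN {
    # Filled in at compile time, because convert() runs while the
    # regex literals further down are still being compiled.
    %rules = (
        '\\' => '\\\\',     # an escaped backslash stays an escaped backslash
        'q'  => q{['"`]},   # assumed quote set: ' " and backtick
    );
}

sub convert {
    my $re = shift;         # first argument: the regex source as written
    $re =~ s{ \\ ( \\ | q ) }{ $rules{$1} }gex;
    return $re;
}

sub import {
    # Install the handler for regex constants in the importing scope.
    overload::constant qr => \&convert;
}

package main;

BEGIN { My::QuoteEscape->import }

my $matched = q{say "hi"} =~ /\q.+\q/ ? 1 : 0;
print "matched: $matched\n";
```

The handler is lexically scoped, so only regexes compiled after the BEGIN line in that scope see the rewritten \q.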
Re: \q quote-matching operator
by gellyfish (Monsignor) on Aug 26, 2004 at 12:23 UTC
I think it would be a little confusing to use \q, since \Q is already used for something else and the \ regex operators are generally arranged in uppercase/lowercase pairs of related meaning. I would probably suggest something in the form of a POSIX character class like [:quote:], possibly with a Unicode property equivalent such as \p{IsQuote}. /J\
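Something close to \p{IsQuote} can already be had with a user-defined character property, which Perl looks up as a sub whose name starts with "In" or "Is". A minimal sketch, where the choice of code points (", ' and backtick) is purely an assumption for illustration:

```perl
use strict;
use warnings;

# User-defined property: the sub returns one hexadecimal code point
# (or a tab-separated range) per line. The set below is an assumption.
sub IsQuote {
    return "0022\n0027\n0060\n";    # " ' `
}

my @samples = (q{say "hi"}, q{it's}, q{plain text});
for my $s (@samples) {
    my $verdict = $s =~ /\p{IsQuote}/ ? 'has a quote' : 'no quote';
    print "$s: $verdict\n";
}
```

\P{IsQuote} gives the complement, so the upper/lowercase pairing still holds. From a package other than the one defining the sub, the property would need to be qualified, e.g. \p{main::IsQuote}.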
Re: \q quote-matching operator
by TedYoung (Deacon) on Aug 26, 2004 at 12:30 UTC
If you were trying to match quotes in text taken from an MS Office document, then that would explain the trouble you were having. By default, MS converts " and ' into "Smart Quotes".
A similar problem for \q is that it would need to be locale dependent, since different languages use different quoting schemes.
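To illustrate the Smart Quotes point: a pattern that only knows the straight " will miss text pasted from a word processor. A small sketch that accepts both; the set of characters in the class (straight quote plus U+201C and U+201D) is an assumption and would need extending for other quoting schemes:

```perl
use strict;
use warnings;

# Straight double quote plus the curly pair U+201C / U+201D.
# This set is an assumption; other locales use other marks.
my $dquote = qr/["\x{201C}\x{201D}]/;

my $pasted = "He said \x{201C}hello\x{201D} twice";   # text with smart quotes
my $typed  = 'He said "hello" twice';                  # text with straight quotes

for my $text ($pasted, $typed) {
    my ($inside) = $text =~ /$dquote(.+?)$dquote/;
    print "found: $inside\n";
}
```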
/\".+\"/
seems to work at first, and it will work for very simple cases, but it would probably not do what you wanted for the expression
ABC "DEF" GHI "JKL" MNO
where it would match "DEF" GHI "JKL". To match "DEF" and "JKL" separately, you would need something like:
/\".+?\"/
We should probably use .*? instead of .+? to handle things like "". But if you are dealing with code, none of these solutions handles things like "AB\"C". There is a common boilerplate regex for this, but at this point you may want to take a look at the documentation for Text::Balanced, which is a core module in 5.8 (and probably earlier). It provides matching solutions for all sorts of common problems like this.
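Here is a quick sketch that puts the three cases side by side. The escape-aware pattern is one common form of the boilerplate regex, not necessarily the exact one meant above, and the sample strings are only illustrations:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

my $text = 'ABC "DEF" GHI "JKL" MNO';

my ($greedy) = $text =~ /(".+")/;     # one match spanning both strings
my @lazy     = $text =~ /(".*?")/g;   # one match per quoted string

# Quoted string that may contain backslash escapes: any run of
# non-quote, non-backslash characters or backslash-escaped characters.
my $code = '"AB\"C" and more';
my ($escaped) = $code =~ /("(?:[^"\\]|\\.)*")/;

# Text::Balanced treats backslash as the escape character by default.
# In list context it returns the extracted text and the remainder.
my ($extracted, $remainder) = extract_delimited($code, '"');

print "greedy:    $greedy\n";
print "lazy:      @lazy\n";
print "escaped:   $escaped\n";
print "extracted: $extracted\n";
```

Note that extract_delimited expects the delimiter at the start of the text (after optional whitespace), so it suits parsing loops better than searching in the middle of a string.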
Ted
PS: It is quite possible you already knew all of this and had left it out in the interest of brevity. :-)
|
/" [^"]+ "/x;
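A short sketch of that negated character class in use, with the sample string borrowed from earlier in the thread. Because [^"] can never run past a closing quote, no non-greedy modifier is needed; with + an empty "" is not matched, so use * if that matters.

```perl
use strict;
use warnings;

my $text = 'ABC "DEF" GHI "JKL" MNO';

# Under /x the spaces in the pattern are ignored; the capture keeps the quotes.
my @quoted = $text =~ /( " [^"]+ " )/gx;

print "$_\n" for @quoted;
```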
"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.
Re: \q quote-matching operator
by dragonchild (Archbishop) on Aug 26, 2004 at 12:22 UTC
You could use the single-quote as your -e delimiter. :-)
But I do like your \q idea for matching quotes. (With the \Q non-quote character class, as well.) Go ahead and suggest it!
------
We are the carpenters and bricklayers of the Information Age.
Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose
I shouldn't have to say this, but any code, unless otherwise stated, is untested
|
You could use the single-quote as your -e delimiter. :-)
You can? I seem to have trouble trying to do that.
C:\>perl -e 'print "foo\n"'
Can't find string terminator "'" anywhere before EOF at -e line 1.
In cmd.exe you can escape a quote with either a second quote or a backslash. I wasn't sure whether either would work in command.com, but both do appear to work in command.com on Windows 98.
C:\>perl -e "print '""'"
"
C:\>perl -e "print '\"'"
"
Re: \q quote-matching operator
by Anonymous Monk on Aug 26, 2004 at 15:17 UTC
|
| [reply] |
|
The idea came about because I thought it was impossible to do this on the Win32 command line. Now that I know that it is entirely possible, I still think that the idea has merit for matching any quote character. However, the namespace clash with \Q makes it rather counterintuitive.
|
If you want to match any quoting character, \q isn't the way to go. [:quote:] would be more appropriate. But I think the advantage is far too small for p5p to give it any consideration (rightly so, I'd say). Of course, if there were a Unicode property for that group of characters, you could use \p{UNICODE PROPERTY HERE}.
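As far as I know, Unicode does define a binary property for this, Quotation_Mark, and reasonably recent Perls accept it as \p{Quotation_Mark}. A small sketch, with sample strings of my own; note that the property does not cover the backtick, so it is not a drop-in replacement for a hand-written ['"`] class:

```perl
use strict;
use warnings;

# Samples are illustrative only; \x{201C} and \x{201D} are the curly double quotes.
my @samples = (
    'straight "double"',
    "curly \x{201C}double\x{201D}",
    'no quotes at all',
);

my @hits = grep { /\p{Quotation_Mark}/ } @samples;
printf "%d of %d samples contain a quotation mark\n", scalar @hits, scalar @samples;
```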