Re: Re: A demanding parser

by gmax (Abbot)
on Jan 26, 2002 at 18:56 UTC ( [id://141785]=note )


in reply to Re: A demanding parser
in thread A demanding parser

Thanks for the tip. I am not sure I understand how to use it, though.
My purpose, as you have pointed out, is to replace Regexp::Common with a normal Perl RegEx. By normal I mean an expression that does not depend on a module.
As for the motivation, you guessed right that it's related to education. Personally, I wouldn't bother. I need to distribute this module as part of a larger set of educational material aimed at building a huge database. I would like to avoid pointing to a CPAN module, since many people in the audience are not experienced Perl users. They should just copy this module to their computers and execute the import/export script.
Of course I can provide them with a copy of the module, or instruct them to connect to CPAN, download the module and install it, or use "perl -MCPAN -e shell", but it would steal valuable time from my lectures.

That aside, here is a test script for your RegEx, which does not seem to give me what I want.
Was it my misunderstanding, or were you trying to show me how to catch the inner parenthesized text only?
#!/usr/bin/perl -w
use strict;
use Regexp::Common;

my $re = qr{ \( (?: (?> [^()]+ ) | (??{ $re }))* \) }x;

my $input = "aa bb cc (dd ee (ff gg (hh) jj) kk)";

print "With module\n";
while ($input =~ m/(\w+|$RE{balanced}{-parens=>'()'})\s*/g) {
    print "$1\n";
}

print "With recursive RegExp\n";
while ($input =~ m/(\w+|$re)\s*/g) {
    print "$1\n";
}
__END__
# output:
With module
aa
bb
cc
(dd ee (ff gg (hh) jj) kk)
With recursive RegExp
aa
bb
cc
dd
ee
ff
gg
(hh)
jj
kk
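Update:
Found the problem. The recursive RegEx does not work as written under use strict: in my $re = qr{...}, the $re inside (??{ ... }) is not the lexical being declared, because a my variable only becomes visible after the statement that declares it.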
Changing
my $re = qr{ \( (?: (?> [^()]+ ) | (??{ $re }))* \) }x;
into
no strict 'vars';
$rec_re = qr{ \( (?: (?> [^()]+ ) | (??{ $rec_re }))* \) }x;
my $re = $rec_re;
use strict;
makes both regexes produce the same output.
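A possible alternative that keeps strict switched on, offered only as a sketch: declare the lexical on its own line and assign the pattern in a second statement, so that the $re inside (??{ ... }) is the same variable that ends up holding the regex. The names $re, $input and @tokens are just the ones from the test script above (with @tokens added here to collect the results); nothing else is assumed beyond a Perl that supports (??{ ... }).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Declare first, assign second: by the time the pattern is compiled,
# the lexical $re already exists, so the code block can refer to it.
my $re;
$re = qr{
    \(                    # opening parenthesis
    (?:
        (?> [^()]+ )      # a run of non-parenthesis characters, no backtracking
      |
        (??{ $re })       # recurse into a nested balanced group
    )*
    \)                    # closing parenthesis
}x;

my $input = "aa bb cc (dd ee (ff gg (hh) jj) kk)";

# Collect either a plain word or a whole balanced group per match.
my @tokens;
while ($input =~ m/(\w+|$re)\s*/g) {
    push @tokens, $1;
}

print "$_\n" for @tokens;
```

With the same input as before, this should give the three words followed by the whole parenthesized group as a single token, which is what the Regexp::Common version prints.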
 _  _ _  _  
(_|| | |(_|><
 _|   
