I asked in the chatterbox about multidimensional regular expressions and was given a referral to a clean white room with rubber walls. I'm hoping that by putting this into a proper node the question will get wider exposure, and maybe someone has already implemented this. I did a CPAN/Google search and didn't come up with anything that really matches. At its heart a regular expression is just a concise series of assertions joined by AND/OR junctions (with side effects). I see no particular reason this couldn't already exist; all I can think of is that no one has needed it before. It'd be interesting to take a whack at the idea. (This is where you post a pointer to the right CPAN module.)

Consider first that something like m/(?:this|that)/ is a basic one-dimensional expression. An AoA is a two-dimensional expression, an AoAoA is three-dimensional, etc. (Just for kicks, think about locally expanded dimensions or what it'd mean to mix hashes in.) I'm not even sure how best to describe the grammar for this, but if someone else has already implemented it then I don't have to (I don't have to anyway, but that's not the point). I'm initially thinking of something like \[{ ... } and \]{ ... } (reading the overload section of perlre leads me to think this syntax is OK). So before I get any further with this I'd like to know if this already exists somewhere or if it doesn't exist for a reason. I'm also not sure if there is a way to rebind the currently executing regular expression to another string. I can work around that by using (??{}) but I'd rather not resort to that hack.

This is a very contrived example of how this might be used. So far most of the data I work with is distinguished by being in different fields, which sort of removes the point of this technique. In general though, imagine you were going to do a pattern match against a multidimensional bitmap. I /think/ this has applications there. Or maybe not. It's an idea anyway, and if it's just oddball I'm interested in hearing why.

Update 0: It occurs to me that you all might get more mileage out of this if I explain my original inspiration. A few weeks ago John M. Dlugosz was talking about unifying substr, splice, shift, unshift and other array functions with the string functions. The problem is, once you start treating strings like arrays, people like me start wanting to treat arrays like strings, which is why this even occurred to me.

Update 1: I think the main problem with this is rebinding the running regex to another string. You can play tricks like (?(?{more expressions...})continue in this expression|(?#fail)(?!)) but that doesn't quite strike me as a good idea.
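To make the trick above concrete, here is a minimal sketch of the conditional-with-code form in stock Perl. It assumes a reasonably recent perl (5.18 or later); the %is_op table is a hypothetical stand-in for whatever other data the running match would want to consult, and nothing here rebinds the regex to a second string.

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for "the other data" the
# running match would like to consult.
my %is_op = map { $_ => 1 } qw(UNOP SVOP);

# (?(?{ CODE })yes|no): the yes-branch is empty ("continue in this
# expression"); the no-branch is (?!), which always fails.
my @hits = grep { /^(\w+)(?(?{ $is_op{$1} })|(?!))$/ } qw(UNOP FOO SVOP);

print "@hits\n";
```

The code block only decides whether the outer match may continue. It cannot move the match into another string, which is the part that still feels like a hack.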

Update 2: Taking into consideration both merlyn and my response to princepawn, I think the basis of this ought to be a metasequence like \[{dimension,direction} for switching into a different dimension (like a tangled ball of string) and \]{//xpath/expression} for the original idea of skipping around in the data. The first metasequence is probably the more conservative in that all it does is add right angles to regex. The second metasequence is more interesting in that it would allow you to specify a location to jump to, perhaps somewhat like setting pos() while in the middle of an expression.
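For readers unfamiliar with the pos() analogy, this is roughly what jumping looks like in one dimension today. It is only a sketch of existing behaviour: pos() is set between matches rather than in the middle of one, and the sample string is made up for illustration.

```perl
use strict;
use warnings;

my $str = "aaa-bbb-ccc";   # made-up sample data

# Jump to offset 4, then anchor the next match there with \G.
pos($str) = 4;
my $word;
if ($str =~ /\G(\w+)/gc) {
    $word = $1;
}

print "$word\n";
```

The proposed \]{...} would do the equivalent jump from inside the pattern, and across array boundaries instead of character offsets.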

Update 3: I'd just like to note that the use of the sequences \[{...} and \]{...} is entirely arbitrary and just based on \N{...}. If you have a better syntax then please speak up.

@matches =
    [ "LISTOP", "OP", "COP", "BINOP",
      [ "LOOP",
        [ "OP", "UNOP",
          [ "OP", "UNOP", [ "SVOP" ], ],
        ],
        "UNOP",
        [ "LOGOP",
          [ "OP", "LISTOP",
            [ "COP", "LISTOP",
              [ "OP", "UNOP", [ "SVOP" ], ],
              "OP", "COP" ],
          ],
        ],
      ],
    ] =~ m[(SVOP\[{[-1,-1]}UNOP)]g;

print Dumper(\@matches);

$VAR => [
    # match 0
    [ "UNOP", [ "SVOP" ] ],
    # Match 1
    [ "UNOP", [ "SVOP" ] ]
];

# empty those end SVOP strings
s[(?<=UNOP\[{[0,1],[1,1]})SVOP][];
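For comparison, the first match in the contrived example can be approximated today with an ordinary recursive walk. This is a sketch only: find_unop_svop is a hypothetical helper, and it hard-codes one reading of \[{[-1,-1]} (from an SVOP, step up one level and back one element), which may not be the only sensible interpretation.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every "UNOP" string that is immediately
# followed by a sub-array directly containing the string "SVOP".
sub find_unop_svop {
    my ($node, $found) = @_;
    for my $i (0 .. $#$node) {
        my $item = $node->[$i];
        next unless ref($item) eq 'ARRAY';
        my $prev = $i > 0 ? $node->[$i - 1] : undef;
        if (defined $prev && !ref($prev) && $prev eq 'UNOP'
            && grep { !ref($_) && $_ eq 'SVOP' } @$item) {
            push @$found, [ $prev, $item ];
        }
        find_unop_svop($item, $found);   # descend into the next dimension
    }
    return $found;
}

my $tree =
    [ "LISTOP", "OP", "COP", "BINOP",
      [ "LOOP",
        [ "OP", "UNOP", [ "OP", "UNOP", [ "SVOP" ] ] ],
        "UNOP",
        [ "LOGOP",
          [ "OP", "LISTOP",
            [ "COP", "LISTOP", [ "OP", "UNOP", [ "SVOP" ] ], "OP", "COP" ] ] ] ] ];

my $matches = find_unop_svop($tree, []);
print scalar(@$matches), " matches\n";
```

The walk works, but the traversal and the pattern are tangled together in the code, which is exactly what a declarative multidimensional syntax would separate.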

In reply to Multidimensional regular expressions by diotalevi
