Regex is like ((AND|OR)([!=><]+)(.*))+
Input is like $check = "AND=>1536463OR<foobarOR=5";

My first question comes from looking at the two together. Your regex describes some of the following strings:

AND!==!<!>>!!LKJKJIOJJ182873KLJJyuukjljk
OR!<><><><><=!Blah
OR==!=!Hmm, could this be right?
AND>>>>>>>this could be a problem

I think my point is taken. :-) So then we look at the data. You didn't really say what was supposed to happen. Is this supposed to produce the following triplets?

AND,=>,1536463
OR,<,foobar
OR,=,5

Or was it supposed to reject it? (It's not clear from the conversation I saw on the chatterbox, nor from your post.)
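A quick way to see the problem is to run the pattern from the question against its own sample input. This is only a sketch, and the variable name @parts is mine. The greedy (.*) swallows everything after the first operator, so a single oversized triplet comes back instead of three.

```perl
use strict;
use warnings;

# The pattern and input quoted from the original question.
my $check = "AND=>1536463OR<foobarOR=5";

my @parts;
if ($check =~ /((AND|OR)([!=><]+)(.*))+/) {
    # $1 is the outer group; $2..$4 are the conjunction, operator and value.
    @parts = ($2, $3, $4);
    print join(",", @parts), "\n";
}
# prints: AND,=>,1536463OR<foobarOR=5
```

So the value part needs to be stopped from running into the next AND/OR token.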
The way to solve this is to figure out what the dot SHOULDN'T match. I.e. it shouldn't match the first two parts of the regex combined, (AND|OR)(=[><=]?|[><]=?|!=|<>), because that would be the start of a new token. We don't want to invoke capture buffers inside the assertion, so we use (?:) instead of (). We then have to make sure, char by char, that we aren't sitting at the start of that pattern. So the inner layer looks like:
(?!(?:AND|OR)(?:=[><=]?|[><]=?|!=|<>)).
We then wrap that to say one or more of the above:
(?:(?!(?:AND|OR)(?:=[><=]?|[><]=?|!=|<>)).)+
and then wrap it again to capture it:
((?:(?!(?:AND|OR)(?:=[><=]?|[><]=?|!=|<>)).)+)
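To check that this piece stops where it should, it can be tried on its own. The following is a minimal sketch: the helper name leading_value and the ^ anchor are my additions for the demonstration, not part of the final regex.

```perl
use strict;
use warnings;

# The capturing piece built up above, anchored at the start of the string
# so we can see exactly where it stops. /s lets the dot match newlines too.
my $value = qr/^((?:(?!(?:AND|OR)(?:=[><=]?|[><]=?|!=|<>)).)+)/s;

# Hypothetical helper: returns the text consumed before the next
# AND/OR-plus-operator token starts, or undef if nothing can be consumed.
sub leading_value {
    my ($text) = @_;
    return $text =~ $value ? $1 : undef;
}

print leading_value("1536463OR<foobar"), "\n";   # 1536463
print leading_value("foORobarOR=5"), "\n";       # foORobar
```

The second call shows the point of including the operator in the lookahead: the OR inside foORobar is not followed by an operator, so it is treated as ordinary text.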
We put the three parts together and we get

$_ = "OR=5AND=>1536463OR<foORobarOR=5 ";
while (m/(AND|OR)                       # either AND or OR
         (=[><=]?|[><]=?|!=|<>)         # one of = => =< == > < >= ....
         (                              # capture all within...
           (?:                          # group for quantifier
             (?!                        # not followed by
               (?:AND|OR)               # AND or OR
               (?:=[><=]?|[><]=?|!=|<>) # one of = => =< ...
             )                          # any of the inside
             .                          # match any char..
           )+                           # 1 or more of the above
         )                              # and return it..
        /xgms) {                        # ignore spaces, repeated,
                                        # multiline, . matches all
    # and if it all worked out then...
    print "$1 $2 $3\n";
}
# outputs
# OR = 5
# AND => 1536463
# OR < foORobar
# OR = 5

Note the OR in my version of your example (the one inside foORobar). The regex does not trip up over this because we made the negative lookahead assertion include the => conditional part as well.
Hope this helps
Yves
--
You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)
Update
LiTinOveWeedle asked for help enhancing this so that the script will match some of the odder relational operators.
my $opers='=[><=!]?|[><!]=|<>|[<>]';
while (m/(AND|OR)($opers)((?:(?!(?:AND|OR)(?:$opers)).)+)/xgms) {
    print "$1 $2 $3\n";
}
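If the triplets are wanted as data rather than printed, the same loop can be wrapped in a sub. This is a sketch only: parse_conditions and the sample input are made up for illustration, and I have not tried it against every operator combination.

```perl
use strict;
use warnings;

# Operator alternatives from the update. The two-character forms come
# before the bare < and > so that "<>" and ">=" are not split.
my $opers = '=[><=!]?|[><!]=|<>|[<>]';

# Hypothetical helper: returns a list of [conjunction, operator, value] triplets.
sub parse_conditions {
    my ($text) = @_;
    my @triplets;
    while ($text =~ m/(AND|OR)($opers)((?:(?!(?:AND|OR)(?:$opers)).)+)/gs) {
        push @triplets, [$1, $2, $3];
    }
    return @triplets;
}

for my $t (parse_conditions("AND!=fooOR<>barOR>=5AND<baz")) {
    print join(" ", @$t), "\n";
}
# prints:
# AND != foo
# OR <> bar
# OR >= 5
# AND < baz
```

Keeping the operators in one variable means the matching part and the lookahead cannot drift apart when new operators are added.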
In reply to Re: Re: Parsing with regex
by demerphq
in thread Parsing with regex
by LiTinOveWeedle