Re: Regexes: finding ALL matches (including overlap)

by nobull (Friar)
on Jun 04, 2005 at 09:06 UTC ( [id://463496]=note )


in reply to Regexes: finding ALL matches (including overlap)

I gave a talk about this amongst other things at YAPC::Europe::2004. This question started at slide 20.
I would want "abcdef" =~ m/..*..*./g to return 20 = 6 choose 3 matches.

Hmmm... that's not quite the same thing I was talking about. How is Perl to know that .* is different from . ?

As far as my solution (actually largely due to abigail) is concerned, /..*..*../ is simply /.{4,}/, and there would be 6 matches thereof in "abcdef":

  • substr("abcdef",0,4)
  • substr("abcdef",0,5)
  • substr("abcdef",0,6)
  • substr("abcdef",1,4)
  • substr("abcdef",1,5)
  • substr("abcdef",2,4)
Update: changed /.*/ to /.{4,}/ and made resulting changes.
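
The six matches listed above can be enumerated with a forced-backtracking idiom. What follows is a minimal sketch, not necessarily the code from the talk: it assumes an embedded code block (?{ ... }) that records each candidate, followed by (?!), which always fails and so keeps the engine backtracking. The names $string and @matches are illustrative only.

```perl
use strict;
use warnings;

my $string = "abcdef";
my @matches;    # each entry: [offset, length, text]

# (.{4,}) captures a candidate; the code block records it via $^N (the most
# recently closed group) and pos() (the current end of the match); (?!) then
# fails unconditionally, so the engine backtracks and tries the next candidate.
$string =~ /(.{4,})(?{ push @matches, [ pos() - length($^N), length($^N), $^N ] })(?!)/;

# Print each recorded match in the same substr() form as the list above.
printf qq{substr("%s",%d,%d) = %s\n}, $string, @$_ for @matches;
```

The match as a whole never succeeds; the side effect of the code block is the result. Because the quantifier is greedy, the longest candidate at each starting offset should be recorded first.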

Replies are listed 'Best First'.
Re^2: Regexes: finding ALL matches (including overlap)
by demerphq (Chancellor) on Jun 04, 2005 at 17:15 UTC

    /..*..*../ is simply /.*/

    That strikes me as somewhat odd. The pattern on the right can match a string of fewer than 4 characters; the pattern on the left cannot.
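
The difference between the two patterns is easy to check on a string of three characters. A small sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $short = "abc";    # only three characters

my $star_matches = $short =~ /.*/       ? 1 : 0;    # .* accepts any length, even zero
my $four_matches = $short =~ /..*..*../ ? 1 : 0;    # four dots need at least four characters

print "/.*/       matches: $star_matches\n";
print "/..*..*../ matches: $four_matches\n";
```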

    ---
    $world=~s/war/peace/g

      Yes, thanks. Updating the previous node.
Re^2: Regexes: finding ALL matches (including overlap)
by kaif (Friar) on Jun 04, 2005 at 17:03 UTC
    I would want "abcdef" =~ m/..*..*./g to return 20 = 6 choose 3 matches.
    Hmmm... that's not quite the same thing I was talking about. How is Perl to know that .* is different from . ?
    Easy: imagine I was matching m/\w.*\w.*\w/g instead. There really is no other possibility than to have this return 20 matches (each \w has to match one of the 6 letters). Here are some more examples of what I would want (assuming I made no mathematical mistakes):
    • "abcdef" =~ m/..*..*./g   returns 20 = 6 choose 3
    • "abcdef" =~ m/.*/g   returns 28 = (6+2) choose 2 = number of substrings of a length-6 string, counting the empty match at each of the 7 positions
    • "abcdef" =~ m/....*/g   returns 10 = number of substrings of length 3 or greater in a length-6 string
    • "abcdef" =~ m/^.*$/g   returns 1
    • "abcdef" =~ m/^.*.*$/g   returns 7 = number of ways of splitting a length 6 string into two parts
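
The counts in the list above amount to the number of distinct ways the pattern can be laid over the string, which is roughly what a forced-backtracking counter measures. A hedged sketch, assuming the (?{ ... })(?!) idiom; count_paths and $paths are hypothetical names, not something from this thread.

```perl
use strict;
use warnings;
use re 'eval';    # permits (?{ }) next to an interpolated pattern on older perls

our $paths;       # package variable, so the embedded code block can always see it

# Count every way $re can match somewhere in $string: bump the counter each
# time the whole pattern has matched, then fail with (?!) to force backtracking.
sub count_paths {
    my ($string, $re) = @_;
    local $paths = 0;
    $string =~ /$re(?{ $paths++ })(?!)/;
    return $paths;
}

my %count;
$count{$_} = count_paths("abcdef", qr/$_/)
    for '..*..*.', '....*', '^.*$', '^.*.*$';

printf "%-8s => %d\n", $_, $count{$_} for sort keys %count;
```

Each complete traversal of the pattern is counted once, so two traversals that cover the same substring in different ways count twice. Whether this idiom also yields 28 for a bare /.*/ depends on how the engine optimizes patterns that begin with .*, so that case is left out.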
