
Re: Problem with regex wildcard operator (.)...?

by LanX (Sage)
on Sep 05, 2021 at 20:26 UTC


in reply to Problem with regex wildcard operator (.)...?

> As you can see the wildcard behaves erratically.

Sorry, I don't see your problem. Could you please describe it better and provide an SSCCE?

> Will you humor me one last time?

We are happy to help, as long as you don't make it unnecessarily difficult for us. :)

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^2: Problem with regex wildcard operator (.)...?
by sbrothy (Acolyte) on Sep 05, 2021 at 20:38 UTC

    Certainly. I realize it's probably tldr. Let me give it another go before you waste your time...

      OK, I shortened it, hope this is more acceptable. In the process I discovered, as I suspected, that it has something to do with the regex order, but exactly what is still pretty opaque to me.

      #!/usr/bin/perl

      my @words = ('aa', 'as', 'es', 'is', 'os', 'sh', 'si', 'so');

      print "WORDS: @words\n";
      print "----------------------------\n";

      my $regex = ".?a?";
      print "Matches for regex pattern : $regex\n\n";
      foreach (@words) {
          if(join('', sort split //, $_) =~ /^$regex$/) {
              print "$_\n";
          }
      }
      print "----------------------------\n";

      my $regex = "a?.?";
      print "Matches for regex pattern : $regex\n\n";
      foreach (@words) {
          if(join('', sort split //, $_) =~ /^$regex$/) {
              print "$_\n";
          }
      }
      print "----------------------------\n";

      $regex = "s?.?";
      print "Matches for regex pattern : $regex\n\n";
      foreach (@words) {
          if(join('', sort split //, $_) =~ /^$regex$/) {
              print "$_\n";
          }
      }
      print "----------------------------\n";

      $regex = ".?s?";
      print "Matches for regex pattern : $regex\n\n";
      foreach (@words) {
          if(join('', sort split //, $_) =~ /^$regex$/) {
              print "$_\n";
          }
      }
      print "----------------------------\n";

      OUTPUT

      WORDS: aa as es is os sh si so
      ----------------------------
      Matches for regex pattern : .?a?

      aa
      ----------------------------
      Matches for regex pattern : a?.?

      aa
      as
      ----------------------------
      Matches for regex pattern : s?.?

      ----------------------------
      Matches for regex pattern : .?s?

      as
      es
      is
      os
      sh
      si
      so
      ----------------------------

      Regards.

        This is what you should get:

        words:   aa as es is os sh si so
        string:  aa as es is os hs is os
        ^.?a?$:  y  n  n  n  n  n  n  n    aa
        ^a?.?$:  y  y  n  n  n  n  n  n    aa as
        ^s?.?$:  n  n  n  n  n  n  n  n
        ^.?s?$:  n  y  y  y  y  y  y  y    as es is os sh si so

        That's exactly what you get. What did you expect to be different?

        Seeking work! You can reach me at ikegami@adaelis.com

        You should also tell us what you expected and where it differs.

        I think you are trying to re-sort the characters in the input strings alphabetically (why?). Maybe print those temp strings too, for debugging purposes.

        I'm no Scrabble player and I'm not sure why you do what you do.

        What's the concept here?

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        > ... it has something to do with the regex order ...

        The regexes actually being used are all anchored with ^ $ assertions (see perlre, perlretut). That means that certain characters have to appear at the beginning or end of certain strings for a match to occur.
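
        To make the effect of the anchors concrete, here is a minimal sketch (the helper name anchor_report is made up for this illustration, it is not from the original code). It tries the same pattern with and without the ^ ... $ wrapper:

```perl
use strict;
use warnings;

# Hypothetical helper: report whether $string matches $pattern
# without anchors and with the ^ ... $ anchors wrapped around it.
sub anchor_report {
    my ($pattern, $string) = @_;
    my $unanchored = $string =~ /$pattern/   ? 1 : 0;
    my $anchored   = $string =~ /^$pattern$/ ? 1 : 0;
    return ($unanchored, $anchored);
}

for my $string ('as', 'sa', 'xyz') {
    my ($unanchored, $anchored) = anchor_report('.?a?', $string);
    print "'$string': unanchored=$unanchored anchored=$anchored\n";
}
```

        Without anchors, .?a? succeeds on every string, because both parts are optional and the pattern can match nothing at all. With anchors, the whole string has to be consumed, so of these three only 'sa' matches.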

        I'm also unsure of just what you expect to get, but consider this code:

        Win8 Strawberry 5.8.9.5 (32) Sun 09/05/2021 22:17:31
        C:\@Work\Perl\monks
        >perl -Mstrict -Mwarnings
        use Data::Dump qw(dd);

        my @words = (  # added a few extra 'words'
            '', 'x', 'a', 's',
            'aa', 'as', 'es', 'is', 'os', 'sh', 'si', 'so'
            );
        dd 'WORDS:', \@words;
        print "----------------------------\n";

        for my $regex (".?a?", "a?.?", "s?.?", ".?s?",) {
            print "Matches for ACTUAL regex pattern : ^$regex\$\n\n";
            foreach my $word (@words) {
                my $sorted = join '', sort split //, $word;
                if ($sorted =~ /^$regex$/) {
                    print "'$sorted' -> '$word' ";
                    }
                }
            print "\n----------------------------\n";
            }
        ^Z
        (
          "WORDS:",
          ["", "x", "a", "s", "aa", "as", "es", "is", "os", "sh", "si", "so"],
        )
        ----------------------------
        Matches for ACTUAL regex pattern : ^.?a?$

        '' -> '' 'x' -> 'x' 'a' -> 'a' 's' -> 's' 'aa' -> 'aa'
        ----------------------------
        Matches for ACTUAL regex pattern : ^a?.?$

        '' -> '' 'x' -> 'x' 'a' -> 'a' 's' -> 's' 'aa' -> 'aa' 'as' -> 'as'
        ----------------------------
        Matches for ACTUAL regex pattern : ^s?.?$

        '' -> '' 'x' -> 'x' 'a' -> 'a' 's' -> 's'
        ----------------------------
        Matches for ACTUAL regex pattern : ^.?s?$

        '' -> '' 'x' -> 'x' 'a' -> 'a' 's' -> 's' 'as' -> 'as' 'es' -> 'es' 'is' -> 'is' 'os' -> 'os' 'hs' -> 'sh' 'is' -> 'si' 'os' -> 'so'
        ----------------------------

        Note that I've added '' (empty string) 'x' 'a' 's' to the @words array. Note also that all the regexes in question match all these new strings.

        The complete regex ^.?a?$ matches any zero- or one-character string. The regex can match some two-character strings. For such a match, an 'a' must be at the absolute end of the string (or before a newline at the end of the string). There's only one two-character string in @words that, after the string is sorted, matches this regex: 'aa'. (Update: Neither this regex nor any of the others discussed can match a string of more than two characters.)

        The regex ^a?.?$ matches any zero- or one-character string and some two-character strings. For a two-character match, an 'a' must be at the absolute beginning of the string. There are two two-character strings in @words that, after being sorted, match this regex: 'aa' 'as'.

        The regex ^s?.?$ doesn't match any two-character strings because no string in @words, after being sorted, begins with an 's'.

        The regex ^.?s?$ matches almost every two-character string in @words because almost every such string, after being sorted, ends in 's'.
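
        The remark about a newline at the end of the string, and the two-character limit, can be checked with a small sketch. The helper name matches_anchored and its $strict_end flag are assumptions made for this illustration; \z is the assertion that matches only at the absolute end of the string:

```perl
use strict;
use warnings;

# Hypothetical helper: does $string match the anchored pattern?
# A true $strict_end uses \z (absolute end of string) instead of $.
sub matches_anchored {
    my ($pattern, $string, $strict_end) = @_;
    return $strict_end
        ? ($string =~ /^$pattern\z/ ? 1 : 0)
        : ($string =~ /^$pattern$/  ? 1 : 0);
}

for my $string ("sa", "sa\n", "saa") {
    (my $shown = $string) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-7s \$: %d  \\z: %d\n", "'$shown'",
        matches_anchored('.?a?', $string, 0),
        matches_anchored('.?a?', $string, 1);
}
```

        "sa" matches either way. "sa\n" matches with $ but not with \z, because $ is also satisfied just before a trailing newline. "saa" matches neither, since .?a? can consume at most two characters.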


        Give a man a fish:  <%-{-{-{-<

        I am still not "getting it" in terms of your input syntax and how it should match against the dictionary.

        I recently wrote a "cheater program" for a less sophisticated game, but perhaps with some similar matching requirements?

        In my game, I have a set of "tiles", letters that I can use to form words. I cannot use any letter other than one of these letters to form words. There are not any 2-letter words in my dictionary. Your dictionary has a lot of weird words, but I guess that is how Scrabble is!

        Your example is a bit hard for me to figure out, since es, os, sh and si are not "normal English words", although as, is and so are. Likewise "aa" is not a normal English word, although "as" is. There are words in the Scrabble dictionary that I've never read, heard or used.

        What I'd like to see is a user session that shows how your scrabble cheat game works. I will attach a user example from my cheat program for another game. Once we understand what the code is supposed to do, then at least I would be interested in helping you create a "blackbelt" level cheat program - just for the fun of it!

        Here is the kind of example that I want from you:

        Example 1:
        leading ";" means this is a list of letters instead of a pattern
        list of letters or pattern: ;lolewf
        list of letters or pattern: ---
        The --- says: show me all 3 letter words that can be formed from
        my letter "universe".
        elf
        ell
        few
        foe
        fol
        low
        owe
        owl
        woe

        # Example 2:
        # Show me all 4 letter words which can be formed from
        # the "list of letters" that have "l" as the 3rd letter
        list of letters or pattern: ;odlswe   # List of Letters
        list of letters or pattern: --l-      # A search pattern
        dole
        owls
        sold
        sole
        weld
        wold

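
        The program behind that session is not shown here, so the following is only a rough sketch of how such a lookup could work, not the actual implementation. The names can_form, pattern_to_regex and find_words, and the tiny word list, are all made up for the illustration. It assumes each tile may be used once and each "-" in the pattern stands for exactly one unknown letter:

```perl
use strict;
use warnings;

# Hypothetical helper: can $word be spelled using each tile in
# $letters at most once?
sub can_form {
    my ($word, $letters) = @_;
    my %available;
    $available{$_}++ for split //, lc $letters;
    for my $char (split //, lc $word) {
        return 0 unless $available{$char};
        $available{$char}--;
    }
    return 1;
}

# Hypothetical helper: turn a pattern such as '--l-' into an anchored
# regex in which each '-' matches exactly one character.
sub pattern_to_regex {
    my ($pattern) = @_;
    my $body = join '',
        map { $_ eq '-' ? '.' : quotemeta($_) } split //, lc $pattern;
    return qr/\A$body\z/;
}

# Words from @dictionary that fit the pattern and the tiles, sorted.
sub find_words {
    my ($letters, $pattern, @dictionary) = @_;
    my $regex = pattern_to_regex($pattern);
    return sort grep { $_ =~ $regex && can_form($_, $letters) } @dictionary;
}

# A tiny stand-in dictionary, just for the demonstration.
my @dictionary = qw(dole owls sold sole weld wold doles lewd);
print "$_\n" for find_words('odlswe', '--l-', @dictionary);
```

        Counting the tiles in a hash (rather than sorting the letters and matching a regex against them) keeps the "which letters do I have" question separate from the "which positions are fixed" question, which may be easier to reason about.
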
        I really don't know anything about Scrabble. But I think you will have to develop a syntax to input words that are on the board already. I think if some guy plays "pagan", you can play "ize" to turn this into "paganize" and score big points. In addition, each letter has some kind of "value": playing a "z" is worth more than playing an "a". I suspect that you are going to want a word sort order that, instead of being alphabetical, is based upon the point value of the word?
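
        A sort by point value could look roughly like the sketch below. The tile values are the standard English-language Scrabble ones; blanks and board multipliers are ignored, and the names word_score and by_score are made up for the illustration:

```perl
use strict;
use warnings;

# Standard English-language Scrabble tile values (blank tiles and
# premium squares are deliberately ignored in this sketch).
my %letter_value = (
    (map { $_ => 1  } qw(a e i l n o r s t u)),
    (map { $_ => 2  } qw(d g)),
    (map { $_ => 3  } qw(b c m p)),
    (map { $_ => 4  } qw(f h v w y)),
    (map { $_ => 5  } qw(k)),
    (map { $_ => 8  } qw(j x)),
    (map { $_ => 10 } qw(q z)),
);

# Hypothetical helper: sum of the tile values of the letters in $word.
sub word_score {
    my ($word) = @_;
    my $score = 0;
    $score += ($letter_value{$_} || 0) for split //, lc $word;
    return $score;
}

# Hypothetical helper: highest score first, ties broken alphabetically.
sub by_score {
    my @words = @_;
    return sort { word_score($b) <=> word_score($a) or $a cmp $b } @words;
}

printf "%-8s %2d\n", $_, word_score($_) for by_score(qw(pagan paganize as so));
```

        With these values "pagan" scores 8 and "paganize" scores 20, so the longer play sorts first.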

        I think your tr statement could be better:
        tr is a very simple-minded operator and it does not use regex character classes, so the square brackets in it are treated as literal characters. I am not sure why your code seems to work.

        use strict;
        use warnings;
        my $string = '\abCD';
        $string =~ tr/\[A-Z]/[a-z]/;  #should be tr/A-Z/a-z/?
        print "$string\n";
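
        A likely explanation for why it seems to work: the brackets are just two more characters in each list, and they happen to line up, so "[" is translated to "[" and "]" to "]". Here is a small sketch comparing the variants (the sub names are made up for the illustration):

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two tr variants for comparison.
sub lc_with_brackets { my ($s) = @_; $s =~ tr/\[A-Z]/[a-z]/; return $s }
sub lc_plain         { my ($s) = @_; $s =~ tr/A-Z/a-z/;      return $s }

# With different characters in the second list, the brackets visibly
# behave as ordinary characters: '[' becomes '(' and ']' becomes ')'.
sub brackets_to_parens { my ($s) = @_; $s =~ tr/[]/()/; return $s }

print lc_with_brackets('\abCD [X]'), "\n";
print lc_plain('\abCD [X]'), "\n";
print brackets_to_parens('[X]'), "\n";
```

        The first two calls give the same result, which is why the extra brackets go unnoticed. The plain tr/A-Z/a-z/ form says what is meant, and lc() would do the same job for plain ASCII text.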

        I'm no longer sure. I've had a beer by now so I'll look at it tomorrow. Thank you for your time nonetheless. I'll reread the perldocs on regexes before I bother you again. Regards.
