
carolw:

Not quite. You're replacing an $m-character substring with a single character, so (with $m = 2 on abcdef) you could wind up with something like .cdef, a.def, ab.ef, abc.f, abcd. when what you really want is ..cdef, a..def, ab..ef, abc..f, abcd.. So you want something a bit more like:

substr($regex, $i, $m) = '.' x $m;

But that's assuming you want your wildcards to be adjacent. If you want the wildcards to be anywhere, you've got a bit more work to do.
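
A minimal sketch of the adjacent case, assuming a pattern of plain literal characters (no regex metacharacters). The helper name `adjacent_wildcards` and the sample values 'abcdef' and 2 are illustrative, not from the original code:

```perl
use strict;
use warnings;

# Hypothetical helper: build one regex string per start position, replacing
# $m adjacent characters with $m dots so the length never changes.
sub adjacent_wildcards {
    my ($pattern, $m) = @_;
    my @regexes;
    for my $i (0 .. length($pattern) - $m) {
        my $regex = $pattern;               # fresh copy for each position
        substr($regex, $i, $m) = '.' x $m;  # length-preserving replacement
        push @regexes, $regex;
    }
    return @regexes;
}

print join(', ', adjacent_wildcards('abcdef', 2)), "\n";
# prints: ..cdef, a..def, ab..ef, abc..f, abcd..
```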

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Re^4: search of a string in another string with 1 wildcard
by carolw (Sexton) on Jul 19, 2014 at 19:08 UTC

    In effect, the wildcards could be anywhere.

    Why is there an x before $m in your previous code?

      carolw:

      The substr($regex, $i, $m) refers to the $m characters of the string $regex starting at position $i. When you treat substr() as an lvalue (i.e., put it on the left side of the equals sign), you're telling Perl to replace that substring with what's on the other side.

      If you replace $m characters with a single '.', you lose $m-1 characters (assuming $m is larger than 1). The x is the "repeat" operator, so "." x 5 creates a string of five periods. So we're using '.' x $m to create a string of periods exactly as long as the substring being replaced, so the length of the string doesn't change.
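
A small illustration of the difference, using a sample string and $m = 2 chosen only for the demonstration:

```perl
use strict;
use warnings;

my $m = 2;

my $short = 'abcdef';
substr($short, 1, $m) = '.';        # two characters replaced by one dot
print "$short\n";                   # a.def (length 5)

my $same = 'abcdef';
substr($same, 1, $m) = '.' x $m;    # two characters replaced by two dots
print "$same\n";                    # a..def (length 6)

print '.' x 5, "\n";                # .....
```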

      ...roboticus

Re^4: search of a string in another string with 1 wildcard
by carolw (Sexton) on Jul 27, 2014 at 14:53 UTC

    So suppose each string to be searched is about 100 characters or more, which is not a lot, but I have a very large number of them (millions, maybe more). I want to search all of them for the same substring, with more than one non-adjacent wildcard permitted. Would it be efficient in terms of time and memory to use your code (adapted, of course, for more than one non-adjacent wildcard), or should I use String::Approx?

      carolw:

      I don't really know about the efficiency; I was just trying to show a way to build the strings so the lengths don't unexpectedly change. You'll have to benchmark the different approaches if you want to find the most efficient one.
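
One way such a comparison might be set up with the core Benchmark module. This is only a sketch: the random test data, the pattern 'ACGTAC', and the two candidate approaches (an alternation regex with one wildcard per position, and a sliding-window mismatch count via string XOR) are assumptions made for illustration. String::Approx could be added as a third entry in the same way.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 1000 random 100-character strings over ACGT.
my @alphabet = qw(A C G T);
my @strings  = map { join '', map { $alphabet[rand @alphabet] } 1 .. 100 } 1 .. 1000;
my $pattern  = 'ACGTAC';

# Approach 1: one alternation regex with a single wildcard at each position.
my $alt = join '|', map {
    my $r = $pattern;
    substr($r, $_, 1) = '.';
    $r;
} 0 .. length($pattern) - 1;
my $re = qr/$alt/;

sub count_regex {
    return scalar grep { $_ =~ $re } @strings;
}

# Approach 2: slide a window over each string and count mismatches with
# string XOR; equal characters XOR to "\0", so tr counts the non-NUL bytes.
sub count_xor {
    my $len   = length $pattern;
    my $count = 0;
    STRING: for my $s (@strings) {
        for my $i (0 .. length($s) - $len) {
            my $x    = substr($s, $i, $len) ^ $pattern;
            my $diff = $x =~ tr/\001-\377//;
            if ($diff <= 1) { $count++; next STRING }
        }
    }
    return $count;
}

# Run each approach for about one CPU second and print a comparison table.
cmpthese(-1, { regex => \&count_regex, xor => \&count_xor });
```

Both subs count the strings that contain the pattern with at most one mismatch, so their results should agree; only the timings differ.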

      ...roboticus

Re^4: search of a string in another string with 1 wildcard
by carolw (Sexton) on Jul 28, 2014 at 18:59 UTC

    To extend this to any number of non-adjacent wildcards, is it a good idea to put the 2 lines in a loop, or would it be better to do it another way?

    for my $j (1 .. $wildcards_nb) {   # where $wildcards_nb is the user's free parameter
        substr($regex, $i, 1) = '.';
        push @regexes, $regex;
    }

      carolw:

      It depends. I've just re-read this thread, and I can't quite tell exactly where you're going with this. If you'd explain what you're doing in a little more detail, it would be easier to suggest an approach.

      ...roboticus

Re^4: search of a string in another string with 1 wildcard
by carolw (Sexton) on Jul 29, 2014 at 09:05 UTC

    Well, I have a large number of strings, each 100 characters long or more, and I would like to search all of them for a substring while allowing m wildcards, i.e. mismatches (m >= 1, a free parameter set by the user).

    I started the thread with 1 wildcard but then realized that the number of wildcards should be any number >= 1.
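
A minimal sketch of one way to generate patterns with m non-adjacent wildcards, assuming exactly $m positions are wildcarded and the pattern holds only literal characters. The helper name `wildcard_patterns` and the sample inputs are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: return every copy of $pattern in which exactly $m
# positions (not necessarily adjacent) are replaced by '.'.
sub wildcard_patterns {
    my ($pattern, $m, $start) = @_;
    $start //= 0;
    return ($pattern) if $m == 0;
    my @out;
    for my $i ($start .. length($pattern) - $m) {
        my $copy = $pattern;
        substr($copy, $i, 1) = '.';
        # Recurse on positions to the right of $i so that each combination
        # of positions is generated exactly once.
        push @out, wildcard_patterns($copy, $m - 1, $i + 1);
    }
    return @out;
}

my @regexes = wildcard_patterns('abcd', 2);
print join(', ', @regexes), "\n";
# prints: ..cd, .b.d, .bc., a..d, a.c., ab..

# Join the patterns into one alternation and search a string with it.
my $alt = join '|', @regexes;
print "match\n" if 'xxaXcYxx' =~ /$alt/;   # a.c. matches aXcY
```

The number of patterns is the number of ways to choose m positions out of the pattern length, so it grows quickly. For long patterns or larger m, counting mismatches directly or using String::Approx may be worth benchmarking against this.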