Re: search of a string in another string with 1 wildcard
by hippo (Archbishop) on Jul 09, 2014 at 14:05 UTC
|
What have you tried? If not String::Approx then that might be worth a look. You could specify there to be a maximum of 1 substitution with no insertions or deletions which I guess would fulfil your requirements.
The question is also sufficiently general that it says nothing about the use case. It could well be that there is a more appropriate method which relies on some domain knowledge. Without the context we cannot know about this.
| [reply] |
|
|
I get a compilation error:
Can't locate String/Approx.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.14 /usr/share/perl/5.14 /usr/local/lib/site_perl .)
Should I install anything?
| [reply] |
Re: search of a string in another string with 1 wildcard
by ww (Archbishop) on Jul 09, 2014 at 14:39 UTC
|
So, don't search "abcdfg" [ :-) what happened to 'e' ? ]
But -- more seriously -- what do you mean by the statement that '"abcdfg", "fbcdfg", "aMcdfg", "abZdfg" etc are accepted' when only the first combination actually exists in your all-lower-case-string and the last two require upper case letters?
From the info in your question, there are several ways to solve your problem -- some quite easy (see the following list) -- but it's easier to avoid off-track suggestions if the question is unambiguous AND we've seen your code (for a hint at what you're actually trying to accomplish).
- character classes with quantifiers
- look behinds
- grouping
And various combinations of the above.
See On asking for help -- the "short version" at the top of that node will illuminate your next question.
Updated: Multiple edits of typos and formatting. Only the second para is actually new. Mea culpa. Brain seized in the heat.
| [reply] |
|
|
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

# Input: line 1 is the text, line 2 the space-separated patterns,
# line 3 the maximum number of mismatches allowed.
chomp(my $text = <>);
my @patterns  = split ' ', <>;
my $threshold = 0 + <>;

my @ctext = split //, $text;
my @results;
for my $pattern (@patterns) {
    my @cpat = split //, $pattern;
    POSITION:
    for my $pos (0 .. @ctext - @cpat) {
        my $mismatches = 0;
        for my $i (0 .. @cpat - 1) {
            if ($cpat[$i] ne $ctext[$pos + $i]) {
                # Give up on this position once the threshold is exceeded.
                next POSITION if ++$mismatches > $threshold;
            }
        }
        push @results, $pos;
    }
}
say join ' ', sort { $a <=> $b } @results;
| [reply] [d/l] |
Re: search of a string in another string with 1 wildcard
by Anonymous Monk on Jul 09, 2014 at 14:22 UTC
|
my $pattern = 'abcdef';
my @regexes;
for (my ($i, $len) = (0, length $pattern); $i < $len; ++$i) {
my $regex = $pattern;
substr($regex, $i, 1) = '.';
push @regexes, $regex;
}
say join "\n", @regexes;
Output:
.bcdef
a.cdef
ab.def
abc.ef
abcd.f
abcde.
Something like that?
| [reply] [d/l] [select] |
|
|
(should be "push @regexes, qr/$regex/", of course)
| [reply] |
|
|
index(myString,@regexes);
?
Doesn't seem to work.
| [reply] [d/l] |
|
|
You need to loop through regexes. Or, maybe something like:
...
push @regexes, $regex;
}
my $r = join '|', @regexes;
$r = qr/($r)/; # compile the regex
say "Regex is: $r"; # debug
my $string_to_search = "djflsbcdefgkgjdslkgjabfoéabcdefg";
if ($string_to_search =~ $r) {
say "Found it ($1) at position ", $-[0];
}
# there is a useful magic variable @- (LAST_MATCH_START)
# check perldoc for it
Output:
Regex is: (?^u:(.bcdef|a.cdef|ab.def|abc.ef|abcd.f|abcde.))
Found it (sbcdef) at position 4
| [reply] [d/l] [select] |
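The other option mentioned above, looping through the regexes, might look roughly like this. It is only a sketch: the helper names `one_wildcard_regexes` and `find_all` are made up for illustration, and `quotemeta` is added on the assumption that the pattern could contain regex metacharacters. Unlike the alternation version, it reports every matching offset, including overlapping ones.

```perl
use strict;
use warnings;

# Hypothetical helper: one compiled regex per wildcard position.
sub one_wildcard_regexes {
    my ($pattern) = @_;
    my @regexes;
    for my $i (0 .. length($pattern) - 1) {
        my $before = quotemeta substr($pattern, 0, $i);
        my $after  = quotemeta substr($pattern, $i + 1);
        push @regexes, qr/${before}.${after}/s;
    }
    return @regexes;
}

# Hypothetical helper: every start offset at which any of the regexes matches.
sub find_all {
    my ($string, @regexes) = @_;
    my %seen;
    for my $re (@regexes) {
        # The lookahead keeps each match zero-width, so overlapping hits are found too.
        while ($string =~ /(?=$re)/g) {
            $seen{ $-[0] } = 1;
        }
    }
    return sort { $a <=> $b } keys %seen;
}

my @offsets = find_all('djflsbcdefgkgjdslkgjabfoXabcdefg', one_wildcard_regexes('abcdef'));
print "Matches at: @offsets\n";
```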
substr($regex, $i, 1) = '.';
to
substr($regex, $i, m) = '.';
where m will be the user's free parameter?
| [reply] [d/l] [select] |
|
|
carolw:
Not quite. You're changing an $m character substring to a single char, so you could wind up with something like: .cdef, a.def, ab.ef, abc.f, abcd. where you're really wanting ..cdef, a..def, ab..ef, abc..f, abcd..; so you really want something a bit more like:
substr($regex, $i, $m) = '.' x $m;
But that's assuming you want your wildcards to be adjacent. If you want the wildcards to be anywhere, you've got a bit more work to do.
...roboticus
When your only tool is a hammer, all problems look like your thumb.
| [reply] [d/l] [select] |
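For the non-adjacent case, one possible approach is to generate every combination of $m wildcard positions. The following is a minimal sketch under that assumption; `wildcard_patterns` is a made-up name, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: all patterns with exactly $m characters replaced by '.',
# at any positions (not necessarily adjacent).
sub wildcard_patterns {
    my ($pattern, $m) = @_;
    my @chars = map { quotemeta } split //, $pattern;
    my @out;
    my $choose;
    $choose = sub {
        my ($start, @picked) = @_;
        if (@picked == $m) {
            # Enough positions chosen: emit the pattern with those positions wildcarded.
            my @copy = @chars;
            $copy[$_] = '.' for @picked;
            push @out, join '', @copy;
            return;
        }
        # Otherwise pick the next position from those not yet considered.
        $choose->($_ + 1, @picked, $_) for $start .. $#chars;
    };
    $choose->(0);
    return @out;
}

print "$_\n" for wildcard_patterns('abcd', 2);
```

The number of patterns grows as "n choose m", so for longer patterns or larger $m, counting mismatches position by position (as in the script earlier in this thread) is likely to scale better.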
I would like to slightly change the question:
How should the code (specifically the '.') be modified if the pattern to be matched is a fixed string and one character may not be any character, but only a character from a given set at that position? For example, with
$pattern = 'abcdef';
the c at the 3rd position could only be replaced by a character from the set {r,d,n,f,q,m}.
| [reply] |
|
|
c:\@Work\Perl>perl -wMstrict -le
"my $string = 'abcdef';
;;
my $pattern = qr{ [rdnfqm] }xms;
;;
print qq{matched '$1' at offset $-[1]}
if $string =~ m{ ($pattern) }xms;
"
matched 'd' at offset 3
The construct [rdnfqm] defines a "character class". Please see perlre, perlrequick, and perlretut.
| [reply] [d/l] [select] |
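Applied to the modified question, a character class could take the place of the '.' at the chosen position. This is only a sketch: `class_regex` is a made-up helper name, and it assumes the original character (here 'c') should still be accepted alongside the substitutes.

```perl
use strict;
use warnings;

# Hypothetical helper: regex for $pattern where the character at 0-based
# index $pos may also be any one of @allowed.
sub class_regex {
    my ($pattern, $pos, @allowed) = @_;
    # Assumption: the original character stays valid, so it joins the class.
    my $class = '[' . join('', map { quotemeta } substr($pattern, $pos, 1), @allowed) . ']';
    my $regex = quotemeta(substr($pattern, 0, $pos))
              . $class
              . quotemeta(substr($pattern, $pos + 1));
    return qr/$regex/;
}

my $re = class_regex('abcdef', 2, qw(r d n f q m));   # index 2 is the 3rd character
for my $candidate (qw(abcdef abrdef abqdef abzdef)) {
    printf "%s: %s\n", $candidate, $candidate =~ $re ? 'match' : 'no match';
}
```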
Re: search of a string in another string with 1 wildcard
by InfiniteSilence (Curate) on Jul 09, 2014 at 13:49 UTC
|
Sounds like you need to apply some form of pattern recognition. A crude method (presuming that what you meant to say was that at minimum some number of your desired characters must exist in the target) could be:
- Start with any six characters of the target
- Check whether ANY of your target letters are in those six; if not, move to the next six characters
- If so, determine whether the match meets your minimum requirements for a successful match; if it does, store and report the find
Fortunately (or unfortunately depending upon your perspective), pattern recognition in strings is quite an involved subject. You might consider shopping around for some algorithms to either implement or search CPAN for.
Celebrate Intellectual Diversity
| [reply] |
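One way the crude method above might be sketched is as a sliding window that counts mismatches at each offset. This is an illustration, not the poster's code: `fuzzy_positions` is a made-up name, the window advances one character at a time rather than six, and it assumes plain byte (ASCII) strings, since it relies on string XOR.

```perl
use strict;
use warnings;

# Hypothetical helper: offsets where $pattern occurs in $text with at most
# $max_mismatch differing characters (substitutions only).
sub fuzzy_positions {
    my ($text, $pattern, $max_mismatch) = @_;
    my $len = length $pattern;
    my @hits;
    for my $pos (0 .. length($text) - $len) {
        my $window = substr $text, $pos, $len;
        # XOR of two equal-length strings yields "\0" wherever they agree,
        # so counting the non-NUL characters gives the number of mismatches.
        my $mismatches = ($window ^ $pattern) =~ tr/\0//c;
        push @hits, $pos if $mismatches <= $max_mismatch;
    }
    return @hits;
}

my @found = fuzzy_positions('djflsbcdefgkgjdslkgjabfoXabcdefg', 'abcdef', 1);
print "Found at: @found\n";
```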