Re: more efficient regular expression please
by hardburn (Abbot) on Jul 03, 2003 at 14:32 UTC
|
Probably slower than what you're using, but easier to understand (I hope):
for (my $str = 'vacation.msg'; $str; $str = substr($str, 1)) {
    $replace_on =~ s/$str//;
}
Ahh, the rarely seen (around here, anyway) three-element form of for. I've tested this loop and it produces the strings correctly. I don't know if this exactly matches your needs, but you should be able to modify it easily enough.
Update: Forgot that you need to escape the '.' before it gets put in a regex, which complicates things a bit.
---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer
Note: All code is untested, unless otherwise stated
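For anyone who has not met the three-element for before, here is a minimal sketch of which strings the loop above visits. The @suffixes array is only there for illustration and is not part of the original snippet.

```perl
use strict;
use warnings;

# Collect every string the three-element for loop visits.
my @suffixes;
for (my $str = 'vacation.msg'; $str; $str = substr($str, 1)) {
    push @suffixes, $str;
}

# Prints 'vacation.msg', 'acation.msg', ... down to 'g', one per line.
print "$_\n" for @suffixes;
```

Note that the loop runs all the way down to 'msg', 'sg' and 'g', so as written the substitution would also delete a lone 'g' from the target string. Stopping earlier may be closer to what is wanted.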
|
|
Forgot that you need to escape the '.' before it gets put in a regex, which complicates things a bit.
Not by much. All you need to add is '\Q':
$replace_on =~ s/\Q$str//;
I won't comment on the actual value of your idea, though. :)
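A small sketch of what '\Q' buys you here. The $decoy string is a made-up example, not something from the original question.

```perl
use strict;
use warnings;

my $name  = 'vacation.msg';
my $decoy = 'vacationXmsg';    # hypothetical string with no literal dot

# Unescaped, the '.' is a wildcard and matches the 'X'.
my $plain_hit = $decoy =~ /$name/ ? 1 : 0;

# With \Q the '.' is taken literally, so the decoy no longer matches.
my $quoted_hit = $decoy =~ /\Q$name\E/ ? 1 : 0;

print "unescaped: $plain_hit, quoted: $quoted_hit\n";
```

This should print "unescaped: 1, quoted: 0".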
Re: more efficient regular expression please
by l2kashe (Deacon) on Jul 03, 2003 at 15:10 UTC
|
I'm not sure what you mean by more efficient: in terms of runtime, or in terms of regex construction? I also don't know what you mean by "any pieces of its name", or rather how paranoid you want to be, but this is what I have.
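#!/usr/bin/perl
use strict;
use warnings;
my $str = '';
my $tomatch = 'foobar the vacation.msg is here';
# Build every trailing piece of 'vacation': 'n', 'on', 'ion', ... 'vacation'.
my @permute = map {
    $str = $_ . $str;
} ( reverse( split(//, 'vacation') ) );
# Longest piece first, so the fullest match is tried first.
my $regex = '(\.?(?:' . join('|', reverse(@permute)) . ')\.msg)';
print "$regex\n";
# Match the regex against the text (not the other way round).
print "Match: $1\n" if ( $tomatch =~ m/$regex/ );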
HTH
MMMMM... Chocolaty Perl Goodness.....
Re: more efficient regular expression please
by BrowserUk (Patriarch) on Jul 03, 2003 at 16:04 UTC
|
Assuming that I haven't screwed up again, which is by no means guaranteed, your regex doesn't quite catch all the cases.
Input cation.msgThe quick brown fox jumps over the lazy dog
Output caThe quick brown fox jumps over the lazy dog
However, that could be corrected. And the sad truth is that your version actually runs a tad faster than my (second) attempt, and also tye's offering, though the difference is marginal, and may fade once you correct the error.
The only saving grace is that mine is shorter and prettier, but then tye's is even shorter and prettier than mine, and runs a tad quicker to boot.
I'm taking a vacation as of now, from PM at least. It would be nice to take a real one, but that's not an option. I might be back someday.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
Re: more efficient regular expression please
by BrowserUk (Patriarch) on Jul 03, 2003 at 15:01 UTC
|
This seems to do what you want, if I understand you correctly.
$line =~ s[v?a?c?a?t?i?o?n\.msg][];
Update: Right idea, wrong execution. See 271190 below :( I think this is better.
$line =~ s[(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg][];
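To see why the brackets matter, here is a rough sketch comparing the flat and the nested forms from the post above (with the dot escaped in both). The input 'vtn.msg' is a made-up string chosen to show the difference.

```perl
use strict;
use warnings;

my $flat   = qr/v?a?c?a?t?i?o?n\.msg/;
my $nested = qr/(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg/;

# Hypothetical input: 'v' and 't' are letters of the name, but 'vtn.msg'
# is not a contiguous tail of 'vacation.msg'.
my $input = 'vtn.msg';

(my $flat_out   = $input) =~ s/$flat//;
(my $nested_out = $input) =~ s/$nested//;

print "flat:   '$flat_out'\n";     # everything is removed
print "nested: '$nested_out'\n";   # only 'n.msg' is removed, 'vt' survives
```

In the flat form every letter is optional on its own, so letters can be skipped in the middle. The nested form only lets a letter take part if everything after it in the name is present too.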
|
|
$line =~ s[v?(a?(c?(a?(t?(i?(o?(n?(\.?(m?(s?g))))))))))][];
- tye
|
|
You're right. It does need brackets. I had another go in a similar vein to yours, though I bracketed the other way. Both seem to work, with the same caveat: if the partial insertion is preceded by one or more characters that match the missing part, they get removed too. Without some delimiter, this will always be the case.
Any thoughts on which bracketing causes the least amount of work?
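One way to answer the "least amount of work" question is to measure it with the core Benchmark module. The following is only a sketch: the left-nested pattern is the one from earlier in the thread (dot escaped), while the alternation pattern and the sample text are assumptions added here for comparison. The timings will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $text = 'The quick brocation.msgwn fox jumps over the lazy dog';

# Left-nested grouping, as in the substitution earlier in the thread.
my $nested = qr/(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg/;
# Assumed alternative for comparison: explicit alternation, longest first.
my $alternation = qr/(?:vacatio|acatio|catio|atio|tio|io|o)?n\.msg/;

# Check that both candidates produce the same result before timing them.
(my $nested_out      = $text) =~ s/$nested//;
(my $alternation_out = $text) =~ s/$alternation//;
die "candidates disagree\n" unless $nested_out eq $alternation_out;

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    nested      => sub { (my $t = $text) =~ s/$nested//; },
    alternation => sub { (my $t = $text) =~ s/$alternation//; },
});
```

Other bracketings can be added to the hash passed to cmpthese in the same way. Checking that the candidates agree first avoids timing a regex that is fast only because it is wrong.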
|
|
oops!
last two lines of output
The quick brown fox jumps oer the lazy dog
The quick brown fox jumps over the lazy dog
Where did that 'v' go?
|
|
#! perl -slw
use strict;
# Set up some test data.
my $text = 'The quick brown fox jumps over the lazy dog';
my @tests = map{
my $t = $text;
substr($t, rand( length $text ), 0 ) = $_;
$t;
} map{
substr 'vacation.msg', $_
} 0 ..6;
print "Before\n";
print for @tests;
# Strip any trailing piece of 'vacation.msg' from each test string.
s[(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg][] for @tests;
print "\nAfter\n";
print for @tests;
__END__
P:\>junk
Before
The quick brown fox jumps ovvacation.msger the lazy dog
The quick brown fox jumps oacation.msgver the lazy dog
The quick brocation.msgwn fox jumps over the lazy dog
The quick brown fox jumpsation.msg over the lazy dog
The quick brown fox jumps over the lation.msgzy dog
Tion.msghe quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy on.msgdog
After
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lzy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
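This time, the apparent error is explainable: in the fifth test, 'tion.msg' landed directly after the 'a' of 'lazy', so the regex took that 'a' as part of the name and removed it too.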
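The lost 'a' in the fifth test line can be reproduced in isolation. This is a minimal sketch with two made-up starting strings. It suggests that, without a delimiter, the regex has no way to tell the two cases apart.

```perl
use strict;
use warnings;

my $re = qr/(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg/;

# Two different (hypothetical) histories...
my $from_lazy = 'the lazy dog';
substr($from_lazy, 6, 0, 'tion.msg');     # inserted after 'la'
my $from_lzy = 'the lzy dog';
substr($from_lzy, 5, 0, 'ation.msg');     # inserted after 'l'

# ...produce exactly the same damaged string.
print "same input\n" if $from_lazy eq $from_lzy;

(my $cleaned = $from_lazy) =~ s/$re//;
print "$cleaned\n";    # the 'a' of 'lazy' is taken as part of the name
```

Both histories give 'the lation.msgzy dog', and the substitution turns it into 'the lzy dog' either way.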
Re: more efficient regular expression please
by CountZero (Bishop) on Jul 04, 2003 at 13:20 UTC
|
Rather than immediately churning out code, perhaps we should ask why one needs to look for "any pieces of its name" at all. Is it because the string might be split over two lines? If so, I can imagine that simpler solutions are available.
CountZero
"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
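If the name really is just being broken by a line wrap, one possible simpler approach is to allow an optional newline between any two characters of the full name. This is only a sketch under that assumption. The sample text is invented, and the original poster's actual data may look quite different.

```perl
use strict;
use warnings;

my $name = 'vacation.msg';

# Escape each character, then allow an optional newline between neighbours.
my $pattern = join '\n?', map { quotemeta($_) } split //, $name;

# Hypothetical input where the name is wrapped across two lines.
my $text = "please delete vaca\ntion.msg today";

(my $cleaned = $text) =~ s/$pattern//;
print "$cleaned\n";
```

This removes the whole wrapped name and nothing else, so the "piece of the name eats a neighbouring letter" problem does not arise.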