more efficient regular expression please

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: more efficient regular expression please by hardburn (Abbot) on Jul 03, 2003 at 14:32 UTC
Probably slower than what you're using, but easier to understand (I hope): `for (my $str = 'vacation.msg'; $str; $str = substr($str, 1, length($str))) { $replace_on =~ s/$str//; }` Ahh, the rarely seen (around here, anyway) three-element form of `for`. I've tested this loop and it produces the strings correctly. I don't know if this exactly matches your needs, but you should be able to modify it easily enough. Update: Forgot that you need to escape the '.' before it gets put in a regex, which complicates things a bit. ---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident. -- Schemer Note: All code is untested, unless otherwise stated
Re: Re: more efficient regular expression please by bart (Canon) on Jul 03, 2003 at 19:57 UTC
"Forgot that you need to escape the '.' before it gets put in a regex, which complicates things a bit." Not by that much. All you need to add is '\Q': `$replace_on =~ s/\Q$str//;` I won't comment on the actual value of your idea, though. :)
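For readers who want to try the loop-plus-`\Q` idea, here is a minimal sketch. The helper name `strip_suffix_piece` and the `$min_len` cut-off (5, so that the shortest piece tried is `n.msg`) are assumptions made for illustration, not something from the posts above. Unlike the original loop, this version stops at the first (longest) piece it finds.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the longest trailing piece of $name found in $text.
# $min_len is an assumed lower bound on the piece length (default 5, i.e. 'n.msg').
sub strip_suffix_piece {
    my ($text, $name, $min_len) = @_;
    $min_len = 5 unless defined $min_len;

    # Try 'vacation.msg', then 'acation.msg', and so on, longest piece first.
    for my $start (0 .. length($name) - $min_len) {
        my $piece = substr($name, $start);

        # \Q...\E quotes regex metacharacters such as '.' in the interpolated piece.
        return $text if $text =~ s/\Q$piece\E//;
    }
    return $text;    # nothing matched, text is unchanged
}

print strip_suffix_piece('jumps ocation.msgver', 'vacation.msg'), "\n";    # prints "jumps over"
```

Stopping at the first hit is a design choice: continuing with ever shorter pieces would also strip unrelated fragments from the rest of the line.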
Re: more efficient regular expression please by l2kashe (Deacon) on Jul 03, 2003 at 15:10 UTC
So I'm not sure what you mean by more efficient: in terms of runtime, or in terms of regex construction? I also don't know what you mean by "any pieces of its name", or rather how paranoid you want to be, but this is what I have.

```perl
#!/usr/bin/perl
use strict;

my $str     = '';
my $tomatch = 'foobar the vacation.msg is here';

# Build every trailing piece of 'vacation': 'n', 'on', 'ion', ... 'vacation'.
my @permute = map { $str = $_ . $str } reverse split //, 'vacation';

# Longest piece first, followed by the literal '.msg'.
my $regex = '(\.?(?:' . join('|', reverse @permute) . ')\.msg)';

print "$regex\n";
print "Match: $1\n" if $tomatch =~ m/$regex/;
```

HTH MMMMM... Chocolaty Perl Goodness.....
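A related sketch, in case it helps: the same alternation can be built directly from the full file name with `quotemeta`, so the dot is escaped automatically. The function name `suffix_regex` and the minimum piece length of 5 (`n.msg`) are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: compile one regex that matches any trailing piece
# of $name that is at least $min characters long (assumed default: 5).
sub suffix_regex {
    my ($name, $min) = @_;
    $min = 5 unless defined $min;

    # Longest piece first; quotemeta escapes '.' and other metacharacters.
    my @pieces = map { quotemeta substr($name, $_) } 0 .. length($name) - $min;
    my $alternation = join '|', @pieces;
    return qr/($alternation)/;
}

my $re = suffix_regex('vacation.msg');
print "Match: $1\n" if 'foobar the vacation.msg is here' =~ $re;    # prints "Match: vacation.msg"
```

Compiling once with `qr//` means the pattern is not rebuilt for every line that gets tested.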
Re: more efficient regular expression please by BrowserUk (Patriarch) on Jul 03, 2003 at 16:04 UTC
Assuming that I haven't screwed up again, which is by no means guaranteed, your regex doesn't quite catch all the cases. Input: `cation.msgThe quick brown fox jumps over the lazy dog` Output: `caThe quick brown fox jumps over the lazy dog` However, that could be corrected. And the sad truth is that your version actually runs a tad faster than my (second) attempt, and also tye's offering, though the difference is marginal, and may fade once you correct the error. The only saving grace is that mine is shorter and prettier, but then tye's is even shorter and prettier than mine, and runs a tad quicker to boot. I'm taking a vacation as of now, from PM at least. It would be nice to take a real one, but that's not an option. I might be back someday. Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
Re: more efficient regular expression please by BrowserUk (Patriarch) on Jul 03, 2003 at 15:01 UTC
This seems to do what you want, if I understand you correctly. `$line =~ s[v?a?c?a?t?i?o?n\.msg][];` Update: Right idea, wrong execution. See 271190 below :( I think this is better. `$line =~ s[(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg][];`
Re^2: more efficient regular expression please (parens) by tye (Sage) on Jul 03, 2003 at 15:11 UTC
That also removes cat.msg, vain.msg, etc. Closer to the original intent (as best as I can tell) would be: `$line =~ s[v?(a?(c?(a?(t?(i?(o?(n?(\.?(m?(s?g))))))))))][];` - tye
Re: Re^2: more efficient regular expression please (parens) by BrowserUk (Patriarch) on Jul 03, 2003 at 15:33 UTC
You're right. It does need brackets. I had another go, similar in vein to yours, though I bracketed the other way. Both seem to work, with the same caveat: if the partial insertion is preceded by one or more characters that match the missing part, they get removed too. Without some delimiter, this will always be the case. Any thoughts about which bracketing causes the least amount of work?
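One way to avoid hand-counting parentheses is to generate the left-nested form from the file name. This is only a sketch: the helper name `nested_suffix_pattern` and the required tail length of 5 (`n.msg`) are assumptions, and it does not answer which bracketing does less work.

```perl
use strict;
use warnings;

# Hypothetical helper: build (?:(?:(?:v)?a)?c)?...n\.msg from the name.
# The last $min characters are required; everything before them is optional,
# but only as an unbroken run ending right before the required tail.
sub nested_suffix_pattern {
    my ($name, $min) = @_;
    my @chars = split //, $name;

    # Required tail, e.g. 'n.msg', with metacharacters escaped.
    my $fixed = quotemeta(join('', splice(@chars, -$min)));

    # Wrap the pattern so far in one more optional group per leading character.
    my $opt = '';
    $opt = "(?:$opt" . quotemeta($_) . ")?" for @chars;

    return qr/$opt$fixed/;
}

my $re = nested_suffix_pattern('vacation.msg', 5);
for my $candidate (qw(vacation.msg cation.msg n.msg cat.msg vain.msg)) {
    my $verdict = $candidate =~ /\A$re\z/ ? 'whole match' : 'no whole match';
    printf "%-13s %s\n", $candidate, $verdict;
}
```

With this nesting, `cat.msg` and `vain.msg` are not matched as whole strings, because each optional character is only allowed when everything after it is present too.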
Re: Re: more efficient regular expression please by Not_a_Number (Prior) on Jul 03, 2003 at 15:09 UTC
Oops! Last two lines of output:

```
The quick brown fox jumps oer the lazy dog
The quick brown fox jumps over the lazy dog
```

Where did that 'v' go?
Re: Re: Re: more efficient regular expression please by BrowserUk (Patriarch) on Jul 03, 2003 at 15:23 UTC
You're right. It was flawed. Another attempt.

```perl
#! perl -slw
use strict;

# Set up some test data: insert each trailing piece of the name
# at a random position in the sample text.
my $text = 'The quick brown fox jumps over the lazy dog';

my @tests = map {
    my $t = $text;
    substr($t, rand(length $text), 0) = $_;
    $t;
} map { substr 'vacation.msg', $_ } 0 .. 6;

print "Before\n";
print for @tests;

# Strip the (possibly truncated) file name from each test string.
s[(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg][] for @tests;

print "\nAfter\n";
print for @tests;

__END__
P:\>junk
Before
The quick brown fox jumps ovvacation.msger the lazy dog
The quick brown fox jumps oacation.msgver the lazy dog
The quick brocation.msgwn fox jumps over the lazy dog
The quick brown fox jumpsation.msg over the lazy dog
The quick brown fox jumps over the lation.msgzy dog
Tion.msghe quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy on.msgdog

After
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lzy dog
The quick brown fox jumps over the lazy dog
The quick brown fox jumps over the lazy dog
```

This time, the apparent error is explainable.
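The `lzy` line can be reproduced in isolation. In this sketch (the helper name `matched_piece` is made up for illustration), the piece inserted into "lazy" was `tion.msg`, but the `a` that happens to precede it makes `ation.msg` a valid longer piece, so the regex takes that too.

```perl
use strict;
use warnings;

# The nested pattern from the post above, with the dot escaped.
my $re = qr/(?:(?:(?:(?:(?:(?:v?a)?c)?a)?t)?i)?o)?n\.msg/;

# Hypothetical helper: report what the pattern would remove from a line.
sub matched_piece {
    my ($line) = @_;
    return $line =~ /($re)/ ? $1 : undef;
}

my $line = 'The quick brown fox jumps over the lation.msgzy dog';
print "Removed piece: ", matched_piece($line), "\n";    # prints "Removed piece: ation.msg"

$line =~ s/$re//;
print "$line\n";    # prints "The quick brown fox jumps over the lzy dog"
```

Without a delimiter around the inserted name, the pattern has no way to tell that this `a` belongs to the surrounding text.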
Re: more efficient regular expression please by CountZero (Bishop) on Jul 04, 2003 at 13:20 UTC
Rather than immediately jumping into churning out code, we should perhaps ask why one should also look at "any pieces of its name". Is it perhaps because this string might be split over two lines? If that is the case, then I can imagine that simpler solutions are available. CountZero "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law