irom has asked for the wisdom of the Perl Monks concerning the following question:
Re: Hash/Array of Regular Expressions?
by bikeNomad (Priest) on Jun 23, 2001 at 20:52 UTC
by deprecated (Priest) on Jun 24, 2001 at 18:00 UTC
I'm curious why you chose to use map here. I've used arrays of qr!! (Parse Loops with flat text files. (code)), and I also think I have a good sense of when map is appropriate (Sort a long list of hosts by domain (code)). But in some cases, using map seems to be either obfuscation or bloat for the sake of shorter code. Observe:

Isn't there additional overhead from calling map? Some people are frenzied map lovers. And in some cases, it is the most appropriate way to do something. I just don't see why you used it here. Enlighten me?

brother dep.
by bikeNomad (Priest) on Jun 24, 2001 at 20:19 UTC
I used map because:
by risacher (Beadle) on Jun 24, 2001 at 20:41 UTC
Since you've called me out by name, I guess I have to respond. And I wouldn't call myself a "frenzied map lover". Like any function, map has its place.

When is the proper time to use map? I would argue that one should use map whenever you want to apply an expression to all the members of an array and actually intend to use the array of results. Map is contraindicated when you're not going to use the result; use foreach in that case, because map adds extra overhead compared to foreach if you aren't going to use the result. (That's the only additional overhead that I know of.) In this case, bikeNomad is using the result, and the code is therefore concise and correct.

Dep, you haven't shown any reason why map is a bad idea in this case. I think that clarity here is achieved by separating out the array of regexps, so that they are visually distinct and clear. Then, the code that maps that with qr() is also visually separate. I think this is a good thing.

Since you implied you wanted it, here's my stylistic criticism of your alternatives:

    @array = ( qr{^abcd}, qr{cd[ef]g}, qr{cat$} );

Putting a qr{} around each search term is terrible, IMHO. If you had a list with many search terms, it would result in much more typing. Even with a few terms, it means that each search expression that the author is trying to express is wrapped in a little bit of ugliness. (I do appreciate your use of qw{} elsewhere, to reduce quotes.)

    foreach ( qw{ ^abcd cd[ef]g cat$ } ) { push @array, qr{$_} }

This isn't bad, but recommending it is the same as saying that map() shouldn't exist, since it's exactly the same, except with more typing.

    push @array, qr{$_} for qw{ ^abcd cd[ef]g cat$ };

This is worst of all, I think, because it relies on the weird semantic order of things in perl that few other languages implement (like putting the loop conditions after the loop body). Don't get me wrong, I think that sort of thing is cool, and is great for some circumstances. But really, the point of the backwards syntax is to make perl read more like English. I'd rather my perl code read like C or TCL or lisp than English. Those types of constructs are exactly the sorts of things that make perl hard to read for novice perl programmers. The fact that I can iterate the push after the fact, as you suggest here, is non-obvious to someone coming from another language. Even someone familiar with perl might wonder whether the precedence rules will do what you want. As it turns out, your code is correct, of course. But someone could easily read it as meaning something like:

    push @array, { qr{$_} } for qw{ ^abcd cd[ef]g cat$ };

which, of course, is wrong.

My own stylistic fetishes aside, you never said what you thought was wrong with using map. What is it that you object to?
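For readers following along without the original code blocks, here is a minimal sketch of the forms being argued over, side by side. The map line is an assumption about what bikeNomad's reply looked like (his code is not preserved here); the patterns are the ones quoted above, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Pattern sources kept as plain strings, separate from the compile step.
my @sources = qw{ ^abcd cd[ef]g cat$ };

# 1. map: compile every source and keep the resulting list (assumed original form).
my @with_map = map { qr{$_} } @sources;

# 2. Explicit foreach with push.
my @with_foreach;
foreach (@sources) { push @with_foreach, qr{$_} }

# 3. Statement-modifier form: the loop comes after the statement it repeats.
my @with_modifier;
push @with_modifier, qr{$_} for @sources;

# Try each compiled pattern against a few sample strings.
for my $text (qw{ abcdef xcdeg tomcat dog }) {
    my @hits = grep { $text =~ $with_map[$_] } 0 .. $#with_map;
    print "$text matches: @sources[@hits]\n";
}
```

All three arrays end up holding equivalent compiled regexes, so the choice between them is about readability and, as discussed below, a small difference in overhead.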
by Aighearach (Initiate) on Jun 25, 2001 at 01:37 UTC
Map has lower overhead than many other list-changing algorithms... this is mostly because it uses better, faster, fewer temporary variables. Let's compare using our good old friend Devel::OpProf.

The output:

    *** map ***
    null operation          10005
    constant item           10001
    scalar variable         10000
    map iterator            10000
    multiplication (*)      10000
    block                   10000
    pushmark                    4
    next statement              2
    private array               2
    list assignment             1
    map                         1
    subroutine entry            1
    glob value                  1

    *** foreach ***
    null operation          20005
    pushmark                20002
    next statement          20002
    glob value              10002
    logical and (&&)        10001
    private array           10001
    constant item           10001
    foreach loop iterator   10001
    iteration finalizer     10000
    multiplication (*)      10000
    scalar dereference      10000
    list assignment         10000
    foreach loop entry          1
    subroutine entry            1
    loop exit                   1

    *** for ***
    next statement          10003
    glob value              10002
    pushmark                10002
    logical and (&&)        10001
    private array           10001
    constant item           10001
    foreach loop iterator   10001
    multiplication (*)      10000
    push                    10000
    iteration finalizer     10000
    scalar dereference      10000
    null operation              5
    foreach loop entry          1
    subroutine entry            1
    loop exit                   1

So we see that a map has less action than a foreach, and stuffing the for in the push is almost as good as a map, with many of the same operations going on.
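The profiled code itself is not preserved in the thread. As a rough stand-in, the sketch below assumes the three variants doubled a list of 10,000 numbers (suggested by the 10,000 multiplications in the counts) and times them with the core Benchmark module rather than Devel::OpProf. The sub names and the input are hypothetical, and timings will vary by perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @input = 1 .. 10_000;

# map: build the result list in a single expression.
sub with_map {
    my @out = map { $_ * 2 } @input;
    return \@out;
}

# foreach block with an explicit push.
sub with_foreach {
    my @out;
    foreach my $n (@input) {
        push @out, $n * 2;
    }
    return \@out;
}

# push with a trailing "for" statement modifier.
sub with_for {
    my @out;
    push @out, $_ * 2 for @input;
    return \@out;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    'map'     => \&with_map,
    'foreach' => \&with_foreach,
    'for'     => \&with_for,
} );
```

Wall-clock timing and op counts measure different things, so this is a complement to the op profile above, not a reproduction of it.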
Re: Hash/Array of Regular Expressions?
by tadman (Prior) on Jun 23, 2001 at 21:50 UTC
Here's something that demonstrates my idea:

If you have many, many different patterns to define, you might want to eval them into the hash, like so:

Of course, you would take special care to ensure that $regex is a self-contained regex (i.e. /x/ or !/!) and does not contain anything that will be invalid when eval'd, though you can always check $@ to see what went wrong.

The advantage to using a full sub over just a regex is that you can validate in a context outside of a regex just to be sure. For example, you could check that the day of the month was actually 31 or less, instead of possibly 39.
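Since tadman's code did not survive, here is a minimal sketch of the approach as described: a hash mapping field names to validation subs, with some subs generated by eval from regex source text. The field names, patterns, and the %validator and %source hashes are all hypothetical.

```perl
use strict;
use warnings;

# Hand-written validators: a full sub can check more than a regex can.
my %validator = (
    zip => sub { $_[0] =~ /^\d{5}$/ },
    day => sub {
        my ($value) = @_;
        return 0 unless $value =~ /^\d{1,2}$/;    # shape check by regex
        return $value >= 1 && $value <= 31;       # range check outside the regex
    },
);

# Validators built from regex source text. Each string must be a
# self-contained regex, delimiters included, because it is eval'd as code.
my %source = (
    hex  => '/^[0-9a-f]+$/i',
    word => '/^\w+$/',
);
while ( my ( $name, $regex ) = each %source ) {
    my $sub = eval "sub { \$_[0] =~ $regex }";
    die "Bad pattern for $name: $@" if $@;        # $@ holds the compile error
    $validator{$name} = $sub;
}

for my $case ( [ day => 31 ], [ day => 39 ], [ hex => 'BEEF' ], [ zip => '1234' ] ) {
    my ( $field, $value ) = @$case;
    printf "%-4s %-5s %s\n", $field, $value,
        $validator{$field}->($value) ? 'ok' : 'invalid';
}
```

String eval runs whatever it is given, so this only makes sense when the pattern text comes from a trusted source.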
Re: Hash/Array of Regular Expressions?
by Aighearach (Initiate) on Jun 23, 2001 at 23:55 UTC
I use lists of hashes as fast "objects" in parsers a lot. I like compiled regexes in this case, because the way I use them I am putting changed versions in a new slot anyway. To do this I use the quote regex operator, qr//.

It gets ugly really fast, though... real OO is "better," except when it's too slow, like if you have hundreds of parsers. The whole thing could be hidden behind an object, but that's a religious war I'm not qualified to fight; I like it both ways.

-- Snazzy tagline here
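The code for this reply is missing, so the following is only a guess at the general shape: a list of plain hashes, each carrying a name and a qr// compiled pattern, used as lightweight rule "objects" in a tiny tokenizer. The rule names, patterns, and the tokenize sub are assumptions made for illustration.

```perl
use strict;
use warnings;

# Each "object" is a plain hash holding a name and a compiled pattern.
# \G anchors every pattern at the current match position.
my @rules = (
    { name => 'number', re => qr/\G(\d+)/ },
    { name => 'word',   re => qr/\G([A-Za-z]+)/ },
    { name => 'space',  re => qr/\G(\s+)/ },
);

sub tokenize {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;
    TOKEN: while ( pos($text) < length $text ) {
        for my $rule (@rules) {
            # /g advances pos on success; /c keeps pos unchanged on failure.
            if ( $text =~ /$rule->{re}/gc ) {
                push @tokens, [ $rule->{name}, $1 ]
                    unless $rule->{name} eq 'space';
                next TOKEN;
            }
        }
        die "Unexpected input at offset " . pos($text);
    }
    return @tokens;
}

print join( ' ', map { "$_->[0]:$_->[1]" } tokenize('abc 42 de') ), "\n";
```

Adding a rule is just another hash in the list, which is the appeal over a class hierarchy when there are many small parsers.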
by bikeNomad (Priest) on Jun 24, 2001 at 23:14 UTC
    $ perl multire.pl
    Benchmark: running using alternates, using foreach, each for at least 30 CPU seconds...
    using alternates: 30 wallclock secs (30.07 usr + 0.09 sys = 30.16 CPU) @ 2.69/s (n=81)
    using foreach: 30 wallclock secs (30.09 usr + 0.02 sys = 30.11 CPU) @ 0.20/s (n=6)
                     s/iter    using foreach using alternates
    using foreach      5.02               --             -93%
    using alternates  0.372            1248%               --

Here's the code:

update: clarified callbacks: they don't have to do the same thing.
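multire.pl itself is not preserved. The sketch below shows one way the two strategies named in the benchmark could look: a single combined alternation with one capture group per pattern, versus a foreach over the individual regexes, each dispatching to its own callback. The patterns, callbacks, and sub names are hypothetical, and the sketch assumes the individual patterns contain no capture groups of their own.

```perl
use strict;
use warnings;

# Patterns paired with callbacks; each callback may do something different.
my @table = (
    [ qr/^abcd/,   sub { "starts with abcd" } ],
    [ qr/cd[ef]g/, sub { "contains cd[ef]g" } ],
    [ qr/cat$/,    sub { "ends with cat" } ],
);

# One capture group per pattern: whichever group is defined after a
# successful match identifies the alternative that matched.
my $combined = join '|', map { "($_->[0])" } @table;
$combined = qr/$combined/;

sub dispatch_alternates {
    my ($text) = @_;
    my @groups = $text =~ $combined or return;
    for my $i ( 0 .. $#groups ) {
        return $table[$i][1]->($text) if defined $groups[$i];
    }
    return;
}

sub dispatch_foreach {
    my ($text) = @_;
    for my $entry (@table) {
        return $entry->[1]->($text) if $text =~ $entry->[0];
    }
    return;
}

for my $text (qw{ abcdef xcdeg tomcat dog }) {
    my $result = dispatch_alternates($text);
    print "$text: ", defined $result ? $result : 'no match', "\n";
}
```

The two are not always interchangeable: the alternation reports whichever pattern matches at the leftmost position in the string, while the loop reports the first pattern in list order that matches anywhere, so inputs matching several patterns can dispatch differently.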
by Aighearach (Initiate) on Jun 25, 2001 at 01:01 UTC
OTOH, a 6-mile regex could be tough to debug... what would be nice would be a module that would let you put in regexes and tell it to do the matching one way or the other. That way, if you've got a lot of them, you can switch to the looping mode when you need to find the broken bits.
by bikeNomad (Priest) on Jun 25, 2001 at 01:50 UTC
by Aighearach (Initiate) on Jun 25, 2001 at 02:05 UTC
Re: Hash/Array of Regular Expressions?
by holygrail (Scribe) on Jun 23, 2001 at 20:52 UTC
--HolyGrail
Re: Hash/Array of Regular Expressions?
by JohnAndy (Initiate) on Jul 05, 2007 at 16:44 UTC