Too much SQL not enough perl

jcpunk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Too much SQL not enough perl by rnahi (Curate) on Oct 10, 2005 at 05:08 UTC
A realistic way of doing this efficiently in Perl is with List::Util. `use strict; use warnings; use List::Util qw(first); my @questions = ( 'a' .. 'z' ); my $column = 'q'; if ( first { $column eq $_ } @questions ) { print "'$column' is in <@questions>\n"; }`
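One caveat with `first`: it returns the matching element itself, so a match on a false value such as `0` or the empty string makes the `if` test fail. A minimal sketch of the same membership test using `any` from List::Util (shipped with List::Util 1.33 and later, so bundled with Perl 5.20 and up), which returns a plain boolean. The `is_in` helper name is hypothetical, not part of any module:

```perl
use strict;
use warnings;
use List::Util qw(first any);

# Hypothetical helper: true if $needle is string-equal to any element of @haystack.
sub is_in {
    my ($needle, @haystack) = @_;
    return any { $needle eq $_ } @haystack;
}

my @questions = ( '0', 'a' .. 'z' );

# first() returns the matching element; '0' is found, but it is false in boolean context.
my $found = first { $_ eq '0' } @questions;
print "first() found '0', but the returned value is false\n"
    if defined $found && !$found;

# any() reports only whether a match exists, so false-looking values are safe.
print "'0' is in the list\n" if is_in( '0', @questions );
print "'q' is in the list\n" if is_in( 'q', @questions );
```

If the list can never contain a false value, as with `'a' .. 'z'`, the `first` version behaves the same.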
Re: Too much SQL not enough perl by Aristotle (Chancellor) on Oct 10, 2005 at 02:17 UTC
`if( grep $question eq $_, 'a', 'b', 'c' ) { # ... }` Or, depending on what you're doing, `my %questions; undef @questions{ 'a', 'b', 'c' };` and then `if( exists $questions{ $question } ) { # ... }` Update: `s/==/eq/` per thor’s notice. Makeshifts last the longest.
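The hash variant can be wrapped up so the set is built once and then queried many times. This is only a sketch: the `%allowed` hash and the `is_allowed` helper are made-up names for illustration.

```perl
use strict;
use warnings;

# Build the lookup set once. Only the keys matter; the values stay undef.
my %allowed;
@allowed{ qw(a b c) } = ();

# Hypothetical helper wrapping the exists test; returns 1 or 0.
sub is_allowed {
    my ($question) = @_;
    return exists $allowed{$question} ? 1 : 0;
}

for my $question ( qw(a d) ) {
    printf "%s: %s\n", $question, is_allowed($question) ? 'in set' : 'not in set';
}
```

Because the values are undef, the test has to be `exists`; a plain truth test on `$allowed{$question}` would never succeed here.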
Re: Too much SQL not enough perl by saintmike (Vicar) on Oct 10, 2005 at 02:26 UTC
... somewhat futuristic with Quantum::Superpositions: `use Quantum::Superpositions; my $cool = any("Chuck", "Mick", "Joe"); for my $name (qw(Schmoe Joe Chuck)) { if ($name eq $cool) { print "$name is cool\n"; } }` Since `$cool` is a (disjunctive) superposition of three values, the expression `$name eq $cool` is true for three different values of `$name`: `"Chuck"`, `"Mick"`, and `"Joe"`.
Re: Too much SQL not enough perl by EvanCarroll (Chaplain) on Oct 10, 2005 at 02:17 UTC
A) `if ( $question eq 'a' || $question eq 'b' || $question eq 'c' )` B) `if ( grep m/\A\Q$question\E\z/, qw/a b c/ )` Update: I figured I would add that this is also solved in Perl 6. Evan Carroll www.EvanCarroll.com
Re: Too much SQL not enough perl by Skeeve (Parson) on Oct 10, 2005 at 05:57 UTC
Why don't you simply create your own "in" subroutine? `sub in { my $inQuestion = shift; foreach (@_) { return 1 if $_ eq $inQuestion; } return 0; }` Okay... the definition looks somewhat different, but the call is shorter: `in($question, "a", "b", "c");` OTOH: if it's really a fixed set of allowed values, I'd go for a regex or a hash too, depending on the nature of the problem. If the only question is whether or not the value is in the set, I'd take the regex approach, like `if ($day =~ /^(?:Mo|Tu|We|Th|Fr|Sa|Su)$/) ...` If the options are also used to switch what's done later, I'd prefer the hash way: `if (defined($mnth = $monthNum{$month})) ...` `$\=~s;s.;q^\|D9JYJ^^qq^\//\\\///^;ex;print`
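When the allowed values live in a list rather than in a hand-written pattern, the regex approach can be generated. A minimal sketch, assuming the values are literal strings; `make_matcher` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: compile an anchored alternation from literal strings.
# quotemeta() escapes regex metacharacters so each value matches only itself.
sub make_matcher {
    my @values = @_;
    my $alternation = join '|', map { quotemeta($_) } @values;
    return qr/\A(?:$alternation)\z/;
}

my $is_day = make_matcher( qw(Mo Tu We Th Fr Sa Su) );

for my $day ( qw(Mo Xx Mon) ) {
    print "$day: ", ( $day =~ $is_day ? 'valid' : 'invalid' ), "\n";
}
```

The `\A` and `\z` anchors keep `Mon` from matching on its `Mo` prefix. Note that an empty value list produces a pattern that matches only the empty string.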
Re^2: Too much SQL not enough perl by nothingmuch (Priest) on Oct 10, 2005 at 08:12 UTC
You can try to be really sick and create an in method using autobox. -nuffin zz zZ Z Z #!perl
Re: Too much SQL not enough perl by InfiniteSilence (Curate) on Oct 10, 2005 at 03:44 UTC
You could answer the problem by using grep or the way you specified in your question, but if you are considering using it in the same manner that it works in SQL, you need a hash for performance reasons:

    #!/usr/bin/perl -w
    use strict;
    use Benchmark qw(:all);

    my @choices = qw|a b c|;
    my %choices = map{$_=>1}@choices;
    my @cData = <DATA>;

    timethese(100000,{'Damn Slow'=>\&parseData1, 'Much Better'=>\&parseData2});

    sub parseData1 {
        foreach (@cData){
            chomp;
            my @items = split/,/,$_; #slow way,
            foreach my $item (@items){
                if(grep {$item eq $_} @choices){
                    #print qq|FOUND $item|;
                }
            }
        }
    }
    sub parseData2 {
        foreach(@cData){
            chomp;
            foreach my $item (split/,/,$_){
                if ($choices{$item}){
                    #print qq|FOUND $item\n|;
                }
            }
        }
    }
    __DATA__
    z,t,m,u,a,b,c
    s,t,l,m,z,a,s
    c,b,a,m,u,t,n
    k,l,t,s,z,r,t

Produces:

    Benchmark: timing 100000 iterations of Damn Slow, Much Better...
      Damn Slow:  9 wallclock secs ( 8.81 usr +  0.00 sys =  8.81 CPU) @ 11348.16/s (n=100000)
    Much Better:  4 wallclock secs ( 3.88 usr +  0.00 sys =  3.88 CPU) @ 25806.45/s (n=100000)

Celebrate Intellectual Diversity
Re^2: Too much SQL not enough perl by EvanCarroll (Chaplain) on Oct 10, 2005 at 04:29 UTC
I do not think it is fair that you load up your hash prior to the benchmarking; that skews the results. Nor is it fair that the 'slow' one uses a temporary array, `my @items = split/,/,$_; foreach my $item (@items){` vs `foreach my $item (split/,/,$_){` Also, `if ($choices{$item})` should probably be `if (exists $choices{$item})`, or you will choke on values that are 0, empty strings, or undef. UPDATE: Nor is grep a good idea, quoting a passage I remember reading in perldoc perlfaq (perldoc -q unique): "These are slow (grep) (checks every element even if the first matches), inefficient (same reason), and potentially buggy". But, granted, it still says: "Hearing the word "in" is an indication that you probably should have used a hash, not a list or array, to store your data. Hashes are designed to answer this question quickly and efficiently. Arrays aren’t." Evan Carroll www.EvanCarroll.com
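The difference between the truth test and `exists` only shows up when a stored value is false. A minimal sketch with a made-up hash (the `%choices` contents and the `lookup_modes` helper are invented for illustration, not taken from the benchmark):

```perl
use strict;
use warnings;

# Made-up data: 'b' and 'c' are present as keys but hold false values.
my %choices = ( a => 1, b => 0, c => undef );

# Hypothetical helper: returns (truth-test result, exists result) as 1/0 flags.
sub lookup_modes {
    my ( $hash, $key ) = @_;
    return ( ( $hash->{$key} ? 1 : 0 ), ( exists $hash->{$key} ? 1 : 0 ) );
}

for my $item ( qw(a b c d) ) {
    my ( $truthy, $exists ) = lookup_modes( \%choices, $item );
    print "$item: truthy=$truthy exists=$exists\n";
}
```

With a hash built as `map { $_ => 1 } @choices`, every stored value is 1, so both tests agree; `exists` matters once the values carry data of their own.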
Re^3: Too much SQL not enough perl by Roy Johnson (Monsignor) on Oct 10, 2005 at 13:33 UTC
Note that with a little skullduggery, you can make `grep` short-circuit. Granted, it's still much more clear to use `List::Util 'first'`, and for searching the same candidate list many times, it's more efficient to use a hash. `my @candidates = qw(z y a b c a d a e a f); foreach my $question ('a', 'm') { if (do{{;grep {$_ eq $question ? do {print "Match\n"; last} : 0 } @candidates}}) { print "Found $question\n"; } }` Caution: Contents may have been coded under pressure.
Re^4: Too much SQL not enough perl by Aristotle (Chancellor) on Oct 10, 2005 at 14:48 UTC
Re^5: Too much SQL not enough perl by Roy Johnson (Monsignor) on Oct 10, 2005 at 14:58 UTC
Re^3: Too much SQL not enough perl by InfiniteSilence (Curate) on Oct 10, 2005 at 15:10 UTC
"Nor, is it fair that in the 'slow' one, you use a temporary array..."

EvanCarroll is right about the slight differences in the routines, so I commented out some things in the code and made them both use the same array (note, I increased the number of elements to look for as well as the number of iterations for more meaningful results):

    #!/usr/bin/perl -w
    use strict;
    use Benchmark qw(:all);

    my @choices = qw|a b c z m p t l c f g|;
    my %choices = map{$_=>1}@choices;
    my @cData = <DATA>;

    timethese(50_000,{'Damn Slow'=>\&parseData1, 'Much Better'=>\&parseData2});

    sub parseData1 {
        foreach (@cData){
            chomp;
            my @items = split/,/,$_; #slow way,
            foreach my $item (@items){
                if(grep {$item eq $_} @choices){
                    #print qq|FOUND $item|;
                }
            }
        }
    }
    sub parseData2 {
        foreach(@cData){
            chomp;
            # foreach my $item (split/,/,$_){
            my @items = split/,/,$_; #slow way,
            foreach my $item (@items){
                if ($choices{$item}){
                    #print qq|FOUND $item\n|;
                }
            }
        }
    }
    __DATA__
    z,t,m,u,a,b,c
    s,t,l,m,z,a,s
    c,b,a,m,u,t,n
    k,l,t,s,z,r,t

Which still produces:

    perl seediff.pl
    Benchmark: timing 50000 iterations of Damn Slow, Much Better...
      Damn Slow:  6 wallclock secs ( 6.12 usr +  0.00 sys =  6.12 CPU) @ 8172.61/s (n=50000)
    Much Better:  3 wallclock secs ( 2.96 usr +  0.00 sys =  2.96 CPU) @ 16874.79/s (n=50000)

I think the above changes were 'fair' since the performance of a function like SQL in() in an actual database should not decrease noticeably with the number of elements.

Celebrate Intellectual Diversity
Re: Too much SQL not enough perl by tirwhan (Abbot) on Oct 10, 2005 at 08:09 UTC
Lots of valid solutions proposed above, here's how it looks with Perl6::Junction: `use Perl6::Junction qw(any); if (any(qw(a b c)) eq $question) { #... }`
Re: Too much SQL not enough perl by pg (Canon) on Oct 10, 2005 at 02:24 UTC
In a DB application, the solution with the SQL `IN` operator is indeed a decent one. If you can avoid writing code, avoid it, whether that code is SQL or Perl. If you happen to have a hash that logically fits your purpose, the exists() function might help, but the `IN` operator is much handier. Think of it this way: what if the selection is itself the result of a subquery? Do you want to write the whole logic yourself, or just use an SQL query?
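If the test does stay in SQL, the `IN` list is usually built with one placeholder per value. A minimal sketch of that idea in plain Perl; the `in_clause` helper and the `answers` table with its `id` and `question` columns are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns "column IN (?, ?, ...)" followed by the bind values.
# The column name is interpolated as-is, so it must come from trusted code, not user input.
sub in_clause {
    my ( $column, @values ) = @_;
    die "in_clause needs at least one value\n" unless @values;
    my $placeholders = join ', ', ('?') x @values;
    return ( "$column IN ($placeholders)", @values );
}

my ( $clause, @bind ) = in_clause( 'question', qw(a b c) );
my $sql = "SELECT id FROM answers WHERE $clause";    # made-up table and columns
print "$sql\n";

# With a connected DBI handle, the statement could then be run as:
#   my $rows = $dbh->selectall_arrayref( $sql, undef, @bind );
```

The database then does the membership test, and the values never get spliced into the SQL string.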
Re: Too much SQL not enough perl by osunderdog (Deacon) on Oct 10, 2005 at 12:13 UTC
I prefer to use Set::Scalar for this type of condition. `use strict; use Set::Scalar; my @testItems = qw|a b c d e f g|; my $set = Set::Scalar->new('a', 'b', 'c'); foreach my $item (@testItems) { if($set->has($item)) { print "$item is in set\n"; } }` Results: `a is in set`, `b is in set`, `c is in set`. I haven't looked at the performance of this package compared to the methods suggested in this thread, though. Hazah! I'm Employed!
Re: Too much SQL not enough perl by ambrus (Abbot) on Oct 10, 2005 at 19:39 UTC
If you think a function would be useful, why don't you just write it? (This code is UNTESTED.) `sub in { my $val = shift; $val eq $_ and return 1 for @_; return; } in $question, ($a, $b, $c);`
Re: Too much SQL not enough perl by Anonymous Monk on Oct 10, 2005 at 15:46 UTC
You are right, Perl should incorporate that, even more so because it already exists in the 'dreaded snake' ;-) language (straight from the doc):

    >>> # Measure some strings:
    ... a = ['cat', 'window', 'defenestrate']
    >>> for x in a:
    ...     print x, len(x)
    ...
    cat 3
    window 6
    defenestrate 12
Re: Too much SQL not enough perl by gam3 (Curate) on Oct 15, 2005 at 03:58 UTC
Some benchmarks:

                    Rate osunderdog rnahi gam3 ambrus Aristotle EvanCarroll
    osunderdog    3412/s         --  -89% -98%   -99%      -99%        -99%
    rnahi        30117/s       783%    -- -79%   -87%      -94%        -95%
    gam3        143359/s      4101%  376%   --   -40%      -73%        -75%
    ambrus      238312/s      6884%  691%  66%     --      -54%        -58%
    Aristotle   521308/s     15177% 1631% 264%   119%        --         -8%
    EvanCarroll 566538/s     16502% 1781% 295%   138%        9%          --

-- gam3 A picture is worth a thousand words, but takes 200K.