No such thing as a small change PerlMonks

### Re: Create union from ranges, but compare respectively

by kcott (Archbishop)
 on Jun 10, 2022 at 09:07 UTC Need Help??

Out of curiosity, and for a bit of fun, I wondered how a pure Perl solution would stack up against the two solutions using CPAN modules.

```#!/usr/bin/env perl

use strict;
use warnings;

use Benchmark 'cmpthese';
use Test::More tests => 3;

my \$TM_part1 = "25-40,74-93,95-120,130-149";
my \$TM_part2 = "31-47,84-99,107-123,137-151";
my @split_TM1 = split ',', \$TM_part1;
my @split_TM2 = split ',', \$TM_part2;
my \$union = '31-40,84-93,107-120,137-149';

is _span(), \$union;
is _pair(), \$union;
is _perl(), \$union;

cmpthese 0 => {
Span => \&_span,
Pair => \&_pair,
Perl => \&_perl,
};

sub _span {
use Set::IntSpan;

my @u_set;

for my \$i (0 .. \$#split_TM1) {
my \$set1 = Set::IntSpan->new(\$split_TM1[\$i]);
my \$set2 = Set::IntSpan->new(\$split_TM2[\$i]);
my \$u_set = intersect \$set1 \$set2;

push @u_set, \$u_set;
}

return join ',', @u_set;
}

sub _pair {
use List::AllUtils qw( max min pairwise );

return join ',', pairwise {
my \$low = max map /^(\d+)/, \$a, \$b;
my \$high = min map /-(\d+)/, \$a, \$b;
\$low <= \$high ? "\$low-\$high" : ();
} @split_TM1, @split_TM2;
}

sub _perl {
my @u_set;

for my \$i (0 .. \$#split_TM1) {
my %seen;

for ([split /-/, \$split_TM1[\$i]], [split /-/, \$split_TM2[\$i]])
+ {
++\$seen{\$_} for \$_->[0] .. \$_->[1];
}

push @u_set, join '-', (sort grep \$seen{\$_} == 2, keys %seen)[
+0, -1];
}

return join ',', @u_set;
}

I ran that five times; here's a median result:

```1..3
ok 1
ok 2
ok 3
Rate Span Perl Pair
Span  16937/s   -- -20% -84%
Perl  21145/s  25%   -- -80%
Pair 106033/s 526% 401%   --

If you're interested, feel free to tweak the code for better (different) results.

— Ken

Replies are listed 'Best First'.
Re^2: Create union from ranges, but compare respectively (postgres: int4range-intersect)
by erix (Prior) on Jun 12, 2022 at 07:12 UTC

I improved the SQL, this one here is a bit faster. I also include below changes to your test program, as lines to be added. The DBI connect will inevitably need some tweaking, depending on host machine.

```use DBI;

#my \$dbh = DBI->connect or die "hm - \$@\n"; # easier when ENV vars are
+ set
my \$dbh  = DBI->connect('dbi:Pg:', "app_user1", "P4ss_W0rd1") or die "
+meh - no db connection - \$@\n" ;

# two test lines:

is _sql()     , \$union;

"SQL/Pg"   => \&_sql,

# and the statement itself
sub _sql  {

return \$dbh->selectrow_arrayref("

select array_to_string( array(
select lower(isect) || '-' || upper(isect)-1 from(
select
('['||replace(unnest(string_to_array('\$TM_part1', ',')), '-', ',
+')||']')::int4range
* ('['||replace(unnest(string_to_array('\$TM_part2', ',')), '-', ',
+')||']')::int4range
) as h(isect)
), ',')

")->[0] ;

}

and this is the new result:

```              Rate     Span     Perl   SQL/Pg     Pair oneliner
Span        8373/s       --     -57%     -68%     -86%    -100%
Perl       19261/s     130%       --     -27%     -69%    -100%
SQL/Pg     26397/s     215%      37%       --     -57%    -100%
Pair       61443/s     634%     219%     133%       --     -99%
oneliner 5644800/s   67313%   29207%   21284%    9087%       --
Re^2: Create union from ranges, but compare respectively (SQL Pg)
by erix (Prior) on Jun 10, 2022 at 11:13 UTC

I had to add SQL (postgres) :P

```            Rate     Span SQL (Pg)     Perl     Pair
Span      8489/s       --     -44%     -56%     -86%
SQL (Pg) 15132/s      78%       --     -22%     -75%
Perl     19388/s     128%      28%       --     -68%
Pair     60727/s     615%     301%     213%       --

```sub _sql {

return \$dbh->selectrow_arrayref("
select array_to_string( array(
select lower(r) || '-' || upper(r)-1 from (
select
(select unnest(r1) limit 1 offset n -1) *
(select unnest(r2) limit 1 offset n -1)   as r
from (values
(('{[' || replace(replace('\$TM_part1', ',', '],['), '-', ','
+) || ']}')::int4multirange,
('{[' || replace(replace('\$TM_part2', ',', '],['), '-', ','
+) || ']}')::int4multirange
)) as f(r1,r2)
, lateral generate_series(1, 4) as g(n)
)       as g
), ',')   as h
;
")->[0] ;

}

Update: see my next message; it has improved/faster SQL and different Benchmark

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11144645]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (3)
As of 2022-08-10 16:52 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?