in reply to Sub set where all are connected

Here's the latest, fastest version of my program. Is it fast enough to solve a 100,000 node case? No.

#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11109069 # clique
use warnings;
use List::Util qw( uniq );

my $edges = <<END; # from https://en.wikipedia.org/wiki/Clique_(graph_theory)
[1,2],[1,3],[1,4],
[2,3],[2,4],
[3,4],
[4,5],[23,4],
[5,6],[5,7],[5,8],
[6,7],
[7,8],
[8,9],[10,8],
[10,9],[11,9],[12,9],[13,9],
[10,13],
[11,12],[11,13],
[12,13],
[13,14],
[14,15],[14,21],
[15,16],[15,17],[15,19],
[16,17],
[17,18],[17,19],
[18,19],[18,20],[18,21],
[19,20],[19,21],[19,22],
[20,23],
[21,22],[21,23],
[22,23],
END

$edges =~ s/(?<=\[)[\w,]+(?=\])/ join ',', sort split ',', $& /ge; # fix order
print "$edges\n";
my %edges = map +( $_ => '(*FAIL)' ), $edges =~ /\w+,\w+/g;
my %cliques;
my %seen;

find( uniq sort $edges =~ /\w+/g ); # start with every node

sub find
  {
  $seen{ my $set = "@_" }++ and return;
  if( my @out = $set =~ /\b(\w+)\b.+\b(\w+)\b(??{ $edges{"$1,$2"} || "" })/ )
    {
    for my $node ( @out ) # pair of unconnected nodes, try without each one
      {
      @_ > 3 and find( grep $_ ne $node, @_ );
      }
    }
  else
    {
    $cliques{ $set }++; # it is fully connected
    }
  }

my $uniquecliques = '';
for ( sort { length $b <=> length $a } sort +uniq keys %cliques,
  map tr/,/ /r, keys %edges )
  {
  my $pattern = " $_ " =~ s/\w+/\\b$&\\b/gr =~ s/ /.*?/gr;
  $uniquecliques =~ /^$pattern$/m or $uniquecliques .= "$_\n";
  }
print $uniquecliques;

Outputs:

[1,2],[1,3],[1,4],
[2,3],[2,4],
[3,4],
[4,5],[23,4],
[5,6],[5,7],[5,8],
[6,7],
[7,8],
[8,9],[10,8],
[10,9],[11,9],[12,9],[13,9],
[10,13],
[11,12],[11,13],
[12,13],
[13,14],
[14,15],[14,21],
[15,16],[15,17],[15,19],
[16,17],
[17,18],[17,19],
[18,19],[18,20],[18,21],
[19,20],[19,21],[19,22],
[20,23],
[21,22],[21,23],
[22,23],

11 12 13 9
15 16 17
15 17 19
17 18 19
18 19 20
18 19 21
19 21 22
21 22 23
1 2 3 4
10 13 9
10 8 9
13 14
14 15
14 21
20 23
5 6 7
5 7 8
23 4
4 5

This 23 node case runs in about 0.05 seconds on my machine.
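
For anyone puzzling over the regex in find: as far as I can describe it, the trick is that each existing edge maps to the string '(*FAIL)', and the postponed subexpression (??{ }) splices that string into the pattern, so a connected pair forces a backtrack and only an unconnected pair lets the match succeed. Here is a minimal standalone sketch of just that step. The three-node graph and the names %edge and unconnected_pair are made up for illustration and are not part of the program above.

```perl
use strict;
use warnings;

# Made-up graph: 1-2 and 2-3 are edges, 1-3 is not.
# Every known edge maps to a pattern that always fails.
my %edge = map +( $_ => '(*FAIL)' ), '1,2', '2,3';

# Hypothetical helper: returns the first unconnected pair found in the
# (string-sorted) node list, or an empty list if every pair is an edge.
sub unconnected_pair
  {
  my $set = "@_";
  return $set =~ /\b(\w+)\b.+\b(\w+)\b(??{ $edge{"$1,$2"} || '' })/;
  }

my @pair = unconnected_pair( 1, 2, 3 );
print "unconnected: @pair\n";

my @none = unconnected_pair( 1, 2 );
print "pairs left in (1 2): ", scalar @none, "\n";
```

An empty return means the set is fully connected, which is the branch where find records a clique.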

I am curious about what the real problem is here. Perhaps something in the real problem could lead to partitioning, or to something else that would limit the actual number of nodes per case.

Also, how many edges are there in your 100,000 node case?
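
To illustrate the kind of partitioning I mean: a clique can never span two connected components, so if the real graph falls apart into separate components, each one can be searched on its own. Whether your data actually splits that way is a guess on my part. Below is a minimal sketch using a simple union-find. The helper name components and the sample edges are hypothetical and are not part of the program above.

```perl
use strict;
use warnings;

# Hypothetical helper: group nodes into connected components.
# Takes a list of edges, each an array ref of two node names.
sub components
  {
  my @edges = @_;
  my %parent;
  my $root = sub # follow parent links to the representative node
    {
    my $n = shift;
    $parent{$n} //= $n;
    $n = $parent{$n} while $parent{$n} ne $n;
    return $n;
    };
  for my $e ( @edges )
    {
    my ( $ra, $rb ) = map $root->( $_ ), @$e;
    $parent{$ra} = $rb if $ra ne $rb; # merge the two components
    }
  my %members;
  push @{ $members{ $root->( $_ ) } }, $_ for sort keys %parent;
  return sort { $a->[0] cmp $b->[0] } values %members;
  }

# Made-up data: two separate triangles.
my @parts = components( [1,2], [2,3], [1,3], [7,8], [8,9], [7,9] );
print "@$_\n" for @parts;
```

Each component could then be fed to find separately, which only helps if the components are much smaller than the whole graph.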

Re^2: Sub set where all are connected
by Sanjay (Sexton) on Jan 09, 2020 at 13:41 UTC

    I have not examined the number of edges. As a wild guess, I would say the 100,000 node case may have 200,000 to 400,000 edges. Once I learnt that it is a hard problem, I lost interest. I am using another method (linear programming) to get (sub-)cliques that maximize some objective function. It works for around 98+% of cases. It times out or is infeasible for the rest because of problems with the (free) software or insufficient resources. I was thinking of cliques as a way to offer an alternative solution. Now I will specify that an optimized solution cannot be found. Good luck to my client!