monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,
Forgive me for reposting. I pressed 'create' mistakenly. Also there is some segment missed out in the code before. This is the first time I post here.
I have a following code that find a "k-clique" within a graph.
k-Clique is a complete subgraph of size 'k'.
Currently running this code gives:
$ perl graph.pl 3 5 6 9 $ perl graph.pl 4 1 2 3 4
As you can see the code below only return the "last" k-clique found. My question is how can I modify my code below so it can store and then returns all the "k-clique" found? That is:
$ perl graph.pl 3 would give: 1 2 3 1 2 4 1 3 4 2 3 4 5 6 9
Charles suggested that I use loop to step through the vertices or regex for the second part of the string.
But I still can't figure it out in concrete way.
Seems that my regex knowledge is too limited.
Dear monks, to you now I turn too...
Thanks beforehand!

Regards,
Edward WIJAYA
_BEGIN__ #Credit Greg Bacon #and suggestion by C.Clarkson my @vertices = 1 .. 9; my @edges = ( [1,2], [1,3], [1,4], [1,5], [2,3], [2,4], [3,4], [5,6], [5,7], [5,9], [6,9], [7,8], [8,9], ); $k = shift || 3; #Construct a string as follows: $string = (join ',' => @V) . ';' . (join ',' => map "$_->[0]-$_->[1]", @E); #Then construct a regular expression as follows: $regex = '^ .*\b ' . join(' , .*\b ' => ('(\d+)') x $k) . '\b .* ;' . "\n"; for ($i = 1; $i < $k; $i++) { for ($j = $i+1; $j <= $k; $j++) { $regex .= '(?= .* \b ' . "\\$i-\\$j" . ' \b)' . "\n"; } } #Our graph contains a k-clique if and only if if ($string =~ /$regex/x) { print join(" ", map $$_, 1..$k), "\n"; } __END__

Replies are listed 'Best First'.
Re: Finding number of 'k'-clique in a graph
by blokhead (Monsignor) on Sep 27, 2004 at 16:26 UTC
    First of all, are you doing this for fun (showing that regex matching is NP-hard) or for real (your boss really wants you to find all k-cliques)? Since you've found Greg Bacon's code, you probably realize that the k-clique problem is NP-complete.

    Solving hard problems with silly regexes is a lot of fun (I've done it myself a few times) but inflexible and hard to maintain. Sure, it might be a little faster since it's in C, but either way it'll be exponential (in this case O(n^k)). As for building a regex to return all the k-cliques, I don't think it's likely. You'd probably have to start from scratch with a new regex and idea. This regex stops as soon as it finds something that matches all the constraints.

    If you are really serious about solving k-clique queries, I'd recommend a more "traditional" algorithm (no regexes). Just iterate through all of the n-choose-k subsets of points (see Iterating over combinations for the combinations sub). Some sort of hash of edges (instead of a list) would be useful so you can do quick lookups. Here's the basic pseudocode:

    my $iter = combinations( $k => \@vertices ); SUBSET: while ( my @subset = $iter->() ) { my $pairs = combinations( 2 => \@subset ); while ( my @pair = $pairs->() ) { next SUBSET unless @pair is an edge in the graph; } print "@subset is a $k-clique\n"; }
    Update: Actually, as Roy Johnson suggested above, augmenting Greg Bacon's original regex to print all k-cliques wasn't too hard. Just change this:
    if ($string =~ /$regex/x) { print join(" ", map $$_, 1..$k), "\n"; }
    To this:
    $regex .= '(?{ print join(" ", map $$_, 1..$k), "\\n" })(?!)'; $string =~ /$regex/x;
    You may have to add use re 'eval' for this to work.. Now, when the regex engine has found k vertices which satisfy the expression so far, the (?{code}) section prints them out, but the (?!) causes the overall expression to fail. So the engine backtracks and keeps trying to select more vertices that match. With your example, I got the output:
    5 6 9 2 3 4 1 3 4 1 2 4 1 2 3
    OK, this is still fun, provided it stays out of production code ;)

    blokhead

      Thanks so much for your reply, Monks.

      To Ray Johnson, my code did'nt work because, I
      forgot to change @V to @vertices and @E to @edges.

      My further question to 'blokhead' is:
      how can I store the output e.g:
      5 6 9 2 3 4 1 3 4 1 2 4 1 2 3
      into an array, with this snippet you've given
      $regex .= '(?{ print join(" ", map $$_, 1..$k), "\\n" })(?!)'; $string =~ /$regex/x;
      because when I used:
      push @string, $string;
      doesn't seems to work.

      Regards,
      Edward WIJAYA
        Hi blokhead, I've managed, to stored it into an array by doing this:
        my @cliques = (); $regex .= '(?{ push (@cliques, join(" ", map $$_, 1..$k)) })(?!)'; $string =~ /$regex/x; print join("\n", @cliques), "\n";
        I just need to understand a bit on (?{}) part.
        Glad I learnt that.

        Thanks a again. for your kind suggestion. I will play around with your combinatorial subroutine. It seems easier to understand than Regex approach.

        BTW, thanks for the wonderful links you have on various CS items in your "monks home", I found it very useful.

        Gratefuly yours, Edward WIJAYA
Re: Finding number of 'k'-clique in a graph
by Roy Johnson (Monsignor) on Sep 27, 2004 at 16:23 UTC
    When I run your code, I get no output at all. There is a trick that I can suggest, and I'll illustrate it with a simpler example. Say you want to find all the sequences of three digit-strings in a string, so if your string is "11 22 33 44", you'd get "11 22 33" and "22 33 44".

    The trick is to include some code that runs when you've found a match, and then cause the whole expression to fail, so that it backtracks to try again.

    $_ = '11 22 33 44'; /\b\d+ \d+ \d+\b(?{print "$&\n"})^/;
    I've told it to match the beginning of string after it executes my code. That's an impossible match, so the regex goes back and tries again.

    Caution: Contents may have been coded under pressure.