Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Pure regex Hamiltonian Circuit solution

by blokhead (Monsignor)
on Oct 10, 2003 at 17:44 UTC ( [id://298341]=perlmeditation: print w/replies, xml ) Need Help??

I'd seen Abigail do nasty things with regexes before, but never really understood them at all. However, in the recent N-Queens solver, it was a pure regex (backrefs only), there were no (?{}) constructs, and I was finally able to understand how Abigail does it..

Then, on a regex kick, I discovered Abigail's pure regex 3-SAT reduction. If you haven't already seen it, it's the coolest couple of lines on the planet. With all this regex excitement, I decided to try my hand at solving some NP-complete problem(s) with pure regexes.

I tried a handful of different NP-complete problems, but after a few emails with MJD, I understood why these attempts went wrong. I kept getting hung up on the string and regex size being exponential on the size of input (not meaning my solution was incorrect, but invalidating its use as an NP-completeness reduction proof). Finally, I think I may have gotten it right with Hamiltonian Circuits. Abigail has done this before using extended regexes, but claimed to be stuck on a pure regex solution that wasn't exponential.

So without further ado, here's my solution. Given a (simple, undirected) graph with E edges and V vertices, it finds a Hamiltonian Circuit (a cycle that visits every vertex exactly once, starting and ending in the same place). The size of the string it creates is bounded by O(V^4), and the size of the regex it creates is bounded by O(V^2).

my @E = ([1,3],[1,5],[2,3],[2,4],[2,5],[4,5]); my $V = 5; my $verbose = 1; my @all_edges = map { my $x = $_; map { [$x, $_] } $x+1 .. $V } 1 .. $ +V-1; my $string = (join(' ', 1 .. $V) . "\n") x $V . "\n" . (join(' ', map { join "-", @$_ } @all_edges ) . "\n") x @all_edges . "\n" . (join(' ', map { join "-", @$_ } @E ) . "\n") x $V; my $regex = "^ " . ".* \\b (\\d+) \\b .* \\n\n" x $V . "\\n\n" . join("", map { my ($x, $y) = @$_; ".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* +\\n\n" } @all_edges) . "\\n\n" . join("", map { my ($x, $y) = ($_, $_+1); ".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* +\\n\n" } 1 .. ($V-1)) . ".* \\b (?: \\$V-\\1 | \\1-\\$V ) \\b .* \\n \$\n"; print "'$string' =~ /\n$regex\n/x\n" if $verbose; if (my @c = $string =~ /$regex/x) { local $" = " -> "; print "Hamiltonian circuit: [ @c -> $1 ]\n"; } else { print "No Hamiltonian circuit\n"; }
Now the dirty details.. Here's the value of $string and $regex for the example graph included:
$string = q[ 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 ]; $regex = q[^ .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n \n .* \b (?: \1-\2 | \2-\1 ) \b .* \n .* \b (?: \1-\3 | \3-\1 ) \b .* \n .* \b (?: \1-\4 | \4-\1 ) \b .* \n .* \b (?: \1-\5 | \5-\1 ) \b .* \n .* \b (?: \2-\3 | \3-\2 ) \b .* \n .* \b (?: \2-\4 | \4-\2 ) \b .* \n .* \b (?: \2-\5 | \5-\2 ) \b .* \n .* \b (?: \3-\4 | \4-\3 ) \b .* \n .* \b (?: \3-\5 | \5-\3 ) \b .* \n .* \b (?: \4-\5 | \5-\4 ) \b .* \n \n .* \b (?: \1-\2 | \2-\1 ) \b .* \n .* \b (?: \2-\3 | \3-\2 ) \b .* \n .* \b (?: \3-\4 | \4-\3 ) \b .* \n .* \b (?: \4-\5 | \5-\4 ) \b .* \n .* \b (?: \5-\1 | \1-\5 ) \b .* \n $ ]; __OUTPUT__ Hamiltonian circuit: [ 5 -> 4 -> 2 -> 3 -> 1 -> 5 ]
Here's how it works:
  • In the first "paragraph", $1 through $5 each get a vertex assigned to them. This paragraph of $string is bounded by O(V^2), and $regex by O(V).

  • The second "paragraph" is the ugly bit. We have to make sure all the vertices chosen are different. This could have been done by originally choosing $1 through $5 to be any permutation from an exhaustive list of permutations, but such a list would have been exponential in size.

    The way I checked is by picking any 5 vertices, then ensuring that they are all pairwise distinct. In $string, we have a repeated list of all "valid" (distinct) vertex pairs. In $regex, we make sure that every pair of two vertices in {$1,..,$5} shows up in that list.

    This isn't exponential, because there are V(V-1)/2 "valid" (distinct) pairs, and V(V-1)/2 pairs to verify. So this paragraph of $string is bounded by O(V^4), and $regex by O(V^2).

  • The third "paragraph" is where we verify that our distinct vertex set has edges from $1 to $2 ... to $5 and then back to $1. This paragraph of $string is bounded by O(E*V) = O(V^3), and $regex by O(V).

    If all these conditions matched, then the regex matches, and $1 through $5 must be the vertices of a Hamiltonian Circuit.

There you have it. Your feedback is welcome. I really hope I haven't made any mistakes in my analysis. This code isn't one-tenth as beautiful and elegant as Abigail's 3-SAT reduction, and perhaps it could be improved upon. But hey, we all have to start somewhere ;)

Update: modified code so that $string is a little clearer to read. Instead of comma-separating the edge and vertex lists, they are separated by spaces. $regex is modified accordingly, matching on \b where appropriate, instead of a comma.

blokhead

Replies are listed 'Best First'.
Re: Pure regex Hamiltonian Circuit solution
by Abigail-II (Bishop) on Oct 11, 2003 at 02:05 UTC
    Nice. You beat me to it. I was already pretty sure earlier this week that Hamiltonian Circuits/Paths could be done with pure regexes as well, and just tonight I was making some notes on how to do it. It basically came down to the same principles you are using - although I was thinking of mixing the picking of the vertices with the testing for unique picks and valid paths (just to gain speed by rejecting earlier).

    Abigail

Re: Pure regex Hamiltonian Circuit solution
by BrowserUk (Patriarch) on Oct 11, 2003 at 02:26 UTC

    ++blokhead. My hat is off to you.

    I'd really like to see two things.

    1. The same algorithm implemented using normal perl. Iterative, recursive, a mixture of the two. Whatever, so long as it solved the same problem (and arrived at the same answer:).
    2. A benchmark of a few, well chosen, 'typical cases' using your pure-regex, Abigail's extended regex, and the pure perl solutions.

    Throw in a CPAN Perl-over-'generic paths'-C-library, solution and we'd have a very interesting shoot-out:)


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

Re: Pure regex Hamiltonian Circuit solution
by blokhead (Monsignor) on Oct 13, 2003 at 05:50 UTC
    Here's an updated version using lookaheads to greatly shorten things. Matching time shouldn't change much.
    my @E = ([1,3],[1,5],[2,3],[2,4],[2,5],[4,5]); my $V = 5; my $verbose = 1; my @all_edges = map { my $x = $_; map { [$x, $_] } $x+1 .. $V } 1 .. $ +V-1; my $string = (join(' ', 1 .. $V) . "\n") x $V . join(' ', map { join "-", @$_ } @all_edges ) . "\n" . join(' ', map { join "-", @$_ } @E ); my $regex = "^\n" . ".* \\b (\\d+) \\b .* \\n\n" x $V . join("", map { my ($x, $y) = @$_; "(?= .* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b + ) \n" } @all_edges) . ".*\\n\n" . join("", map { my ($x, $y) = ($_, $_+1); "(?= .* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b + )\n" } 1 .. ($V-1)) . "(?= .* \\b (?: \\$V-\\1 | \\1-\\$V ) \\b )\n"; print "'$string' =~ /\n$regex\n/x\n" if $verbose; if (my @c = $string =~ /$regex/x) { local $" = " -> "; print "Hamiltonian circuit: [ @c -> $1 ]\n"; } else { print "No Hamiltonian circuit\n"; } __END__ $string = q[ 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 ]; $regex = q[ ^ .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n (?= .* \b (?: \1-\2 | \2-\1 ) \b ) (?= .* \b (?: \1-\3 | \3-\1 ) \b ) (?= .* \b (?: \1-\4 | \4-\1 ) \b ) (?= .* \b (?: \1-\5 | \5-\1 ) \b ) (?= .* \b (?: \2-\3 | \3-\2 ) \b ) (?= .* \b (?: \2-\4 | \4-\2 ) \b ) (?= .* \b (?: \2-\5 | \5-\2 ) \b ) (?= .* \b (?: \3-\4 | \4-\3 ) \b ) (?= .* \b (?: \3-\5 | \5-\3 ) \b ) (?= .* \b (?: \4-\5 | \5-\4 ) \b ) .*\n (?= .* \b (?: \1-\2 | \2-\1 ) \b ) (?= .* \b (?: \2-\3 | \3-\2 ) \b ) (?= .* \b (?: \3-\4 | \4-\3 ) \b ) (?= .* \b (?: \4-\5 | \5-\4 ) \b ) (?= .* \b (?: \5-\1 | \1-\5 ) \b ) ];
    There's no more need for repeating the listing of @all_edges so many times. With lookaheads, it only needs to be there once. Same with the listing of @E. This reduces $string to O(V^2).

    Because backtracking doesn't happen within lookaheads, I couldn't use lookaheads to select the captured vertices. So the listing of vertices 1 to $V still has to be there $V times.

    blokhead

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://298341]
Approved by Aristotle
Front-paged by broquaint
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (10)
As of 2024-03-28 12:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found