I'd seen Abigail do nasty things with regexes before, but never really understood them at all. However, in the recent N-Queens solver, it was a pure regex (backrefs only), there were no (?{}) constructs, and I was finally able to understand how Abigail does it..

Then, on a regex kick, I discovered Abigail's pure regex 3-SAT reduction. If you haven't already seen it, it's the coolest couple of lines on the planet. With all this regex excitement, I decided to try my hand at solving some NP-complete problem(s) with pure regexes.

I tried a handful of different NP-complete problems, but after a few emails with MJD, I understood why these attempts went wrong. I kept getting hung up on the string and regex size being exponential on the size of input (not meaning my solution was incorrect, but invalidating its use as an NP-completeness reduction proof). Finally, I think I may have gotten it right with Hamiltonian Circuits. Abigail has done this before using extended regexes, but claimed to be stuck on a pure regex solution that wasn't exponential.

So without further ado, here's my solution. Given a (simple, undirected) graph with E edges and V vertices, it finds a Hamiltonian Circuit (a cycle that visits every vertex exactly once, starting and ending in the same place). The size of the string it creates is bounded by O(V^4), and the size of the regex it creates is bounded by O(V^2).

my @E = ([1,3],[1,5],[2,3],[2,4],[2,5],[4,5]); my $V = 5; my $verbose = 1; my @all_edges = map { my $x = $_; map { [$x, $_] } $x+1 .. $V } 1 .. $ +V-1; my $string = (join(' ', 1 .. $V) . "\n") x $V . "\n" . (join(' ', map { join "-", @$_ } @all_edges ) . "\n") x @all_edges . "\n" . (join(' ', map { join "-", @$_ } @E ) . "\n") x $V; my $regex = "^ " . ".* \\b (\\d+) \\b .* \\n\n" x $V . "\\n\n" . join("", map { my ($x, $y) = @$_; ".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* +\\n\n" } @all_edges) . "\\n\n" . join("", map { my ($x, $y) = ($_, $_+1); ".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* +\\n\n" } 1 .. ($V-1)) . ".* \\b (?: \\$V-\\1 | \\1-\\$V ) \\b .* \\n \$\n"; print "'$string' =~ /\n$regex\n/x\n" if $verbose; if (my @c = $string =~ /$regex/x) { local $" = " -> "; print "Hamiltonian circuit: [ @c -> $1 ]\n"; } else { print "No Hamiltonian circuit\n"; }
Now the dirty details.. Here's the value of $string and $regex for the example graph included:
$string = q[ 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 1-3 1-5 2-3 2-4 2-5 4-5 ]; $regex = q[^ .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n .* \b (\d+) \b .* \n \n .* \b (?: \1-\2 | \2-\1 ) \b .* \n .* \b (?: \1-\3 | \3-\1 ) \b .* \n .* \b (?: \1-\4 | \4-\1 ) \b .* \n .* \b (?: \1-\5 | \5-\1 ) \b .* \n .* \b (?: \2-\3 | \3-\2 ) \b .* \n .* \b (?: \2-\4 | \4-\2 ) \b .* \n .* \b (?: \2-\5 | \5-\2 ) \b .* \n .* \b (?: \3-\4 | \4-\3 ) \b .* \n .* \b (?: \3-\5 | \5-\3 ) \b .* \n .* \b (?: \4-\5 | \5-\4 ) \b .* \n \n .* \b (?: \1-\2 | \2-\1 ) \b .* \n .* \b (?: \2-\3 | \3-\2 ) \b .* \n .* \b (?: \3-\4 | \4-\3 ) \b .* \n .* \b (?: \4-\5 | \5-\4 ) \b .* \n .* \b (?: \5-\1 | \1-\5 ) \b .* \n $ ]; __OUTPUT__ Hamiltonian circuit: [ 5 -> 4 -> 2 -> 3 -> 1 -> 5 ]
Here's how it works:

There you have it. Your feedback is welcome. I really hope I haven't made any mistakes in my analysis. This code isn't one-tenth as beautiful and elegant as Abigail's 3-SAT reduction, and perhaps it could be improved upon. But hey, we all have to start somewhere ;)

Update: modified code so that $string is a little clearer to read. Instead of comma-separating the edge and vertex lists, they are separated by spaces. $regex is modified accordingly, matching on \b where appropriate, instead of a comma.

blokhead


In reply to Pure regex Hamiltonian Circuit solution by blokhead

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.