BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:
An 8x8 grid; 6 token types (letters, numbers, colors etc.). Filling the grid with tokens such that there are no more than two of any token adjacent horizontally or vertically. Ie.
F F B B C C D D A B C D E F A B A B C D E F A B F E D B C A F E F F E F F E F F A B C D A B C D C D E F C D E F A A F F A A C C
Generating one is messy but simple. Enumerating them is considerably harder; and would probably take a very long time.
I'd like to know how many legal patterns there are? But I can't wrap my head around how to calculate it.
Any hints or leads appreciated.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Pattern enumeration.
by roboticus (Chancellor) on Jul 27, 2010 at 00:11 UTC | |
To enumerate the series, I think I'd use an approach of treating the board as a vector of 64 symbols[1], set to an arbitrary legal pattern. Then each time you want the next board, advance the first position, advancing the next digit each time the current digit rolls over. The advance code would skip a symbol if the next two positions in the row use that symbol. Similarly, it would skip the symbol if the next two column positions already contain that symbol.
Notes: [1] To simplify the position checking logic, I'd actually make it a larger vector so I can create a border (two wide) on the right and bottom of the board. The border would hold an invalid symbol to make the symbol checking simple. Thus, you could disallow a symbol if:
[2] After further reflection, I find I'm counting excluded boards multiple times, because there will be boards with multiple rows/columns-of-three. I'll give it another minute or two to see if I can refine it a bit. But I think that at least that would give you a lower bound on the number of boards... ...roboticus Update: I moved the last paragraph up before the Notes section, where it was supposed to be. Update 2: Added note [2] Update 3: Gah! I had my exponents consistently backwards (64^6 instead of 6^64). Update 4: Struck out the totally useless paragraph. In this case, a back-of-the-envelope estimate should have remained on the envelope! | [reply] [d/l] |
by tye (Sage) on Jul 27, 2010 at 07:52 UTC | |
Sorry, but your calculation is significantly too simple. An illegal board can fail more than just one constraint. Your 2*48*6*6**61 enumerates the possible paths for a simple algorithm that is guaranteed to produce an illegal board. But a huge number of those paths are just different ways of generating the same board (if a board violates 6 different constraints, then your equation will count that board 6 different times). So you have over estimated the number of illegal boards rather significantly. This is rather easy to see by noting that your estimated number of legal boards comes out to -360*6**61. A negative number of possibilities is clearly the wrong answer. I've computed counts for similar sets in the past. It gets very complicated very quickly. I'll try to write up at least the first part of such a calculation but I doubt I'll be posting that very soon. :) You could get an approximate answer by generating a large number of boards at random and calculating the average number of constraints violated by a random illegal board and then using that number to divide the number you originally calculated. - tye | [reply] |
by BrowserUk (Patriarch) on Jul 27, 2010 at 09:35 UTC | |
You could get an approximate answer by generating a large number of boards at random Generating random boards and then just checking for pass/fail very quickly converges to a figure of 8.9%; giving a approximation of 5.657e+48 legal boards. I don't understand your "calculating the average number of constraints violated by a random illegal board" bit. I do realise that for any given legal board, there are 720 "symmetries", where the arrangement of tokens is the same, but the actual tokens are different. Um. Not a good description. Ie. The following are symmetries because the pattern of the tokens remains the same, though the values of the tokens are different:
For my purpose, reflection and rotational symmetries are different. So, I think that once I've calculated the total number of legal arrangements, I divide by 720 to determine the number of patterns? Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
by ikegami (Patriarch) on Jul 27, 2010 at 18:39 UTC | |
by BrowserUk (Patriarch) on Jul 27, 2010 at 20:39 UTC | |
by roboticus (Chancellor) on Jul 27, 2010 at 10:33 UTC | |
tye: I see what you mean by "It gets very complicated very quickly". I tried to make a new calculation for the number of legal boards, and quickly got bogged down while trying to make sure I didn't count boards multiple times. After working on the problem for a while, I can confidently state the following: I can't afford the amount of coffee, scratch paper, erasers and pencils required for me to figure it out! ...roboticus | [reply] |
by BrowserUk (Patriarch) on Jul 27, 2010 at 08:04 UTC | |
Hm. 2 * 48 * 6 * 6^61 = 1.6890743110126207388309943185817e+50 6^64 = 6.3340286662973277706162286946812e+49 Subtracting the former from the latter is -1.0556714443828879617693714491129e+50? What I think I have:Take one row; ignoring the vertical component for the moment, at each iteration: Giving ( 6 * 6 * 5.833333333^6 ) ^ 8 = 2.220030530726145590197494583735e+47 However, for the 3rd and subsequent rows, the choice at each position is further constrained by the existing values for the previous two cells in the column. For the first two columns, 30/36 the choice will be 6; 6/36 it will be 5 giving the familiar 5.8333 average. But for the 3rd and subsequent columns, it gets messy. I thought that it might be : 28.1944 / 5.8333^2 choice is 6; 5.83333 / 5.8333^2 choice is 5; giving an average of: 5.82857142 For each of the 6x6 cells in the bottom right. For an overall calculation of: ( 6**2 * 5.833333333333333333333333333333**6 )**2 * ( ( 5.833333333333333333333333333333**2 * 5.8285714285714285714285714285714**6 )**6 ) = 1.13460446277538e+049 But that doesn't make sense because it is bigger than the previous result constrained in only one direction :(. In either case, it is far to big a number to actually iterate. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] |
|
Re: Pattern enumeration.
by NetWallah (Canon) on Jul 26, 2010 at 23:44 UTC | |
This is similar to the classic "knight moves" chess problem.
I'd suggest a single-dimension array with 8 elements. With this structure, the problem reduces to identifying rules to generate legal adjacent bit patters. It's late in my day - brain too fried to attempt that now - I'm sure wiser, more alert monks will oblige. Cheers.
Update:Sigh .. - it is even later in the night, but I'm bothered because the representation above is not right. So, a single dim array, where each element holds 24 bits can represent your problem. If you shift items in sequences of 6 bits, you can identify legal vertical patterns. The same rules can apply to adjacent indexes masking out the right bits. Brain still to fried to conceptualize the actual patterns. Syntactic sugar causes cancer of the semicolon. --Alan Perlis | [reply] |
|
Re: Pattern enumeration.
by ikegami (Patriarch) on Jul 27, 2010 at 18:28 UTC | |
The number of patterns if phenomenal.
Here's a trivial solution for 8x8 with at most 6 symbols:
I'm using the regex engine to generate solutions for speed:
Update: I thought wasn't allowed. Fixed. | [reply] [d/l] [select] |
by BrowserUk (Patriarch) on Jul 27, 2010 at 20:59 UTC | |
I'm using the regex engine to generate solutions for speed: Hm.
I read that as 230 microseconds per iteration. Not sure if that is generated or tested good, but even if the latter; and assuming a (way) underestimate of 5^64; then 3 779 192 301 486 978 976 247 237 055 279 666 years. I don't think that throwing threads, or even a google, at it will help any :) Maybe a few giant red spots. | [reply] [d/l] |
by ikegami (Patriarch) on Jul 27, 2010 at 21:24 UTC | |
I started playing with a recursive solution that builds grids from smaller grids. Bonus: This method naturally omits permutations. For example, given One can overlap the 2x2 grids to form:
I think it might be possible to turn this into an efficient solution for counting the arrangements instead of generating them. That's all I had time to do for now. Update: At time of writing, I thought wasn't allowed. The concept still applies, although there would be a lot more possible grids. | [reply] [d/l] [select] |
by BrowserUk (Patriarch) on Jul 27, 2010 at 22:25 UTC | |
by JavaFan (Canon) on Jul 27, 2010 at 23:39 UTC | |
| |
|
Re: Pattern enumeration.
by JavaFan (Canon) on Jul 27, 2010 at 12:48 UTC | |
Mathworld has something to say about the number of coloured graphs where no two colours may be adjacent (which is more restrictive than your problem). It suggest that using brute force to calculate the answer to your problem isn't viable. | [reply] |
by ambrus (Abbot) on Jul 29, 2010 at 16:27 UTC | |
No, what you write is not the same problem I think. The original question only bans three adjacent cells of the same color in a row or a column, not L-shaped. | [reply] |
|
Re: Pattern enumeration. (ugly)
by tye (Sage) on Jul 28, 2010 at 06:02 UTC | |
Here is a rather informal presentation of the start of a way to come up with an exact count, at least in theory.
Divide the space of all possible fillings into 97 buckets:
Fill the grid, counting the possibilities until you get to a point when it becomes possible to determine if a constraint has been violated. At this point, you must split the building total into two: Those that violate the next constraint and those that don't. You take the first number and multiple it by 6**(number of unfilled locations) and that is the total value for B$N. The second "number" is used to continue calculations, filling in more buckets. Start with 6**64. You can trivially calculate and subtract B1 to get much closer to the number of legal fillings. A little bit more work lets you subtract B2 to get a little closer. After a lot of work, you can subtract B6. Each subsequent bucket size calculation gets several-fold more complex to calculate but also gets less significant. Perhaps you get to a point where you are close enough or just extrapolate an approximation from the patterns you start to observe. Anyway, I track the exact calculations as a list of products. Sum up all of the products (except the first one) and if they don't equal the first product, then you made a mistake somewhere. Start with an obvious choise of C1..C6 being the constraints that apply to the first row. Look at ways to fill just the first 3 locations:
But it gets more complicated than that in the next step so we preplan for dealing with C2:
Now expand this to cover the first 4 locations:
The items under B3,4 get expanded under B3,5 in one way and under B0,5 another way. The items under B0,4 get expanded under B4,5 in one way and under B0,5 another way:
Read more... (4 kB) Which gives us:
Which shows that the values to be subtracted don't decrease in size very fast at all so far:
Our approximations at B0 aren't converging very fast:
Which means that the number of boards that don't violate any horizontal contraints is:
Yes, of course there are patterns in the above work that can be generized or you can write code to get to this point after you've seen how each step goes. But, trying to deal with the vertical constraints make the previous work look down-right simple by comparison! So any generalizations from the above work would likely have to be thrown away in our next phase anyway. Although there are nearly 2e49 boards to search through, we represent about 1.4e6 of those with just 21 short products. So, by the time we have 8 rows, we might have somewhere remotely close to 21**8 products to keep track of. 37,822,859,361 arrays is not going be something most computers will easily handle. But that at least might be possible versus trying to enumerate 2e49 possibilities. Probably best to drop the "A,A,B,C,C,D" notation in favor of two characters per cell chosen from:
Using '<' or '^' means that a 1 will go in the corresponding spot in the product. '|-' means a 6 goes in. ']-' or '|u' means a 5 goes in. ']u' means either 4 or 5 goes in, depending on whether the loc above is '<' or ']' (if the loc above is '|', then you need to split the case into two separate products). The list of products for a single row now look like this: Read more... (1416 Bytes) We can sort the above items better. Note that the pattern is that we can't have two 1s adjacent and each row ends in either 5,6 or 1,5:
Let's add the next location, separating between "might violate first vertical constraint" from "can't violate first vertical constraint": Read more... (3 kB) Then the real fun begins after you decide whether to fill in the next item to the right or below. Note that you can re-use your list of 21 products in order to compute for a whole extra row in one sitting, now. But that is no small task. Let's just quickly look at trying to append our last product to itself:
Oops, we have to split that second case above into two cases already:
Lots of fun. Updated, very shortly after posting to replace a couple of wildly wrong calculations and to make one minor wording tweak. - tye | [reply] [d/l] [select] |
|
Re: Pattern enumeration.
by salva (Canon) on Jul 28, 2010 at 11:28 UTC | |
Read more... (8 kB)
It can solve the 6x6 case, with 6 symbols in two minutes in my computer:
Read more... (1347 Bytes)
(I believe) Its complexity is O(N22NS2N) where NxN is the grid size and S the number of symbols, so extrapolating, it should be able to solve the 8x8,6 case in Note that I had to use multi-precision arithmetic in order to store the pattern counters and that the results are printed in hexadecimal. Discovering the logic behind is left as an exercise to the reader ;-) Update: Complexity and time estimations corrected. Result for 7x7,6 included. | [reply] [d/l] [select] |
by BrowserUk (Patriarch) on Jul 28, 2010 at 11:58 UTC | |
It can solve the 6x6 case, with 6 symbols in two minutes in my computer.... it should be able to solve the 8x8,6 case in 5000 minutes, Um. 6x6x6 is 6^36; 8x8x6 is 6^64. 6^64 / 6^36 = 6140942214464815497216 *2 minutes = A very long time :) Discovering the logic behind is left as an exercise to the reader ;-) Of course, maybe it is a non-linear algorithm. But it will take me a while to discover the logic :) Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] |
by salva (Canon) on Jul 28, 2010 at 12:29 UTC | |
Um. 6x6x6 is 6^36; 8x8x6 is 6^64. 6^64 / 6^36 = 6140942214464815497216 *2 minutes = A very long time :) That would be true if I had used the brute force algorithm that has complexity O(SN2) or worse... but I had not used it!
My solution has complexity O(N22NS2N) so, roughly, t7x7,6 = k*72*27*62*7 = 139min, so k = 139/A and t8x8,6 = k*82*62*8 = 180551034077184*139/A = 13000min | [reply] |
by BrowserUk (Patriarch) on Jul 28, 2010 at 13:22 UTC | |
by salva (Canon) on Jul 28, 2010 at 20:29 UTC | |
| |
by BrowserUk (Patriarch) on Jul 30, 2010 at 15:44 UTC | |
by salva (Canon) on Aug 02, 2010 at 06:46 UTC | |
| |
|
Re: Pattern enumeration.
by ambrus (Abbot) on Jul 29, 2010 at 15:33 UTC | |
I guess there must be a way to calculate this precisely in a reasonable time, but if you only want an approximate answer, that's much easier. Just use sampling. I made the computer fill the board with random letters 10_000_000 times, and 896_728 out of those were correct (no three adjacent letters in a row or column). As there are 6.33e49 possible boards, this means there are approximately 5.68e48 correct boards. | [reply] |
by BrowserUk (Patriarch) on Jul 29, 2010 at 16:33 UTC | |
The problem is, it converges very slowly. After 1 billion trials, it has still only rendered (at most) 5 significant digits. And plus or minus 1 place in that 5th digit is still a stunningly large margin of error:
Another problem is that 1 billion (or even 1 trillion) trials is such a minuscule sample that I'm not at all convinced that even those 5 digits are accurate. That figure is produced using Perl's built-in rand, which on my platform equates to MSC's hopelessly inadequate 15-bit linear congruential generator, with its known cyclic flaws. I substituted Math::Random::MT and got substantially different results, but the MT, whilst a vastly better PRNG, is really very slow by comparison. Generating and testing an adaquate sample, say a quadrillion, would take a very long time also. Luckily, salva came to the rescue with a very clever piece of C code that with luck will render an exact answer within the next 10 or so hours. I'm also convinced that it can be done (very quickly) using purely numerical methods. Though it is considerably more complicated than I first thought. I just found a ridiculous typo(*) that had been hanging me up for two days. Maybe I'll make some progress now. (*)I have variables $x_1, $x_2, $y_1, $y_2 to represent the values 1 and 2 to the left; and 1 & 2 above the current position. I thought they were vaguely mnemonic for X-1 etc. Trouble is, in one place I typed $x-1 instead of $x_1 by accident, and whilst nothing appeared broken, the results were rubbish. I must have looked right past that typo a hundred times without spotting it! Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
by ambrus (Abbot) on Jul 29, 2010 at 17:07 UTC | |
Sure. It's a binomial distribution with probability of an individual board being good around 0.09, so the divergence of the ratio of good boards you get from n samples is around 0.3/sqrt(n), which is about 9e-6 for n=1e9, so you should expect only four correct significant digits. | [reply] |