redbeard has asked for the wisdom of the Perl Monks concerning the following question:

This question may be a smidgen out of place, but as I am working in perl, I thought this a good place to go for the answer. By way of introduction, I'm working on a computer similation in perl of overgrowth in sessile colonial invertebrates on subtidal rock walls. I'm using an object oriented approach, so, Colony is one of my objects. Each colony has a list of cells it currently occupies. Currently its just an array of paired values. So here's the question:

As any individual cell the colony occupies may be overgrown, what is an efficient way to tell whether or not a particular instance of overgrowth has severd the formerly single colony into two new colonies? I feel like there must be an algorithm for this, or perhaps a more efficient data structure for the stroage of occupied cells than an array of paired values. Any suggestions/code snippets/etc?

Replies are listed 'Best First'.
Re: colony splitting
by Zaxo (Archbishop) on Mar 13, 2003 at 00:48 UTC

    If I understand the problem correctly, I think that Graph will help. Build a graph object with cells as vertices and insert edges to each neighbor cell containing the same type population. Then you will be able to use Graph's methods to extract and study connected components.

    After Compline,
    Zaxo

      Hrm - yeah, that seems as if it would help...I guess a recursive scroll through all successor vertices? One could then check to see if all successive vertices generated where in the array of listed vertices, and any leftover would be turned out into a new colony. This doesn't sound terrible efficient....although it may be the only way. I guess infinite looping could be avoided by making this a directed graph....although that would mean that if certain vertices are eliminated, new colonies that should be merged with the old would be created, which I'm not entirely comfortable with. If it were an undirected graph, that sort of algorithm would seem to have the potential to loop back on itself, which worries me a tad. Any suggestions?
        One question; and I hope it doesn't throw a big monkey wrench into the problem for you; but what about formerly seperated colonies re-connecting?

        a a a
        a b a -- time=0
        a a a


        a a a
        b b b -- time=1
        c c c


        a a a
        b a b -- time=2
        c c c


        If the newly formed colony is shown as seperate (as above), then at time=1 they split into 3 colonies, not 2... they then remain 3 colonies through time=2. If, on the other hand, inheritance is an important part of your simulation (genetic code, etc..); then they would be time(0)=2 colonies, time(1)=3 colonies, and time(2)=2 colonies again...that might make it a bit more difficult to establish exactly which colonies are actually seperate from time frame to time frame.


        Are the individual colonies tagged with their appropriate parentage? That might make it a little easier to determine when a previously seperate colony was reconnected.

Re: colony splitting
by Cabrion (Friar) on Mar 13, 2003 at 00:10 UTC
    Sounds a lot like cellular automata, Conway's "game of life".

    As for detecting splits in colonies, you will have to give a much better description of how they grow (like coral) and what consitiutes a split.

      It is a cellular automata model, in fact. As for how they grow, its a simple stochastic process where once a cell overgrows another, it has a probability of success. If it succeeds, then one more cell location is added to the colony object, and the cell in the environment is set to the value of the colony. As to what constitutes a split, it is if two groups of cells which were once contiguous are no longer contiguous. E.g. I have two "species" in the following 3x3 environment
      time=0 a a a b a b a a a time=1 a a a b b b a a a
      So, at t0, we have three colonies in the system, at t1, we have four, as the initial colony of species a is now two colonies due to the central link being overgrown by species 2, and there remain two colonies of b. Of note is that this is the similar to the following situation:
      time=0 a a a a b a a a a time=1 a a a b b b a a a
      where the colony b in the center has expanded in two directions and severed both links. At t1 we have three colonies. As a sidenote as to why this is important to the simulation, colonies have aggregate properties depending on their size. Were I to not keep track of individual colonies, later dispersed cells would get the benefits of a larger colony size, even if they are no longer connected.

      UPDATE: Note that in situation 1, there are three colonies at t0 and four at t1, not two at t0 as I had previously stated. The two cells labeled b are seperate colonies and do not merge at t1. Sorry for the confusion.
        Hi readbeard,
        Please explain, how do you calculate the number of colonies in a given environment.
        You mentioned in the first 'situtation':
        • Time t0 => 2 colonies
        • Time t1 => 3 colonies
        This is only possible if you don't count single cell as one colony. Please verify that it is the truth.

        From my perspective, in the first situation at t0 and t1, both have 3 colonies.

        • Time t0, we have 7 => 'a' , 1 => 'b' and 1 => 'b'
        • Time t1, we have 3 => 'a', 3 => 'b' and 3 => 'a'
        If my perspective is correct, then your problems is that how many 2-D groups you can find for given 'environment' at different timings. There may be well known algorithms for this. What I recall is the TileFall perl Program from PerlPress. See how the initial groups are calculated and you may find your algorithm from there.

        I also suggest to research Graphical algorithm index and to get some valuable hints.

        artist

        Do you also need to deal with this and more complex variations of the same?
        time=n c a a b c a b b c