bisimen has asked for the wisdom of the Perl Monks concerning the following question:

I was trying to write code that generates all possible combinations for a word of length X, built from any set of numbers/letters.

This is what I came up with. It works for any array of numbers or letters, but it only produces the combinations for a word length of 3. If, say, I wanted a word length of 5, I would need to go and add more counters to the while loop, following the same pattern. I could write one while loop for a word length of 3, one for 4, one for 5, etc., but this will get messy and ugly in the end...

Any way to improve it? Or, is it hopeless and this is something I should be doing very differently? Thanks

    use warnings;

    @array = qw(A T C G);
    $word_length = 3;
    $max = ($#array+1)**$word_length;
    $mainc = 0;
    $count1 = 0;
    $count2 = 0;
    $count3 = 0;

    while ($mainc != $max){
        print $array[$count1];
        print $array[$count2];
        print $array[$count3];
        $count1++;
        if ($count1 == $#array){
            $count1 = 0;
            $count2++;
        }
        if ($count2 == $#array){
            $count2 = 0;
            $count3++;
        }
        if ($count3 == $#array){
            $count3 = 0;
        }
        print "\n";
        $mainc++;
    }

Replies are listed 'Best First'.
Re: More effective way of doing this
by holli (Abbot) on Oct 21, 2017 at 14:41 UTC
    That's a faq.
    Algorithm::Permute::permute { print "@array\n" } @array;


    holli

    You can lead your users to water, but alas, you cannot drown them.
Re: More effective way of doing this
by haukex (Archbishop) on Oct 21, 2017 at 15:08 UTC
Re: More effective way of doing this
by Laurent_R (Canon) on Oct 21, 2017 at 15:43 UTC
    You've been given good answers already, but here's how you could do it if you wanted to write the code rather than using a module or a built-in such as the glob function:
    $ perl -e 'use strict;
    > use warnings;
    >
    > my @array = qw(A T C G);
    > my $length = 5;
    > my @result = @array;
    >
    > add_to_strings($length - 1);
    > print "@result";
    >
    >
    > sub add_to_strings {
    >     my $len = shift;
    >     return if $len <= 0;
    >     my @temp;
    >     for my $string (@result) {
    >         push @temp, $string . $_ for @array;
    >     }
    >     @result = @temp;
    >     add_to_strings ($len - 1);
    > }
    > '
    AAAAA AAAAT AAAAC AAAAG AAATA (... many strings omitted for brevity ...) GGGGA GGGGT GGGGC GGGGG
    Here, the recursive call to add_to_strings is the key for generating arbitrarily nested loops.

    You could do it without recursion, but it's likely to be a bit more complicated.
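As a rough illustration of the non-recursive route, here is a minimal sketch that replaces the recursive call with a plain loop: each pass appends one more letter to every string built so far. The sub name words_of_length and its argument order are made up for this example and do not come from the thread.

```perl
use strict;
use warnings;

# Iterative variant (hypothetical helper, not from the thread):
# start with the one-letter words, then extend every word by one
# letter per pass until the requested length is reached.
sub words_of_length {
    my ($length, @alphabet) = @_;
    return () if $length < 1;

    my @result = @alphabet;            # all words of length 1
    for my $pass (2 .. $length) {
        my @next;
        for my $prefix (@result) {
            push @next, $prefix . $_ for @alphabet;
        }
        @result = @next;               # all words of length $pass
    }
    return @result;
}

my @words = words_of_length(3, qw(A T C G));
print scalar(@words), " words generated\n";
```

Like the recursive version, this keeps the whole list in memory, so it is only practical while the number of letters raised to the word length stays reasonably small.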

    Also note that I have used here a global @result array for simplicity, but you could also pass @result in subroutine calls and returns to make it cleaner.

    Update at 15:51 UTC: this is what the cleaner version might look like:

    use strict;
    use warnings;

    my @array = qw(A T C G);
    my $length = 5;
    my @result = add_to_strings($length - 1, @array);
    print "@result";

    sub add_to_strings {
        my $len = shift;
        my @temp_result = @_;
        return @temp_result if $len <= 0;
        my @temp;
        for my $string (@temp_result) {
            push @temp, $string . $_ for @array;
        }
        add_to_strings ($len - 1, @temp);
    }
    You don't really need @temp_result here; you could traverse @_ directly. I just used @temp_result for clarity, but it entails quite a few extra array copies, which might be inefficient for longer strings.
Re: More effective way of doing this
by LanX (Saint) on Oct 21, 2017 at 15:03 UTC
    Not sure if I understand your question ...

    Like this  @input = glob '{A,C,T,G}' x 5 ?

    For instance: (debugger demo)

    DB<4> x glob '{A,C,T,G}' x 2
    0  'AA'
    1  'AC'
    2  'AT'
    3  'AG'
    4  'CA'
    5  'CC'
    6  'CT'
    7  'CG'
    8  'TA'
    9  'TC'
    10  'TT'
    11  'TG'
    12  'GA'
    13  'GC'
    14  'GT'
    15  'GG'
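To use the same trick with an arbitrary alphabet and length, the brace pattern can be built from the array. This is only a sketch: the sub name glob_words is made up here, and it assumes the letters contain no whitespace, commas, braces or other glob metacharacters, since those would change what the pattern means.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the glob trick shown above.
# Assumption: every element of @alphabet is a plain token without
# whitespace, commas, braces or wildcard characters.
sub glob_words {
    my ($length, @alphabet) = @_;
    my $pattern = ('{' . join(',', @alphabet) . '}') x $length;
    my @words = glob $pattern;         # list context: all expansions at once
    return @words;
}

my @input = glob_words(2, qw(A C T G));
print "$_\n" for @input;
```

The result is collected into an array first because glob behaves as an iterator in scalar context, which is easy to trip over when the call is buried in a return statement.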

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

Re: More effective way of doing this
by Laurent_R (Canon) on Oct 21, 2017 at 18:02 UTC
    Other monks and I have already provided some solutions.

    But I think you might benefit from a couple of comments on your code.

    Any way to improve it?
    Yes.

    First, you should have this:

    use strict;
    near the top of your script and declare your variables with the my function.

    Second, you should try to properly indent your code. The computer doesn't care, but human readers do.

    Third, using variables named count1, count2 and so on is usually a red flag. You should probably use an array of values. This is not sufficient to make your algorithm able to handle word lengths other than 3, but it is certainly a precondition to make this possible (within the context of your code).

    So, as a first step, your code could be rewritten as follows:

    use warnings;
    use strict;

    my @array = qw(A T C G);
    my $word_length = 3;
    my $max = ($#array+1)**$word_length;
    my @count;
    $count[$_] = 0 for 1..$word_length;
    my $mainc = 0;

    while ($mainc != $max){
        print $array[$count[$_]] for 1..$word_length;
        $count[1]++;
        if ($count[1] == $#array){
            $count[1] = 0;
            $count[2]++;
        }
        if ($count[2] == $#array){
            $count[2] = 0;
            $count[3]++;
        }
        if ($count[3] == $#array){
            $count[3] = 0;
        }
        print "\n";
        $mainc++;
    }
    Now, if you look at the three if statements, you might be able to use a counter (in a loop) as an index for the @count array instead of hard coding the indices as I have done above, thereby making it possible to manage strings of arbitrary lengths.

    You should probably try to do it by yourself and avoid looking at the solution below:

    Note: I really prefer the solution in my previous post because it is much easier to check the result (because it is in logical order). And I don't quite understand what you're doing in your program.

    Update at 18:53 UTC: I just noticed that the result is not correct, because the algorithm of your original program is not correct (see this post: Re: More effective way of doing this below). I won't attempt to fix it, though, because, as noted earlier, I don't really understand how your original program is supposed to work.

Re: More effective way of doing this
by Laurent_R (Canon) on Oct 21, 2017 at 18:51 UTC
    Hi bisimen,

    your program does not work properly even for 3-letter strings. Sorting the output shows that you have duplicates as well as missing words (only showing here words starting with the letter A):

    AAA AAA AAA AAC AAC AAT AAT AAT ACA ACA ACA ACC ACC ACT ACT ATA ATA ATA ATC ATC ATT ATT (...)
    I did not fully understand your algorithm, but thought you had tested it and that it was correct.
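One plausible reading of the original program is an odometer: each counter is an index into the letter array, and a counter that runs past the last letter resets to 0 and bumps its neighbour. Under that reading, the duplicates seem to come from comparing the counters against $#array (the last index, 3 here) rather than the number of elements (4), so every counter wraps one step early and the last letter is never printed. The sketch below is an assumption about the intent, not a fix taken from the thread; the sub name odometer_words is made up.

```perl
use strict;
use warnings;

# Odometer-style generator (hypothetical, illustrating one possible
# reading of the original algorithm). @count holds one index into
# @alphabet per position of the word.
sub odometer_words {
    my ($length, @alphabet) = @_;
    return () if $length < 1 or !@alphabet;

    my @count = (0) x $length;
    my @words;
    while (1) {
        push @words, join '', @alphabet[@count];

        # Advance the rightmost position. A position wraps only once it
        # reaches the number of letters (scalar @alphabet, not $#alphabet),
        # and a wrap carries over to the position on its left.
        my $pos = $length - 1;
        while ($pos >= 0 and ++$count[$pos] == @alphabet) {
            $count[$pos] = 0;
            $pos--;
        }
        last if $pos < 0;              # every position wrapped: done
    }
    return @words;
}

my @words = odometer_words(3, qw(A T C G));
print "$_\n" for @words;
```

Because the carry is handled in a loop over positions instead of one hard-coded if per counter, the same code works for any word length.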