in reply to True Brute Force Longest Common Sub Sequence

If I understand your question correctly, you want to know how many subsequences are in a given sequence? So for a S1="abcde" you can have subsequences of a,ab,abc,abcd,abcde,b,bc,bcd,bcde,c,cd,cde,d,de,e. Is that right?

Then it's a straight function of length. Because a subsequence can start on any letter and run from between 1 character until the rest of the length long.

In the previous example the sequence is 5 long. And there are 5 subsequences beginning with a. There are 4 subsequences starting with b. 3 with c and so on. Number of subsequences: 5 + 4 + 3 + 2 + 1

So the number of subsequences is the sum from 1 to length. 1 + 2 + 3 + . . . + lengthOfString = # of subsequences.

--
I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.
  • Comment on Re: True Brute Force Longest Common Sub Sequence

Replies are listed 'Best First'.
Re^2: True Brute Force Longest Common Sub Sequence
by Anonymous Monk on Nov 25, 2007 at 06:51 UTC
    Subsequencea can also be ac, acd, ace, ade, adh, etc. all possible permutations of string S1, should be 2^n possible subsequences. How would I obtain this using Perl?
      You could try this:

      $ cat 652805.pl #!/usr/bin/perl -w # # Generate all possible subsequences of sequence 1 # use strict; use warnings; # Sequence 1 my @seq_1 = ('a', 'b', 'c', 'd', 'e'); # Count of all possible subsequences my $s1_max = 2 ** @seq_1; print "Number of possible subsequences: ", $s1_max,"\n"; # To get each possible subsequence, use a binary number # as a mask to tell you which digits to print for my $seq (0 .. $s1_max-1) { print "Sequence ", $seq, ": "; my $bit = 0; while ($bit < @seq_1) { print $seq_1[$bit], " " if 2**$bit & $seq; ++$bit; } print "\n"; }
      Not the most efficient solution, but I tried to make it easy to read for you...

      ...roboticus

      No, it cannot.

      "Sequence" means that you cannot skip characters in your string. Otherwise it would just be permutations and not sequences.

      See BrowserUK's comments hereafter.

      CountZero

      A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        "Sequence" means that you cannot skip characters in your string.

        What you are describing would be a contiguous sequence. Otherwise known as a 'substring'. Skipping characters is permitted, it is their relative positions that makes it a sequence.

        Contrast LC subsequence with LC substring.

        For the subsequence problem, the characters in the solution must

        • appear in both strings. (common)
        • appear in the same relative ordering in both strings and the solution. (sequence).
        • Be longer than any other set of characters that complies with the above two conditions. (longest)

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

      Hrm. I wouldn't call that a sequence, but ok. You problem is still a function of length and nothing more.

      It's the sum of 2 ^ (remaining length), from 1 to length. Since we can have any character on or off (essentially).

      So you have something length 5, for example. That'd be 1 * 2^0 + 2 * 2 ^1 + 3 * 2^2 + 4 * 2 ^ 3 + 5 * 2 ^4. Just looking at that sequence should give you a good idea of how to right a loop around it.

      --
      I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.