in reply to Looking for Printing all possible combinations

Sorry this isn't newbie friendly:

open my $f1, '<', \<<'';
string1 (C)C(T)A
string2 T(A)GG(A)GGG(G)

open my $f2, '<', \<<'';
string1 1 C A
string1 3 T C
string2 5 A T
string2 9 G A
string2 2 A C

my %h = map split, <$f1>;
tr/()//d, $_ = [split //] for values %h;
while (<$f2>) {
    local $" = ',';
    my ($k, $i, @combo) = split;
    $h{$k}[$i-1] = lc "{@combo}";
}
for my $k (sort keys %h) {
    local $" = '';
    while (<@{$h{$k}}>) {
        s/([a-z])/(\u$1)/g;
        print "$k $_\n";
    }
}

Outputs:

string1 (C)C(T)A
string1 (C)C(C)A
string1 (A)C(T)A
string1 (A)C(C)A
string2 T(A)GG(A)GGG(G)
string2 T(A)GG(A)GGG(A)
string2 T(A)GG(T)GGG(G)
string2 T(A)GG(T)GGG(A)
string2 T(C)GG(A)GGG(G)
string2 T(C)GG(A)GGG(A)
string2 T(C)GG(T)GGG(G)
string2 T(C)GG(T)GGG(A)
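For anyone puzzled by the inner loop: it appears to lean on the brace expansion done by Perl's glob, which is what the `<...>` operator falls back to when its contents are not a plain filehandle or scalar. Below is a minimal sketch of just that step for string1. The `@slots`, `$pattern` and `@combos` names are made up for illustration and are not from the script above.

```perl
use strict;
use warnings;

# Slots for "string1 (C)C(T)A" once the File2 alternatives are filled in.
# Lowercase letters mark the positions that vary (hypothetical variable name).
my @slots = ('{c,a}', 'C', '{t,c}', 'A');

# Join the slots with no separator into one brace pattern: "{c,a}C{t,c}A".
my $pattern = join '', @slots;

# glob() expands every {x,y} group into each alternative. The pattern has no
# wildcards, so the expanded strings themselves come back.
my @combos = glob $pattern;

# Turn the lowercase markers back into bracketed uppercase letters.
s/([a-z])/(\u$1)/g for @combos;

print "string1 $_\n" for @combos;
```

This should print the same four string1 lines as the output above, in the same order.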

Re^2: Looking for Printing all possible combinations
by sarkar (Initiate) on Feb 14, 2015 at 00:42 UTC
    Hello Anonymous Monk,

    Thank you very much. I completely agree with you: it is not at all newbie friendly. But I have one more question.

    For example, in File 1 I gave string2 as T(A)GG(A)GGG(G). But my original file has strings with many more bracketed positions, such as TAA(A)G(T)G(A)GGAG(G)CCA(A); an example is provided below. What should I modify in the code? Also, my File2 is really big, with a million entries.

    If my file1:

    string1 (C)C(T)A
    string2 T(A)GG(A)GGG(G)
    string3 T(A)GG(A)GGG(G)AAAAAAA(C)ACT(G)
    string4 TAA(A)G(T)G(A)GGAG(G)CCA(A)

    What would you suggest?

      Both this solution and my solution support any number of brackets. Why haven't you tried it?
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
        Hello choroba,

        Yes, I have tried it, and the code you gave works perfectly with the example above. Thanks very much :) I very much appreciate it.

        But the problem I am facing is in my File2, which has many more cases. For example, string1 in File1 has two positions highlighted (1st and 3rd), but File2 may mention more positions. For instance, "string1 20 A T" (3rd line) is out of range and has to be omitted.

        File1:

        string1 (C)C(T)A
        string2 T(A)GG(A)GGG(G)

        File2:

        string1 1 C A
        string1 3 T C
        string1 20 A T /* This is out of range, and has to be omitted */
        string2 2 A C
        string2 5 A T
        string2 9 G A
        string2 30 A C /* This is out of range, and has to be omitted */
        string3 9 G A /* This string is not in the main file, so it has to be omitted */

        With a File2 like this, the code does not work. And my File2 has a billion lines.
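One possible way to handle such a File2 is sketched below, under some assumptions. The likely cause of the trouble is that the loop in the script at the top of the thread assigns to `$h{$k}[$i-1]` unconditionally, so an unknown name or an out-of-range position silently creates new entries. The sketch skips those records instead. The helper names `load_sequences`, `apply_record` and `combinations` are invented for illustration, the sample arrays stand in for the two files, and an explicit loop replaces the glob trick, so this is not the original poster's method.

```perl
use strict;
use warnings;

# Sample data standing in for File1 and File2 (values taken from the post above).
my @file1 = ('string1 (C)C(T)A', 'string2 T(A)GG(A)GGG(G)');
my @file2 = (
    'string1 1 C A', 'string1 3 T C', 'string1 20 A T',
    'string2 2 A C', 'string2 5 A T', 'string2 9 G A',
    'string2 30 A C', 'string3 9 G A',
);

# Build name => [slots]; each slot is an array ref of alternatives.
# Brackets are stripped, as in the script at the top of the thread.
sub load_sequences {
    my %h;
    for (@_) {
        my ($name, $seq) = split;
        next unless defined $seq;
        $seq =~ tr/()//d;
        $h{$name} = [ map { [$_] } split //, $seq ];
    }
    return \%h;
}

# Apply one File2 record; return false if the record was skipped.
sub apply_record {
    my ($h, $line) = @_;
    my ($name, $pos, @alts) = split ' ', $line;
    return 0 unless defined $pos && $pos =~ /^\d+$/ && @alts;
    return 0 unless exists $h->{$name};                  # name not in File1
    return 0 if $pos < 1 || $pos > @{ $h->{$name} };     # position out of range
    $h->{$name}[$pos - 1] = [ map { "($_)" } @alts ];
    return 1;
}

# Expand the slots into every combination, leftmost slot varying slowest.
sub combinations {
    my ($slots) = @_;
    my @out = ('');
    for my $slot (@$slots) {
        @out = map { my $prefix = $_; map { $prefix . $_ } @$slot } @out;
    }
    return @out;
}

my $h = load_sequences(@file1);
apply_record($h, $_) for @file2;
for my $name (sort keys %$h) {
    print "$name $_\n" for combinations($h->{$name});
}
```

For a very large File2, the same `apply_record` call could sit inside a `while (my $line = <$fh>)` loop, so that only File1 is held in memory and each File2 line is read once and then discarded.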