maybeD's scratchpad

Public Scratchpad


Each input record pairs a numerical string such as GO:007983 with a text string such as 'transport'. Both the numerical strings and the text strings can occur more than once in the input array (so I can't just swop the keys and values round). The desired output is a listing of every duplicated numerical string, along with its associated text strings, which I will print to a file. I will probably also try to generate some simple statistics based on these, but that comes later.

 if ($overlap_cluster_terms_resplit_line =~ /^\s*(GO:\d+)\s\w+/)   # /g is not needed for a single match
        {
        push (@overlap_cluster_terms_nothashed_keysarray, $overlap_cluster_desc);
        $overlap_cluster_terms = $1;
        # Test for a duplicate BEFORE storing the term: testing after the
        # assignment would succeed for every term, duplicate or not.
        if (exists $overlap_cluster_terms_hash{$overlap_cluster_terms})
            {
            push (@overlap_debug, $overlap_cluster_desc);
            print OVERLAP_OUTPUT "$overlap_cluster_terms\t$overlap_cluster_desc\n\n";
            }
        else
            {
            # First time this term is seen: remember its description.
            $overlap_cluster_terms_hash{$overlap_cluster_terms} = $overlap_cluster_desc;
            }
        }

BTW: The $overlap_cluster_desc comes from another if statement within the foreach loop. Both not shown.
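
One possible way to get every description for a duplicated term, rather than only the later ones, is a hash of arrays: each GO term maps to the list of all descriptions seen with it, and terms whose list has more than one entry are the duplicates. The following is only a minimal sketch under assumptions: the sample lines, the names %descs_for and @report, and the idea that the description sits on the same line as the term are all made up for illustration (in the real script the description comes from a separate if statement).

```perl
use strict;
use warnings;

# Hypothetical sample input; the real lines come from the author's data.
my @lines = (
    'GO:0006810 transport',
    'GO:0008152 metabolism',
    'GO:0006810 ion transport',
    'GO:0006810 transport',
);

# Map each GO term to the list of every description seen with it.
# ASSUMPTION: the description follows the term on the same line.
my %descs_for;
for my $line (@lines) {
    next unless $line =~ /^\s*(GO:\d+)\s+(\w.*?)\s*$/;
    push @{ $descs_for{$1} }, $2;
}

# Build one tab-separated report line per term that occurs more than once.
my @report;
for my $term (sort keys %descs_for) {
    my @descs = @{ $descs_for{$term} };
    next unless @descs > 1;    # skip terms that were seen only once
    push @report, join("\t", $term, @descs);
}

print "$_\n" for @report;
```

Because the full list of descriptions is kept per term, simple statistics fall out of the same structure, e.g. scalar @{ $descs_for{$term} } gives the number of occurrences of a term.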