I'm trying to write a small program that will ultimately weed out BLAST hits that are too similar. The idea is to read in a tab-delimited line of data, split it into an array, and do the same for the next line. If the first column is the same for both lines (arrays), push the array onto another array, looping until we have an array of arrays that all share the same first-column value. Then I'd like to compare the values in the second column of the arrays in that array. If they are equal, we don't need to do anything; if they are different, the values in the second-to-last column must differ by at least 10. Here is the code so far (doesn't work yet):

use strict;
use warnings;

open(input0, "<e_d.txt");
open(output0, ">e_h.txt");

my $colNum=0;
my $limit=10;
my @arrayEquals;
my $line = <input0>;

WHILE: while($line ne undef) {
    s/\r?\n//;
    my @array = split /\t/, $line;
    my $followingLine = <input0>;
    my @followingLineArray=split /\t/, $followingLine;
    if( $array[$colNum] eq $followingLineArray[$colNum]){
        print "match\n";
        push (my @arrayEquals, @array);
    }
    else{
        push (@arrayEquals, @array);
        for my $i(0 .. $#arrayEquals){
            my $colNum=1
            if ($arrayEquals[$i][$colNum] eq $arrayEquals[$i+1][$colNum]){
                next;
            }else{
                my $colNo=-2;
                if (($arrayEquals[$i][$colNo] - $arrayEquals[$i+1][$colNo]) < $limit) {
                    #not enough difference so won't keep any of the lines in @arrayEquals
                    $line = $followingLine;
                    next WHILE;
                }
            }
        }
        print output0 @arrayEquals; #has difference, so keep the values
        $line = $followingLine;
    }
}

So for example, say I get to this part of my data (some column removed for simplicity):

KN-1791-LAST_rep_c7834 IMGA|Medtr4g125100.1 2e-139 497
KN-1791-LAST_rep_c7834 IMGA|Medtr4g125100.1 4e-46 187
KN-1791-LAST_rep_c7834 IMGA|Medtr4g125100.2 4e-46 187

I'd be trying to compare IMGA|Medtr4g125100.1 -> 2e-139 with IMGA|Medtr4g125100.2 -> 4e-46 (a false positive), and then IMGA|Medtr4g125100.1 -> 4e-46 with IMGA|Medtr4g125100.2 -> 4e-46 (not a difference of 10, so throw them all out).

So far, accessing the 2D arrays correctly is giving me trouble, but my bigger question, and the reason for all the explanation, is that I feel I'm not using everything Perl has to offer, because I just don't know enough. So I was also wondering whether there is a better way to do this. Thanks for any input.
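One possible shape for the grouping and comparison described above, offered as a rough sketch and not a definitive answer. It stores each row as an array reference (so that $rows->[$i][1] addresses a single cell), collects consecutive rows that share column 1, and keeps a group only if every pair of rows with different column-2 values is far enough apart in the second-to-last column. The names gap, keep_group and filter_hits are hypothetical, and the reading of "a difference of 10" as ten orders of magnitude between e-values is an assumption; if plain subtraction is meant, only gap needs to change.

```perl
use strict;
use warnings;

# ASSUMPTION: "a difference of 10" means the two e-values are at least
# ten orders of magnitude apart. Replace this sub if that is not the intent.
sub gap {
    my ($x, $y) = @_;
    # An e-value of 0 has no logarithm; 1e-300 is an arbitrary stand-in floor.
    my ($lx, $ly) = map { log($_ > 0 ? $_ : 1e-300) / log(10) } $x, $y;
    return abs($lx - $ly);
}

# A group is kept only if every pair of rows with different subjects
# (column 2) is at least $limit apart in the second-to-last column.
sub keep_group {
    my ($rows, $limit) = @_;
    for my $i (0 .. $#$rows - 1) {
        for my $j ($i + 1 .. $#$rows) {
            next if $rows->[$i][1] eq $rows->[$j][1];
            return 0 if gap($rows->[$i][-2], $rows->[$j][-2]) < $limit;
        }
    }
    return 1;
}

# Group consecutive lines by column 1; return the lines of the kept groups.
sub filter_hits {
    my ($lines, $limit) = @_;
    my (@kept, @group);
    my $flush = sub {
        push @kept, map { join("\t", @$_) } @group
            if @group && keep_group(\@group, $limit);
        @group = ();
    };
    for my $line (@$lines) {
        (my $clean = $line) =~ s/\r?\n\z//;     # strip the line ending
        my @fields = split /\t/, $clean;
        next unless @fields >= 3;               # skip blank or short lines
        $flush->() if @group && $group[0][0] ne $fields[0];
        push @group, \@fields;                  # push a reference, not a flat list
    }
    $flush->();                                 # do not forget the final group
    return @kept;
}
```

To try it on a file, something like open(my $in, '<', 'e_d.txt') or die "Cannot open: $!"; followed by print "$_\n" for filter_hits([<$in>], 10); should work. With the three example rows above, the second and third rows have different subjects but identical e-values, so under this reading the whole group is dropped.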


In reply to Accessing 2D array values and comparing by mSe
