A few picky comments:

unless (open(INPUT, "$inputfile")) { print "ERROR: Can't open file for reading : $!\n"; }

This would usually be written as:

open INPUT, '<', $inputfile or die "ERROR: Can't open file for reading : $!\n";
while(<INPUT>) {

You probably want to call chomp each time round the loop

my @data = split("\t", $_);

The first argument to split is a regular expression. Also, the second argument is optional and defaults to $_. So I'd write that as:

my @data = split /\t/;

Actually, if you omit the first argument then it defaults to /\s+/ which looks like it would work in this case.

my @data = split;

Then we have:

my $did1 = $data[0]; my $did2 = $data[1]; my $score1 = $data[2]; my $score2 = $data[3];

I'm can't really see the point of splitting that array into separate variables. If you want to make the code more readable, then I'd put the data into a hash.

my %rec; @rec{qw(did1 did2 score1 score2)} = @data;

In fact I'd do away with the need for @data completely and capture the output from split directly into the hash.

my %rec; @rec{qw(did1 did2 score1 score2)} = split;

The biggest problem I can see is with your use of DBI. You prepare your statements each time around the loop. Far more efficient to prepare them once and just execute them each time round the loop. You do this by putting placeholders in your SQL and passing the values to the call to execute.

# before the loop my $sth1 = $dbh->prepare("select d_id, c_id from d where d_id = ?"); my $sth2 = $dbh->prepare("select d_id, c_id from d where d_id = ?"); # Then within the loop $sth1->execute($rec{did1}); $sth2->execute($rec{did2});

The other thing I was thinking is that you could probably do a lot more of the work in the SQL and not have to do so much processing in your Perl code. But I'm afraid I don't have time to look at that in much detail right now.

--
<http://dave.org.uk>

"The first rule of Perl club is you do not talk about Perl club."
-- Chip Salzenberg


In reply to Re: Feedback Appreciated on text-parsing, SQL querying subroutine by davorg
in thread Feedback Appreciated on text-parsing, SQL querying subroutine by Angharad

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.