comment on

I have a few suggestions for improvement to your code, I'll outline the general ones then get more specific.

These are based on my personal opinion, please feel free to review my review.

IMHO, a general rule is that the data type you use should closely represent the way you want to use the data in your code. In your code, I would would "use vars" for variables that are global in nature, but that are still going to be modified by your code. Constants are things which should never change, and will stay the same through-out the execution of your program. I believe you should "use constant" for this type of data. For example, your defaults at the top of the script should be constants, not global variables..
The data source is the root of all data structures. The data structure you use has different names than the data source. Why bother to remember that "last" maps to "LastReputation" in the database? If you change the names in your data structure to match your database column names, your code becomes simpler.

Calling the same thing by two different names is sure to cause confusion later on.
I noticed you asked for comments regarding readability. Often when I am not sure, I refer to perlstyle for style guidelines. It's not perfect, but it seems to cover the general consensus of the perl community.
Also, another tip that I use when trying to name my object variables, I refer to the module's docs for their naming conventions. For example, say $sth to anyone who's worked with DBI and they'll instantly know what you are talking about. Say $r and anyone who's work with mod perl will tell you it's the Apache Request object. About the only place I break this is with CGI, I don't like using $q and prefer to use $cgi, I think it's more self describing.
It's usually safer to do checks on hashes using exists and/or defined, rather than just doing the equivalent of a if($hash{key}) {}. You'll eliminate some possibility of unitilized value warnings.
When defining a hash or array, and you mean for it to be empty, it's redundant to do this: my @array = ();, when you can just do: my @array;. This is more personal preference, but I believe it's more readable without this redundancy.
I notice that you tend to use if(! $test) {} instead of unless ($test) {}. I usually try to stay away from these types of constructs, in general. It can get you into some nasty logic mind-benders, where you have code that's doing the equivalent of an english "double negative".

All through grade school my English teachers always told me not to do this, but I never really quite understood what they meant, until I began coding..
Inside your code there is a temporary variable, that has the prefix $out, which is the output variable that gets emailed and/or printed to STDOUT. I think it might be a better idea to do your formatting all in a single place, rather than mixing your logic and presentation code. You could have a hash that represents state, and place your data into it, then right before you send the mail or print, you run it through a filter to format it. I've used HTML::Template in an almost indentical situation as this, and it works wonderfully. This won't work very well is if the amount of data is large, but I believe it should in your case.

This seperation of logic and presentation allows you to output the data in any format you like, even HTML. And the resulting code becomes much cleaner and easier to understand.
You should change the code so that it either uses a database or a flat file. Right now your code is always using a flat file, and sometimes using a MySQL database. Why not use one or the other? It would simplify your code somewhat, as you'd create a single $dbh database handle, and pass it around to your read_file(), write_file(), and db_update() routines. Your -d flag should allow people to decide between csv or MySQL, with the default being csv of course.

Keep in mind the above suggestions are snippets from my own personal style. While it's not the one true way, I've found it usually does not conflict with the general consensus on perlmonks and CPAN. As well, I've never been burned by the suggestions above or below - that usualy happens only when I ignore my own guidelines. =)

The place where you check $arg{d} you create a synthetic variable called @nodelist, then proceed to push the results of a loop onto it. You could have done a direct assignment, or even better not created the variable in the first place.

In the call to update_replist() I think it would be clearer if it returned a hash reference, equivalent to %dbreplist. IMHO, passing in a reference to something into a subroutine, only to have it be modified in the subroutine can have the same unintended side effects that people face when using global variables. In general, I think if you can help it, don't do it. In this case, there are a few places where it might be required, so place the hash reference as the final parameter to this subroutine, so it can be optionally passed in.

Here's the block using the suggestions above:
```
if(exists $arg{I}) {

  my $hreplist = initialize_rep_file($username, $password, $filename);

  db_update( update_replist('I', $hreplist, [keys %$hreplist]) )
    if exists $arg{d};

  exit;
}
[download]
```

read_file/write_file:

Both of these routines do things that are already in DBD::CSV. Granted, there is a little bit of increased over-head, but the code inside the module is documented and standardized.

Using DBD::CSV read_file becomes: (Note, the name should probably be changed to read_source or something similar)

sub read_file {
  @_ == 2 or croak 'Incorrect number of parameters';

  my $dbh    = shift;
  my $source = shift;

  my $statement = sprintf(q{
      SELECT NodeId
           , Title
           , Date
           , LastReputation
           , Reputation
        FROM %s
    },
    $source,
  );

  my $sth = $dbh->prepare($statement);
  $sth->execute;

  my %nodehash;

  while(my $row = $sth->fetchrow_hashref) {
    croak "Node ID $row->{NodeId} is duplicated!"
      if exists $nodehash{ $row->{NodeId} };

    $nodehash{ $row->{NodeId} } = $row;
  }

  return \%nodehash;
}
[download]

Now if it is changed to using DBD::CSV, it might even be possible to factor out write_file(), and use dbupdate() (below) for both functions, but I will leave that for jcwren to decide.

dbupdate:
- The SQL query here should be written to use placeholders and be prepared outside of the loop.
- You can simulate the the or croak behaviour using the RaiseError parameter in DBI.
Here's an example of the resulting code:
```
sub db_update {
  @_ == 2 or croak 'Incorrect number of parameters';

  my $dbh      = shift;
  my $hreplist = shift;

  my $statement = sprintf(q{
      INSERT
        INTO %s
             (%s)
      VALUES (%s)
    },
    DEF_DBTABLE,
    join(', ', @{ &DB_COLUMNS }),
    join(', ', ('?') x @{ &DB_COLUMNS }),
  );

  my $sth = $dbh->prepare($statement);

  foreach my $nodeid (sort keys %$hreplist) {
    $sth->execute(@{ $hreplist->{$nodeid} }{ @{&DB_COLUMNS} };
  }
}
[download]
```
In reply to (dkubb) Re: (2) An exercise in "That's not the way I'd do it" (TNTWIDI) by dkubb
in thread An exercise in "That's not the way I'd do it" (TNTWIDI) by jcwren

Title:
Use: <p> text here (a paragraph) </p>
and: <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

For: Use:

& &

< <

> >

[ [

] ]
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`