
I have an app that periodically parses web server logs. To avoid repeating work it has already done, it calls tell on the filehandle at the end of each run, then seeks back to that position on the next run (unless the inode has changed):

open my $fh, '<', $filename or die "Can't open $filename: $!\n";
_restore_offset($filename, $fh);
while (my $line = <$fh>) {
  # do stuff here
}
_record_offset($filename, $fh);
close $fh;

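sub _restore_offset {
  my ($filename, $fh) = @_;
  my ($offset, $last_inode);
  # Get $offset and $last_inode from database
  my $current_inode = (stat $fh)[1];
  # First run, or the log was rotated: read from the start
  return unless defined $last_inode && $current_inode == $last_inode;

  seek $fh, $offset, 0;
}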

sub _record_offset {
  my ($filename, $fh) = @_;

  my $offset = tell $fh;
  my $inode = (stat $fh)[1];
  # Stuff $offset and $inode back into database
}

This seems to work perfectly on my test system, where Apache is mostly idle.

On a more heavily trafficked server, however, the first line read in a new run can be incomplete: its beginning is missing, presumably because the seek landed in the middle of a line. (I would blame this on log rotation if I weren't already explicitly checking for an inode change to catch that.)

What's the best/most straightforward way to deal with this (without defeating its purpose by always reading the file from the beginning)?
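
One possible explanation, which nothing above confirms, is that a run reaches end-of-file while the server is still partway through writing a line. In that case tell would record an offset in the middle of that line, and the next run would start there. A minimal sketch of one way to guard against that, assuming this is the cause: only advance the saved offset past lines that end in a newline. The helper name read_complete_lines and its callback argument are hypothetical, not part of the code above.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: read newline-terminated lines from $fh starting at
# $offset, pass each one to $callback, and return the offset just past the
# last complete line. A trailing partial line is left for the next run.
sub read_complete_lines {
    my ($fh, $offset, $callback) = @_;

    seek $fh, $offset, SEEK_SET or die "Can't seek: $!\n";

    my $good_offset = $offset;
    while (my $line = <$fh>) {
        # No trailing newline: assume the writer has not finished this
        # line yet, so stop without processing it or moving the offset.
        last unless $line =~ /\n\z/;

        $callback->($line);
        $good_offset = tell $fh;
    }
    return $good_offset;
}
```

With something like this, _record_offset would store the returned value instead of calling tell after the loop, so the unfinished line is read again in full on the next run.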

In reply to Reading only new lines from a file by dsheroh
