
Hey guys,

I've been trying to figure out how select->can_read works, and if it's even what's causing my frustration.

The situation:

I currently have a cron job that runs nightly on a bunch of machines and sends the results via syslog to a central log server. On the log server, there's a script (script1.pl) that reads the syslog messages file and looks for certain keywords, then copies the messages out to different log files. For this example, I'll call one of those log files sub.log.

There's yet another script (script2.pl) that tails sub.log and all the other logs written by script1.pl, again looking for more specific keywords. If it finds a match, then it writes the message out to another log file (reduced.log).

The problem I'm seeing is that while everything gets successfully copied by script1.pl to sub.log, not everything gets copied to reduced.log - at least not right away.

The way script1.pl works is by doing:

open(TAIL, "tail -f /var/log/messages |") or die "etc";
while (<TAIL>)
{
  if(/type1/) { print SUBLOG1 $_; }
  if(/type2/) { print SUBLOG2 $_; }
  # etc
}

script2.pl, however, works by doing something like:

use IO::Select;

my $sel = IO::Select->new();
for my $obj_file (@obj_files)   # filenames pulled from the obj array
{
  open(my $handle, "tail -f $obj_file |") or die "Cannot tail $obj_file: $!";
  $sel->add($handle);
}

while(my @ready = $sel->can_read)
{
  foreach my $fh (@ready)
  {
    my $line = <$fh>;
    if($line =~ /keyword/)
    {
      print REDUCEDLOG $line;
    }
  }
}

What ends up happening: say 15 lines get written to sub.log (I can verify this by tailing it from the terminal) and 5 of those match the keyword I'm looking for. Only two of those lines will actually get written to reduced.log (the number "two" is made up; I haven't figured out a pattern yet). If I then run a test script to write 15 additional lines to sub.log, I get the remaining 3 lines from the previous run in reduced.log, plus *some* from the latest run. So reduced.log is never up to date with the data in sub.log, and if the script dies for whatever reason, I "lose" that data.

At first I thought the problem had to do with buffer flushing, so I added autoflush to pretty much all the filehandles, to no avail. It seems as if the select is getting all the data and I'm just not processing it correctly or something :/
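One possible explanation, offered as a sketch rather than a confirmed diagnosis: `<$fh>` is a buffered read, and perlfunc warns against mixing buffered I/O with select. A single readline can pull several lines from the pipe into Perl's internal buffer and hand back only the first, while can_read only sees what is still waiting in the pipe, so the leftover lines would sit unseen until new data arrives. The version below reads with sysread and splits lines by hand instead. The names `read_available_lines`, `watch_files`, and `%pending` are made up for illustration and are not part of script2.pl.

```perl
use strict;
use warnings;
use IO::Select;

# Per-handle leftover bytes that do not yet end in a newline.
my %pending;

# Do one unbuffered sysread on $fh and return an array ref holding every
# complete line now available. A trailing partial line stays in %pending.
# Returns undef on EOF or read error so the caller can drop the handle.
sub read_available_lines {
    my ($fh) = @_;
    my $n = sysread($fh, my $chunk, 65536);
    return undef unless $n;                 # 0 = EOF, undef = error
    $pending{$fh} .= $chunk;
    my @lines;
    while ($pending{$fh} =~ s/\A([^\n]*\n)//) {
        push @lines, $1;
    }
    return \@lines;
}

# Hypothetical driver: tail each file and copy matching lines to $out.
sub watch_files {
    my ($files, $pattern, $out) = @_;
    my $sel = IO::Select->new();
    for my $file (@$files) {
        open(my $fh, '-|', 'tail', '-f', $file)
            or die "Cannot tail $file: $!";
        $sel->add($fh);
    }
    while ($sel->count) {
        my @ready = $sel->can_read;
        for my $fh (@ready) {
            my $lines = read_available_lines($fh);
            if (!defined $lines) {
                # The tail process went away: stop watching this handle.
                $sel->remove($fh);
                delete $pending{$fh};
                close $fh;
                next;
            }
            # Process every line that arrived, not just the first one.
            print {$out} grep { /$pattern/ } @$lines;
        }
    }
}
```

Assuming REDUCEDLOG is already open for writing (and still has autoflush on), the loop could be started with something like `watch_files(\@obj_files, qr/keyword/, \*REDUCEDLOG);`.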

Thanks in advance!

Pedro

In reply to Stumped with select->can_read by w1r3d
