Matching two files real time

chkin has asked for the wisdom of the Perl Monks concerning the following question:

Hello Folks,

I am caught up in a tough situation with a time-based comparison of two files.

Eg:
File1:
117-0,1161310019795596,939,8,2000,5340,8888
119-0,1181310019795766,939,8,2000,5340,8888
120-0,1201310019795931,939,8,2000,5340,8888
File2:
222-0,1161310019795596,939,8,2000,5340,8888
333-0,1181310019795766,939,8,2000,5340,8888
120-0,1201310019795931,939,8,2000,5340,8888

The logic is to compare the two files, which are continuously updated. If a difference is found, store the record in a temp file and set a flag (add an extra column in the temp file, say "F"). Continuously scan the files and update the flag if a match is found (change the flag from "F" to "P"). However, if the record still fails to match after 10 minutes, generate an alert.

I would highly appreciate your help. Thanks in advance.

Re: Matching two files real time by cjb (Friar) on Jul 27, 2011 at 13:48 UTC
File::Tail should be a good place to start for monitoring the two continuously updated files, and consider using something like DBD::SQLite for the temp file. Add every row that doesn't match to the DB with a timestamp, then delete or flag the row in the DB if a match turns up later. It's a fairly simple query against the DB (perhaps from a separate alerting app) to report on any rows whose timestamp is older than now minus 10 minutes.
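A minimal sketch of the bookkeeping described in this reply, under several assumptions: a plain in-memory hash stands in for the SQLite table, two records "match" when the whole line is identical in both files, and the names `note_line`, `overdue`, `%pending` and `$LIMIT` are hypothetical. Tailing the files (for example with File::Tail) is left out; the caller is assumed to pass in each new line together with the number of the file it came from.

```perl
use strict;
use warnings;

# Assumption: the 10-minute limit from the question, in seconds.
my $LIMIT = 10 * 60;

# %pending maps an unmatched line to { file => 1 or 2, seen => epoch seconds }.
# This hash is only a stand-in for the SQLite table suggested above.
my %pending;

# Record a new line from file 1 or file 2 (hypothetical helper).
# Returns 'P' if the line pairs up with a pending line from the other file,
# or 'F' if it is waiting for its counterpart.
sub note_line {
    my ($file, $line, $now) = @_;
    $now = time unless defined $now;

    my $entry = $pending{$line};
    if ($entry && $entry->{file} != $file) {
        delete $pending{$line};    # matched: drop the pending row
        return 'P';
    }

    # First sighting keeps its original timestamp if the line repeats.
    $pending{$line} ||= { file => $file, seen => $now };
    return 'F';
}

# Return the lines that have been waiting longer than $LIMIT seconds.
sub overdue {
    my ($now) = @_;
    $now = time unless defined $now;
    my @late = grep { $now - $pending{$_}{seen} > $LIMIT } sort keys %pending;
    return @late;
}
```

In use, `note_line` would be called for every line read from either file, and `overdue()` would be polled periodically (say once a minute) to decide which records need an alert. Moving `%pending` into a database, as suggested, would let a separate alerting process run the same check.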
Re: Matching two files real time by Anonymous Monk on Jul 27, 2011 at 13:47 UTC
For a starting point, see Re: search text file, search text file
Re: Matching two files real time by zentara (Cardinal) on Jul 27, 2011 at 17:17 UTC
My first thought would be to take md5sums of the files and see whether they match. That is a fast operation and won't tax the system. It does assume the lines are in the same position in both files, so you may need to sort them before taking the md5sum. If the sums don't match, open the files and take a diff. See Seeking guidance on how to approach a task and Algorithm::Diff. Here is a simple method using the hash count method described there:

```perl
#!/usr/bin/perl
use strict;
use warnings;

open(my $in1, '<', 'file1.txt') or die "Unable to open file1.txt for reading: $!";
open(my $in2, '<', 'file2.txt') or die "Unable to open file2.txt for reading: $!";

# Count how many times each line occurs across both files.
my %lines;
while (<$in1>) { chomp; $lines{$_}++ }
while (<$in2>) { chomp; $lines{$_}++ }
close $in1;
close $in2;

open(my $out, '>', 'file3.txt') or die "Unable to open file3.txt for writing: $!";

# Lines seen only once have no counterpart; write those to file3.txt.
# Note: a line repeated within a single file is also counted as matched.
for (keys %lines) {
    next if $lines{$_} > 1;
    print {$out} "$_\n";
}
close $out or die "Unable to close file3.txt: $!";
```
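The "sort first, then md5sum" idea could be sketched as follows. This is only an illustration: `sorted_digest` is a hypothetical helper name, and it assumes the files are small enough to read into memory. Digest::MD5 ships with Perl.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: MD5 digest of a file's lines after sorting them,
# so two files holding the same lines in a different order compare equal.
sub sorted_digest {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Unable to open $path for reading: $!";
    chomp(my @lines = <$fh>);    # strip newlines so a missing final one is harmless
    close $fh;
    my @sorted = sort @lines;
    return md5_hex(join "\n", @sorted);
}
```

Comparing `sorted_digest('file1.txt')` with `sorted_digest('file2.txt')` then tells you cheaply whether a line-by-line diff is needed at all. It does not say which lines differ.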
Re: Matching two files real time by ww (Archbishop) on Jul 27, 2011 at 16:05 UTC
See On asking for help and How do I post a question effectively?
Re: Matching two files real time by jpl (Monk) on Jul 27, 2011 at 17:42 UTC
There's a CPAN module that might be of value: CHI-0.49. I haven't used it myself, but I was directed to it by the maintainer of a different caching module (Tie-Cache-LRU-20110205). The fact that it supports time-based expiration seems to match nicely with your problem.