in reply to Best way to compare my data?

Maybe you want to split the file into several files, then compare those files? Along the way, you might find a way to avoid the "split into several files" step and compare the sections within the same file. I'm not sure where you are having problems, so if you show some code, we can help you further.

One possible approach would be to use "paragraph mode" (setting $/ to the empty string) to read in the sections of your file, then split those sections into lines, then find the differences between the sections. Maybe you could use Algorithm::Diff to give you a human-readable overview of what changed.
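To make the "paragraph mode" idea concrete, here is a minimal sketch that reads blank-line-separated sections and splits each one into lines. The sample data and the helper name read_sections are made up for illustration, and an in-memory filehandle stands in for the real log file, whose exact format I am only guessing at.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: two sections separated by a blank line.
my $data = <<'END';
hop1.example.net
hop2.example.net
hop3.example.net

hop1.example.net
hop2b.example.net
hop3.example.net
END

# read_sections is a hypothetical helper. With $/ set to the empty
# string, each read returns one whole block of text, up to the next
# run of blank lines ("paragraph mode").
sub read_sections {
    my ($fh) = @_;
    local $/ = '';
    my @sections;
    while ( my $block = <$fh> ) {
        chomp $block;    # in paragraph mode, chomp strips all trailing newlines
        push @sections, [ split /\n/, $block ];    # one array ref of lines per section
    }
    return @sections;
}

# Open the string as a file; a real script would open the log file instead.
open my $fh, '<', \$data or die "Can't open in-memory data: $!";
my @sections = read_sections($fh);
close $fh;

printf "Read %d sections\n", scalar @sections;
for my $i ( 0 .. $#sections ) {
    printf "Section %d has %d lines\n", $i + 1, scalar @{ $sections[$i] };
}
```

Each element of @sections is an array reference holding the lines of one section, so any two sections can be compared directly in memory without writing them out to separate files.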

Re^2: Best way to compare my data?
by Lavezzi (Initiate) on Mar 28, 2010 at 15:37 UTC
    The only reason I would prefer not to split it up into different files is that there would then be 196 files for each of the 6 different logs that I have.

    I have uploaded my code to my scratchpad, but obviously it doesn't work, that's why I'm here. I'm a complete beginner with Perl (and coding in general, I only started in the last week), so please don't laugh at my effort, haha. Also I hope the formatting is OK since I'm new to that too!

      Why don't you post your code here instead?

      Instead of manually splitting up your source file, you could write a program to split up your source file and then compare the split up parts. You could also skip the part where you write out your input file into separate files and compare the split up parts in memory instead of writing them out to files and reading them back in.

      So far, I get the impression that you have not really put much thought into possible approaches. Maybe you shouldn't attack the problem as a whole, but instead simplify the problem first:

      • If you have two files, how do you determine their differences?
      • If you have one file with two sets of routes, how can you read that file into two memory structures?
      • If you have one file with more than two sets of routes, how will you determine the differences?
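      As one possible way to work through those three questions, here is a rough sketch that compares each set of routes with the set before it. The sample addresses and the helper name not_in_previous are invented for illustration, and the hash lookup ignores line order and duplicates; Algorithm::Diff, mentioned earlier in the thread, would be the tool to reach for if order matters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# not_in_previous is a hypothetical helper: it returns the lines in
# @$current that do not occur anywhere in @$previous.
sub not_in_previous {
    my ( $previous, $current ) = @_;
    my %seen = map { $_ => 1 } @$previous;
    return grep { !$seen{$_} } @$current;
}

# Hypothetical data: three sets of routes already held in memory,
# one array reference per set.
my @routes = (
    [qw(10.0.0.1 10.0.1.1 10.0.2.1)],
    [qw(10.0.0.1 10.0.9.9 10.0.2.1)],
    [qw(10.0.0.1 10.0.9.9 10.0.2.1)],
);

# With more than two sets, compare each set with the one before it.
my @report;
for my $i ( 1 .. $#routes ) {
    my @new  = not_in_previous( $routes[ $i - 1 ], $routes[$i] );
    my @gone = not_in_previous( $routes[$i], $routes[ $i - 1 ] );
    push @report, { set => $i + 1, new => \@new, gone => \@gone };
    printf "Set %d vs set %d: new [%s], gone [%s]\n",
        $i + 1, $i, join( ', ', @new ), join( ', ', @gone );
}
```

      Solving the two-set case first keeps the loop trivial: handling more than two sets is just the same comparison repeated for each neighbouring pair.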

      "I have uploaded my code to my scratchpad, but obviously it doesn't work, that's why I'm here."

      You mean this code?:

      #!/usr/bin/perl -w
      use strict;

      my $infile = 'JPStream.csv';
      my $outfile = 'new1.csv';

      open IN, "< $infile" or die "Can't open $infile : $!";
      open OUT, "> $outfile" or die "Can't open $outfile : $!";

      my %seen;
      my %seen2;

      while (<IN>) {
          next if /^$/;
          chomp;
          if ( ! $seen2{$_} ) {
              print OUT "$_ Not in the last Traceroute ^\n";
          }
          last if /^$/;
          $seen{$_}++;
          %seen2 = ();
      }

      while (<IN>) {
          next if /^$/;
          chomp;
          if ( ! $seen{$_} ) {
              print OUT "$_ Not in the last Traceroute^^\n";
          }
          last if /^$/;
          $seen2{$_}++;
          %seen = ();
      }
      }

      Why not post your code in the same node as your question? It certainly isn't due to length.

      Please make it easier for us to help you help yourself.

      HTH,

      planetscape