Re^2: Need to have perl script to compare two txt files and print difference along with under which segment the difference is

Re^3: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by johngg (Canon) on Jan 24, 2019 at 11:43 UTC
Without sight of the files you are comparing, it is difficult to provide a solution. You mention segment names, so is it safe to assume that each file contains the same segments but that the contents of each segment may differ between files? If so, a better approach would be to break each file into segments and compare those, e.g. file `test1` segment `EFGH` compared to file `test2` segment `EFGH`, rather than comparing the whole files. That way you can keep track of which segments differ. I hope my guess is close and this is helpful. Please post small example data files so that we can give better advice. Cheers, JohnGG
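As a rough illustration of the segment-by-segment idea, here is a minimal sketch. It assumes a layout the thread has not confirmed at this point: a segment name starts in column one and its records are indented beneath it. The helper names `parse_segments` and `only_in`, the segment `ABCD`, and the record values are made up for the example; only `test1`, `test2` and `EFGH` come from the post above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: a segment name starts in column one and its records are
# indented beneath it. The real file layout may differ.
sub parse_segments {
    my ($text) = @_;
    my (%segments, $current);
    for my $line (split /\n/, $text) {
        $line =~ s/\s+$//;                 # trim trailing whitespace
        next unless length $line;          # skip blank lines
        if ($line =~ s/^\s+//) {           # indented line: a record
            next unless defined $current;  # record before any segment: ignore
            $segments{$current}{$line} = 1;
        }
        else {                             # unindented line: a segment name
            $current = $line;
            $segments{$current} ||= {};
        }
    }
    return \%segments;
}

# Records present in segment hash $x but absent from segment hash $y.
sub only_in {
    my ($x, $y) = @_;
    my @missing = grep { !exists $y->{$_} } keys %$x;
    return sort @missing;
}

# Hypothetical in-memory stand-ins for the two files.
my $test1 = parse_segments("ABCD\n  one\nEFGH\n  two\n  three\n");
my $test2 = parse_segments("ABCD\n  one\nEFGH\n  two\n  four\n");

for my $name (sort keys %$test1) {
    my @left  = only_in($test1->{$name}, $test2->{$name} || {});
    my @right = only_in($test2->{$name} || {}, $test1->{$name});
    next unless @left || @right;
    print "$name: only in test1: @left; only in test2: @right\n";
}
```

Storing each segment's records as hash keys makes the comparison independent of record order within a segment; if order matters, an array per segment would be the better choice.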
Re^4: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by User_04271983 (Initiate) on Jan 24, 2019 at 12:08 UTC
Thanks for the reply. You guessed right: both files have the same segments, but the contents of each segment may differ between files. I am posting sample data for reference. Contents of tst1: `lodv OIDSCRIPT LCLABCG NMESCRIPT FRSJFGHT IT RCNGHSGD CURINR CRDUSWD OPWO7GNOxuXVog ODXCP ODXHC ODXIT APSN EJHFG sdmd DUUPPY MDPSJN PCINKSJ FXDEMAIJSKL1 FXCYYEJ EMCYOAK DLMDWJF IRRNKAJ` Contents of tst2: `lodv OIDSCRIPT LCLABCG NMESCRIPT FRSJFGHT IT RCNGHSGD CURINR CRDUSWD OPWO7GNOxuXVog ODXCP ODXHC ODXIT APSN sdmd DUUPPY MDPSJN PCINKSJ FXDEMAIJSKL1 FXCYYEJ EMCYOAK DLMDWJF IRRNKAJ IJFH LAKJSK` In tst1, segment "lodv" has an extra record on its last line, "EJHFG". In the same way, tst2 has the extra records "IJFH" and "LAKJSK" under segment "sdmd". Can the script report the differing records along with their segments? I hope this makes my question clearer. Thanks in advance :)
Re^5: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by poj (Abbot) on Jan 24, 2019 at 13:51 UTC
Try

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @file = ('tst1.txt','tst2.txt');
my %compare = ();

# inputs
for my $n (0..$#file){
  parse_file($n);
}

# output diff
for my $segment (sort keys %compare){
  for my $row (sort keys %{$compare{$segment}}){
    my $rec = $compare{$segment}{$row};
    if (defined $rec->[0] && defined $rec->[1]){
      # matched
    } else {
      printf "%s %s\n",$segment,$row;
    }
  }
}

sub parse_file {
  my ($n) = @_;
  my $filename = $file[$n];
  my $segment;
  open IN,'<', $filename or die "Could not open $filename : $!";
  while (<IN>){
    s/\s+$//; # trim trailing whitespace
    if (s/^\s+//){
      ++$compare{$segment}{$_}[$n];
    } else {
      $segment = $_;
    }
  }
  close IN;
}
```

poj
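The script above prints the segment and the unmatched record, but not which file the record came from. The sketch below is one possible extension, not part of poj's post: it walks the same `$compare{segment}{record}[n]` structure and names the file that holds each unmatched record. The `report` helper is hypothetical, and the `%compare` hash is filled in by hand with a few records from the sample data rather than read from the files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @file = ('tst1.txt', 'tst2.txt');

# Hand-built stand-in for the %compare hash that the script fills in:
# $compare{segment}{record}[n] is set when file n contains the record.
# Only a few records from the sample data are included here.
my %compare = (
    lodv => { OIDSCRIPT => [1, 1], EJHFG => [1] },
    sdmd => { DUUPPY => [1, 1], IJFH => [undef, 1], LAKJSK => [undef, 1] },
);

# Return one line per record that is not present in every file,
# naming the file(s) where it was found.
sub report {
    my ($compare, $files) = @_;
    my @lines;
    for my $segment (sort keys %$compare) {
        for my $row (sort keys %{ $compare->{$segment} }) {
            my $rec   = $compare->{$segment}{$row};
            my @found = grep { defined $rec->[$_] } 0 .. $#$files;
            next if @found == @$files;    # present in every file
            my $where = join ', ', map { $files->[$_] } @found;
            push @lines, "$segment $row only in $where";
        }
    }
    return @lines;
}

print "$_\n" for report(\%compare, \@file);
```

With the hand-built data this reports `EJHFG` under `lodv` as found only in `tst1.txt`, and `IJFH` and `LAKJSK` under `sdmd` as found only in `tst2.txt`, which matches the differences described in the question.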
Re^6: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by User_04271983 (Initiate) on Jan 24, 2019 at 14:00 UTC
Re^6: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by User_04271983 (Initiate) on Jan 24, 2019 at 14:32 UTC
Re^7: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by poj (Abbot) on Jan 24, 2019 at 15:33 UTC
Re^4: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by User_04271983 (Initiate) on Jan 24, 2019 at 13:27 UTC
Can someone please help?
Re^5: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by BillKSmith (Monsignor) on Jan 24, 2019 at 14:13 UTC
Please tell us more about your input files. Do they both have EXACTLY the same segments? What do we do if not? Are the segments always in the same order in both files? Do you know the segment names (or even the number of segments) in advance? How many differences do you expect to find in a pair of very large files? Your sample data has very short records. Is this typical? What is the typical (and max) number of records in one segment? How can we always tell a segment name from a data record? Is there anything else you know which might help us? Bill
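Two of these questions (same segments? same order?) can be checked mechanically once the segment names have been pulled out of each file. The following is only a sketch under that assumption: the `segment_check` helper is hypothetical, and the name lists are invented for illustration apart from `lodv` and `sdmd`, which appear in the sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compare two lists of segment names (each in file order).
# Returns: names only in the first list, names only in the second,
# and a flag saying whether the shared names appear in the same order.
sub segment_check {
    my ($x, $y) = @_;
    my %in_x = map { $_ => 1 } @$x;
    my %in_y = map { $_ => 1 } @$y;
    my @only_x   = grep { !$in_y{$_} } @$x;
    my @only_y   = grep { !$in_x{$_} } @$y;
    my @common_x = grep { $in_y{$_} } @$x;
    my @common_y = grep { $in_x{$_} } @$y;
    my $same_order = join("\0", @common_x) eq join("\0", @common_y) ? 1 : 0;
    return (\@only_x, \@only_y, $same_order);
}

# Hypothetical segment lists; "xtra" is an invented name.
my @seg1 = qw(lodv sdmd);
my @seg2 = qw(sdmd lodv xtra);

my ($only1, $only2, $same_order) = segment_check(\@seg1, \@seg2);
print "only in first:  @$only1\n";
print "only in second: @$only2\n";
print "shared segments in same order: ", ($same_order ? "yes" : "no"), "\n";
```

Running a check like this first would show whether a record-level comparison can rely on both files having the same segments, or whether missing segments need separate handling.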
Re^6: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by User_04271983 (Initiate) on Jan 24, 2019 at 14:22 UTC
Re^5: Need to have perl script to compare two txt files and print difference along with under which segment the difference is by haukex (Archbishop) on Jan 24, 2019 at 13:43 UTC
Hello and welcome to the Monastery, User_04271983. Please be patient - many of us have jobs and are spread out across different time zones. At first glance it looks like you've provided a good amount of information, and I'm sure someone will get around to looking at it all.