What do you mean with "tabulate the difference"? What kind of output do you expect?
Also, if you sort the csv files first, there is no nead to "check one record in cvs1 with all records of csv2"
Without further information, I'd recommend:
> sort csv1 > csv1.sorted
> sort csv2 > csv2.sorted
> diff csv1.sorted csv2.sorted
updated: s/cvs/csv/g
| [reply] [d/l] |
Also, a somwhat less known tool, but a very useful one in many circumstances is comm...
| [reply] [d/l] |
dreamer,
Welcome to the Monastery! This community is not in the habit of emailing people. We all benefit from questions and answers in an open forum. You should take a look at CSV table diff utility. It likely won't be a perfect solution but it should be a great start.
| [reply] |
Which difficulties are you having? What have you tried so far? To extract the data I recommend Text::CSV_XS.
In case of further queries about the problem statement feel free to mail me at karandeep_vohra@yahoo.com .
Hint: this is not a personal helpdesk for you. You should read here for further queries and post here your replies to them for everybody to be able to read them so as to help us to help you. | [reply] |
Depending on your requirements (is this a one-time thing, size of data, etc, and the 'difference' you want), you could use DBD::CSV (or in-memory tables with DBD::AnyData) and use SQL to JOIN the two tables looking for your desired results.. Something like (note this example is only cases where the id exists in both files--excercise for reader to use LEFT JOIN for the other cases):
my $sql = <<EOF;
SELECT
t1.id,
SUM(t2.hrs_expected) - SUM(t1.hrs_expected) as hrs_expected_diff
FROM file1 as t1
JOIN file2 as t2 ON t2.id = t1.id
GROUP BY t1.id
HAVING SUM(t1.hrs_expected) != SUM(t2.hrs_expected)
EOF
| [reply] [d/l] |