searching two csv files

dreamer has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: searching two csv files by Joost (Canon) on Aug 04, 2005 at 09:38 UTC
What do you mean with "tabulate the difference"? What kind of output do you expect? Also, if you sort the csv files first, there is no nead to "check one record in cvs1 with all records of csv2" Without further information, I'd recommend: `> sort csv1 > csv1.sorted > sort csv2 > csv2.sorted > diff csv1.sorted csv2.sorted` [download] updated: s/cvs/csv/g "What should it profit a man, if he should win a flame war, yet lose his cool?"	[reply] [d/l]
Re^2: searching two csv files by blazar (Canon) on Aug 04, 2005 at 09:42 UTC
Also, a somwhat less known tool, but a very useful one in many circumstances is `comm`...	[reply] [d/l]
Re^3: searching two csv files by Joost (Canon) on Aug 04, 2005 at 09:44 UTC
perfect. I never knew that was there :-) "What should it profit a man, if he should win a flame war, yet lose his cool?"	[reply]
Re: searching two csv files by Limbic~Region (Chancellor) on Aug 04, 2005 at 12:32 UTC
dreamer, Welcome to the Monastery! This community is not in the habit of emailing people. We all benefit from questions and answers in an open forum. You should take a look at CSV table diff utility. It likely won't be a perfect solution but it should be a great start. Cheers - L~R	[reply]
Re: searching two csv files by blazar (Canon) on Aug 04, 2005 at 09:39 UTC
Which difficulties are you having? What have you tried so far? To extract the data I recommend Text::CSV_XS. In case of further queries about the problem statement feel free to mail me at karandeep_vohra@yahoo.com . Hint: this is not a personal helpdesk for you. You should read here for further queries and post here your replies to them for everybody to be able to read them so as to help us to help you.	[reply]
Re: searching two csv files by davidrw (Prior) on Aug 04, 2005 at 13:42 UTC
Depending on your requirements (is this a one-time thing, size of data, etc, and the 'difference' you want), you could use DBD::CSV (or in-memory tables with DBD::AnyData) and use SQL to JOIN the two tables looking for your desired results.. Something like (note this example is only cases where the id exists in both files--excercise for reader to use LEFT JOIN for the other cases): `my $sql = <<EOF; SELECT t1.id, SUM(t2.hrs_expected) - SUM(t1.hrs_expected) as hrs_expected_diff FROM file1 as t1 JOIN file2 as t2 ON t2.id = t1.id GROUP BY t1.id HAVING SUM(t1.hrs_expected) != SUM(t2.hrs_expected) EOF` [download]	[reply] [d/l]