dreamer has asked for the wisdom of the Perl Monks concerning the following question:

Thanks for the overwhelming response. I am currently looking into the approaches all of you suggested. As for the output, I was looking for a file that gives me the following: the records that aren't an exact match, and where they differ. Thanks a ton, Karandeep

Replies are listed 'Best First'.
Re: searching two csv files
by Joost (Canon) on Aug 04, 2005 at 09:38 UTC
      Also, a somewhat lesser-known tool, but a very useful one in many circumstances, is comm...
Re: searching two csv files
by Limbic~Region (Chancellor) on Aug 04, 2005 at 12:32 UTC
    dreamer,
    Welcome to the Monastery! This community is not in the habit of emailing people. We all benefit from questions and answers in an open forum. You should take a look at CSV table diff utility. It likely won't be a perfect solution but it should be a great start.

    Cheers - L~R

Re: searching two csv files
by blazar (Canon) on Aug 04, 2005 at 09:39 UTC
    Which difficulties are you having? What have you tried so far? To extract the data I recommend Text::CSV_XS.
    In case of further queries about the problem statement feel free to mail me at karandeep_vohra@yahoo.com .
    Hint: this is not a personal helpdesk for you. You should read further queries here and post your replies to them here, so that everybody can read them and we can help you.
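For the kind of report the original question describes (which records are not an exact match, and where), a minimal sketch with Text::CSV_XS could look like the following. It assumes the first column is a unique key and that no field contains an embedded newline; neither is stated in the thread. The sub names load_records and diff_records and the sample data are made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Parse CSV text into a hash of key => [fields].
# Assumption: the first column is a unique key for the record.
sub load_records {
    my ($text) = @_;
    my $csv = Text::CSV_XS->new({ binary => 1 });
    my %rec;
    for my $line (split /\n/, $text) {
        next unless length $line;
        $csv->parse($line) or die "Cannot parse line: $line\n";
        my @fields = $csv->fields;
        $rec{ $fields[0] } = \@fields;
    }
    return \%rec;
}

# Return one message per mismatch, naming the record and the column index.
sub diff_records {
    my ($left, $right) = @_;
    my @report;
    for my $id (sort keys %$left) {
        my $l = $left->{$id};
        my $r = $right->{$id};
        if (!$r) {
            push @report, "$id: missing from second file";
            next;
        }
        my $last = $#$l > $#$r ? $#$l : $#$r;
        for my $i (0 .. $last) {
            my $lv = $l->[$i] // '';
            my $rv = $r->[$i] // '';
            push @report, "$id: column $i differs ('$lv' vs '$rv')"
                if $lv ne $rv;
        }
    }
    for my $id (sort keys %$right) {
        push @report, "$id: missing from first file"
            unless exists $left->{$id};
    }
    return @report;
}

# Hypothetical sample data standing in for the two files.
my $file1 = "id,name,hrs_expected\n1,alice,8\n2,bob,6\n3,carol,4\n";
my $file2 = "id,name,hrs_expected\n1,alice,8\n2,bob,7\n4,dave,5\n";

print "$_\n" for diff_records(load_records($file1), load_records($file2));
```

For real files, reading with the module's getline method on an open filehandle is the more robust choice, since it copes with quoted fields that span lines.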
Re: searching two csv files
by davidrw (Prior) on Aug 04, 2005 at 13:42 UTC
    Depending on your requirements (is this a one-time thing, the size of the data, etc., and the 'difference' you want), you could use DBD::CSV (or in-memory tables with DBD::AnyData) and use SQL to JOIN the two tables looking for your desired results. Something like this (note that this example covers only cases where the id exists in both files; using LEFT JOIN for the other cases is left as an exercise for the reader):
    my $sql = <<EOF;
    SELECT t1.id, SUM(t2.hrs_expected) - SUM(t1.hrs_expected) as hrs_expected_diff
    FROM file1 as t1
    JOIN file2 as t2 ON t2.id = t1.id
    GROUP BY t1.id
    HAVING SUM(t1.hrs_expected) != SUM(t2.hrs_expected)
    EOF
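To make the intent of that query concrete, here is a rough plain-Perl sketch of the same comparison without any database driver. It is not the DBD::CSV approach itself: it sums each file separately and then compares, which should agree with the query when each id appears at most once per file (with repeated ids, a SQL join multiplies rows, so the sums can differ). The row layout [id, hrs_expected], the sub names, and the sample data are assumptions for illustration.

```perl
use strict;
use warnings;

# Sum hrs_expected per id.
# Assumption: each row is an array ref of [id, hrs_expected].
sub sum_by_id {
    my (@rows) = @_;
    my %sum;
    $sum{ $_->[0] } += $_->[1] for @rows;
    return \%sum;
}

# Keep only ids present in both inputs (the inner JOIN) whose
# sums differ (the HAVING clause); the value is t2 minus t1.
sub hrs_expected_diff {
    my ($t1, $t2) = @_;
    my %diff;
    for my $id (keys %$t1) {
        next unless exists $t2->{$id};
        my $d = $t2->{$id} - $t1->{$id};
        $diff{$id} = $d if $d != 0;
    }
    return \%diff;
}

# Hypothetical rows standing in for file1 and file2.
my @file1 = ([1, 8], [2, 6], [3, 4]);
my @file2 = ([1, 8], [2, 7], [4, 5]);

my $diff = hrs_expected_diff(sum_by_id(@file1), sum_by_id(@file2));
printf "%s\t%s\n", $_, $diff->{$_} for sort keys %$diff;
```

Ids that exist in only one input (3 and 4 in the sample) are skipped here, just as the inner JOIN skips them; reporting those is the LEFT JOIN case mentioned above.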