thickice97 has asked for the wisdom of the Perl Monks concerning the following question:

I have a question about Perl.

I have a file whose format is:

David|Masters|3 2000 10 500
Harry|Undergrad1|4 4000 23 1000
I need the first column value (e.g. David). The third column holds four values, and I need only the first and third of them (e.g. 3 and 10). I need to match these three values against the first, second, and fourth columns of the second file, and pull out the rows that do not match.

Second file format:

David |3 |2000 |10 |500 |Masters |History |English |
harry |2 |4000 |12 |0 |Undergrad |Sciences |Math |
harry |1 |4000 |23 |1000 |Undergrad |Math |History |
Please advise.

Note that the column values may contain spaces.

Replies are listed 'Best First'.
Re: Matching two files
by dragonchild (Archbishop) on Nov 08, 2007 at 19:55 UTC
    What have you tried to solve the problem and where did you go wrong? Can you even read the files? What data structures would you expect to use? How would you get the data from the file to the data structure?

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      I thought of cutting out the necessary columns and using diff, but because of the first file's format I would need additional processing for the fields, and I also need to handle the spaces in the values. Would it be a good idea to store each file in an array and then compare them? And how do I handle the spaces and a string of several values stored in one column?
        Then why don't you show us how you're reading the first file, how you're parsing the fields, how you're storing the information, etc.?

        If you show us what you've tried, we can make suggestions on how to complete your homework assignment.
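
For the question about spaces and the multi-value column, here is a minimal sketch of one way to parse a single record. The helper name parse_fields and the "Mary Ann" row are made up for illustration and are not part of the original data. It assumes that whitespace around a pipe is padding, while whitespace inside a value is significant.

```perl
use strict;
use warnings;

# Hypothetical helper: split one pipe-delimited record into trimmed fields.
# Spaces around the pipes are treated as padding; spaces inside a value are kept.
sub parse_fields {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^\s+//;          # drop leading whitespace
    $line =~ s/\s*\|?\s*$//;    # drop one optional trailing pipe and padding
    return split /\s*\|\s*/, $line, -1;
}

# First file: the third column holds several space-separated values.
my @f1   = parse_fields("David|Masters|3 2000 10 500\n");
my @col3 = split ' ', $f1[2];
print "$f1[0]: $col3[0], $col3[2]\n";    # David: 3, 10

# Second file: a value with an embedded space (invented row) survives intact.
my @f2 = parse_fields("Mary Ann |3 |2000 |10 |500 |Masters |History |English |\n");
print join(',', @f2[0, 1, 3]), "\n";     # Mary Ann,3,10
```

Splitting on /\s*\|\s*/ strips the padding next to each delimiter in one step, and split ' ' breaks the third column on runs of whitespace.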

Re: Matching two files
by gamache (Friar) on Nov 08, 2007 at 20:12 UTC
    I would do it something like this:
    my %f1;
    while (<FILE1>) {
        chomp;
        my @fields = split /\|/;
        my @col3 = split /\s+/, $fields[2];
        $f1{lc $fields[0]} = [$col3[0], $col3[2]];
    }
    while (<FILE2>) {
        chomp;
        my @fields = split /\s*\|\s*/;
        my $name = lc $fields[0];
        print "line $.: Name $fields[0] ";
        if (exists $f1{$name}) {
            if ($fields[1] == $f1{$name}[0] &&
                $fields[3] == $f1{$name}[1]   ) {
                print "matched.\n";
            }
            else {
                print "found but not matched.\n";
            }
        }
        else {
            print "not found.\n";
        }
    }
    Which, in your case, would give output of:
    line 1: Name David matched.
    line 2: Name harry found but not matched.
    line 3: Name harry found but not matched.
    But I don't really know what you are trying to accomplish, or whether you want case insensitivity (remove the lc's if you don't).
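
If the goal is to pull out the mismatched rows rather than report on every line, the same lookup can collect them instead. The following is only a sketch under some assumptions: the sample data sits in strings so that it runs as-is (real code would open the two files), the helper name mismatched_rows is made up, names are compared case-insensitively, the compared values are numeric, and a name missing from the first file counts as a mismatch.

```perl
use strict;
use warnings;

# Sample data copied from the question; real code would open the two files.
my $file1 = <<'END';
David|Masters|3 2000 10 500
Harry|Undergrad1|4 4000 23 1000
END

my $file2 = <<'END';
David |3 |2000 |10 |500 |Masters |History |English |
harry |2 |4000 |12 |0 |Undergrad |Sciences |Math |
harry |1 |4000 |23 |1000 |Undergrad |Math |History |
END

# Hypothetical helper: return the rows of the second data set whose columns
# 2 and 4 do not equal the 1st and 3rd values of column 3 in the first.
sub mismatched_rows {
    my ($text1, $text2) = @_;

    # Build a lookup keyed on the lowercased name from the first data set.
    my %wanted;
    open my $fh1, '<', \$text1 or die "Cannot read first data set: $!";
    while (my $line = <$fh1>) {
        chomp $line;
        my @fields = split /\s*\|\s*/, $line;
        next unless @fields >= 3;
        my @col3 = split ' ', $fields[2];
        $wanted{lc $fields[0]} = [ @col3[0, 2] ];
    }
    close $fh1;

    # Keep every row of the second data set that is unknown or differs.
    my @mismatches;
    open my $fh2, '<', \$text2 or die "Cannot read second data set: $!";
    while (my $line = <$fh2>) {
        chomp $line;
        my @fields = split /\s*\|\s*/, $line;
        next unless @fields >= 4;
        my $want = $wanted{lc $fields[0]};
        next if $want
            && $fields[1] == $want->[0]
            && $fields[3] == $want->[1];
        push @mismatches, $line;
    }
    close $fh2;

    return @mismatches;
}

print "$_\n" for mismatched_rows($file1, $file2);
```

With the sample data this prints the two harry rows and skips the David row. Because the lookup is keyed on the name alone, a name that appears more than once in the first file would keep only its last entry.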