in reply to Enlightenment

It's not as complicated as it looks. It would be much easier to understand if COMPARE didn't reuse one name for two different things: $L is a reference to an LoL, while %L is a completely separate hash.
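To see why the shared name is confusing, here is a minimal sketch (the data and the variable names other than $L and %L are made up for illustration). The arrow is what tells you the scalar reference is being used; without it, the same-looking expression reaches into the hash instead.

```perl
use strict;
use warnings;

my @rows = ( [ 'a', 'b' ], [ 'c', 'd' ] );   # an array of array refs (an LoL)
my $L    = \@rows;                            # scalar $L: a reference to the LoL
my %L    = ( 0 => 'zero', 1 => 'one' );       # hash %L: unrelated to $L

my $via_ref  = $L->[0]->[1];   # arrow: dereferences the scalar $L, gives 'b'
my $via_hash = $L{0};          # no arrow: element of the hash %L, gives 'zero'
my $last_idx = $#$L;           # last index of the array behind $L, gives 1
```

So $L->[...] and $#$L always refer to the LoL, and $L{...} always refers to the hash.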

The main part of the program builds two LoLs (lists of lists, see perldsc). @R holds the rows whose 5th element isn't "--"; @L holds the rows whose 5th element IS "--" but whose 2nd element isn't.
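A rough sketch of that partitioning step, assuming each row has already been split into a list of fields and that the placeholder is the literal string "--" (the sample rows are invented, and this is not the original program's code):

```perl
use strict;
use warnings;

# Hypothetical sample rows; each row is a reference to a list of fields.
my @rows = (
    [ 'r1', 'x',  'x',  'x', 'K1', 'v1' ],   # 5th field set      -> goes to @R
    [ 'l1', 'K1', 'w1', 'x', '--', 'x'  ],   # 5th is --, 2nd set -> goes to @L
    [ 'n1', '--', 'x',  'x', '--', 'x'  ],   # both are --        -> neither list
);

my ( @L, @R );
for my $row (@rows) {
    if    ( $row->[4] ne '--' ) { push @R, $row }
    elsif ( $row->[1] ne '--' ) { push @L, $row }
    # In this sketch a row with "--" in both fields is silently dropped.
}
```

Remember that Perl indexes from 0, so the 5th element is $row->[4] and the 2nd is $row->[1].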

Note that there's no error checking for rows where both the 2nd and 5th elements are "--". I don't know if that matters to you.

The two LoLs are then passed (by reference) into the COMPARE routine. By the way, the

use Data::Dumper;
looks like a red herring; I don't see anything there that needs that module.

In COMPARE, $L and $R are references to the LoLs built up in the main routine. %L and %R are hashes with one key for every index in @$L and @$R respectively. The values don't matter; only the existence of the key does. Together the keys act as a set of the rows that still need to be processed.
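The "hash as a set of indexes" idiom on its own looks something like this. It's a small sketch with invented data; I've called the hash %pending here instead of %L to keep it apart from $L.

```perl
use strict;
use warnings;

my $L = [ ['a'], ['b'], ['c'] ];             # hypothetical three-row LoL

my %pending = map { $_ => 1 } 0 .. $#$L;     # keys 0, 1, 2; the values are irrelevant

delete $pending{1};                           # row 1 has been handled

my @left = sort { $a <=> $b } keys %pending;  # (0, 2)
my $seen = exists $pending{1} ? 1 : 0;        # 0: row 1 is no longer in the set
```

exists and delete are all you need to test and update membership.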

The nested for loops go over every possible combination of rows in @$R and @$L, checking whether the 5th element (index 4) of the @$R row matches the 2nd element (index 1) of the @$L row. If it does, and that @$L row hasn't already been matched (that's the "next unless" part), both rows are deleted from %R and %L (so they don't get printed out later), and the "match" line is printed. The "last;" statement ends the $J loop and moves processing on to the next row in @$R.
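Here is the same matching logic boiled down to a sketch that returns index pairs instead of printing, so the effect of "next unless" and "last" is easier to see. match_rows and the sample data are hypothetical names of my own, not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: returns matched index pairs plus the unmatched indexes.
sub match_rows {
    my ( $L, $R ) = @_;
    my %left_pending  = map { $_ => 1 } 0 .. $#$L;
    my %right_pending = map { $_ => 1 } 0 .. $#$R;
    my @pairs;

    for my $i ( 0 .. $#$R ) {
        for my $j ( 0 .. $#$L ) {
            next unless $R->[$i][4] eq $L->[$j][1];   # key fields differ
            next unless exists $left_pending{$j};     # this @L row is already used
            delete $left_pending{$j};
            delete $right_pending{$i};
            push @pairs, [ $i, $j ];
            last;                                      # on to the next @R row
        }
    }
    return (
        \@pairs,
        [ sort { $a <=> $b } keys %right_pending ],
        [ sort { $a <=> $b } keys %left_pending ],
    );
}

# Made-up data: two @R rows share key K1, but only one @L row has it.
my @R = ( [ 'r0', '', '', '', 'K1', 'a' ], [ 'r1', '', '', '', 'K1', 'b' ] );
my @L = ( [ 'l0', 'K1', 'x' ], [ 'l1', 'K2', 'y' ] );

my ( $pairs, $r_left, $l_left ) = match_rows( \@L, \@R );
# $pairs  is [ [0, 0] ] : R row 0 claimed L row 0
# $r_left is [ 1 ]      : R row 1 found L row 0 already taken
# $l_left is [ 1 ]      : L row 1 (key K2) never matched
```

Note that each @L row can be claimed by at most one @R row, and each @R row stops looking after its first match.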

After the two loops, whatever keys are left in %R and %L identify the unmatched rows, and those are printed out.

Is it clearer now, or did I make it worse? :)
--
Mike

Re: Re: Enlightenment
by johnirl (Monk) on Aug 28, 2002 at 13:29 UTC
    Thanks Mike, it's much clearer.
    That really helped, but it's the syntax in the COMPARE subroutine that's giving me trouble. Is there any chance you could explain what each line in COMPARE is doing?
    Sorry about asking you to explain what may be simple code but I'm still learning. :-) Hopefully quickly.

    j o h n i r l .

    Sum day soon I'Il lern how 2 spelI (nad tYpe)

      Sure, I've added the comments in the code here. This will make a lot more sense if you read the perldsc perldoc page first, though.
      # pass in _references_ to the arrays of lists @L and @R
      # each element is the list of fields on a given line
      COMPARE(\@L,\@R);

      sub COMPARE {
          # $L is a reference to @L, $R is a reference to @R
          my( $L, $R ) = @_;

          # @Ret isn't used, it's just here to confuse you :)
          my @Ret = ();

          # %L is a hash with an entry for every index in @L
          # so we can skip lines already matched, and print
          # unmatched lines at the end
          my %L = map { $_ => $_; } 0..$#$L;

          # %R is a hash with an entry for every index in @R
          my %R = map { $_ => $_; } 0..$#$R;

          for my $I(0..$#$R ) {
              for my $J(0..$#$L ) {
                  # if the @R line matches the @L line, based on
                  # the key fields
                  if($R->[$I]->[4] eq $L->[$J]->[1]) {
                      # skip if we've already processed this @L
                      # line
                      next unless exists $L{$J};

                      # delete these lines from %L and %R so we
                      # don't print them at the end
                      delete $L{$J};
                      delete $R{$I};

                      # print fields 5 and 6 of the R line, and
                      # fields 2 and 3 of the L line
                      print join ',', @{ $R->[$I] }[4,5], @{ $L->[$J] }[1,2];

                      # go to next R line; this R line is already
                      # matched
                      last;
                  }
              }
          }

          # print out the unmatched R lines: fields 5 and 6,
          # followed by field 1 twice
          print join ',', @{ $R->[$_] }[4,5,0,0] for keys %R;

          # print out the unmatched L lines: field 1 twice,
          # followed by fields 2 and 3
          print join ',', @{ $L->[$_] }[0,0,1,2] for keys %L;
      }

      --
      Mike