DS has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have two files. The first, data.txt, contains the following:
me.pl~23~note~345~sente
you.pl~25~warning~345~sente
he.pl~21~note~345~sente
she.pl~123~warning~345~sente
they.pl~233~warning~345~sente
them.pl~26~note~345~sente
where the first number in each line is a line number in that code. My second file, lineChanges.txt, contains the following:
you.pl 24-28
they.pl 36
them.pl 44-49
you.pl 234 77
These are the lines that changed in the code; 24-28 means lines 24 to 28. What I need to do is read each line of lineChanges.txt and see whether the .pl file appears in the first file, data.txt. If it does, I want to check whether the line number matches or falls within the range, and whether the entry is a warning. If both are true, I want to take the whole line and write it to a file. For example, you.pl appears in both files: its lines in lineChanges.txt are 24 to 28 and its line in data.txt is 25. Line 25 is within the range 24-28 and it is a warning, so that is what I am looking for, and I copy the whole line
you.pl~25~warning~345~sente
to an output file. Can someone tell me how to do that, or give me a hint? :) Thanks

Replies are listed 'Best First'.
Re: Reqex from two files and compare
by aersoy (Scribe) on Jul 17, 2002 at 04:02 UTC

    Hello,

    First of all, it would be easier to help if we could point things out in code you have already written. Please keep this in mind when asking questions, because people usually want to see that you have at least tried.

    If I were you, I would iterate through lineChanges.txt first, and store the parsed data in a hash. Example:

    while (<CHANGES>) {
        chomp;
        # split the line from the spaces
        my ($file, $info) = split(/\s+/, $_, 2);
        # is it a range or a single line?
        if ($info =~ /(\d+)-(\d+)/) {
            $changes{$file} = [$1, $2];
        } else {
            $changes{$file} = [$info];
        }
    }

    (At this point you may want to look at what you have parsed so far. Try Data::Dumper.)
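    As a rough illustration of that inspection step, here is a minimal sketch. The contents of %changes below are hypothetical stand-ins for what the loop above might have stored, not real output from your files; Sortkeys and Indent are standard Data::Dumper settings used only to make the dump easier to read.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample of what the parsing loop might have stored.
my %changes = (
    'you.pl'  => [24, 28],   # a range: first and last line
    'they.pl' => [36],       # a single line
);

$Data::Dumper::Sortkeys = 1;   # print hash keys in sorted order
$Data::Dumper::Indent   = 1;   # compact, fixed indentation

# Dumper takes a reference and returns a printable string.
my $dump = Dumper(\%changes);
print $dump;
```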

    Now you have to check for each line of data.txt in this hash. You may do it like this:

    while (<DATA>) {
        chomp;
        # split the line on the tilde characters
        my ($file, $line, $type, $rest) = split(/~/, $_, 4);
        # only warnings are wanted
        next unless $type eq 'warning';
        # was this file found in the previous loop?
        if (defined $changes{$file}) {
            # was it a range?
            if (defined $changes{$file}->[1]) {
                # skip if this $line is not in this range
                # (numeric comparison, both ends included)
                next unless ($line >= $changes{$file}->[0] and $line <= $changes{$file}->[1]);
            } else {
                # skip if this $line is not equal to this single line number
                next unless $line == $changes{$file}->[0];
            }
            # if we are here, then this is a match. so print it.
            print "$_\n";
        }
    }

    There are issues with this approach. For example, your lineChanges.txt has two you.pl lines. The second one will overwrite the first, because the same file name is used as the hash key. You can overcome this by (1) using a different storage method, or (2) changing the order of the loops (i.e. parsing data.txt into a hash). I leave this as an exercise to the reader.
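    One possible take on option 1, as a sketch only: keep a list of [start, end] pairs per file, so repeated file names accumulate instead of overwriting. The sub names parse_changes and line_changed are made up for this illustration, and the sample entries are simplified from the question (the second you.pl entry is reduced to a single line number).

```perl
use strict;
use warnings;

# Parse "file start-end" or "file line" entries into
# file => list of [start, end] pairs. (Hypothetical helper.)
sub parse_changes {
    my @entries = @_;
    my %changes;
    for my $entry (@entries) {
        my ($file, $info) = split /\s+/, $entry, 2;
        next unless defined $info;
        if ($info =~ /^(\d+)-(\d+)/) {
            push @{ $changes{$file} }, [$1, $2];
        } elsif ($info =~ /^(\d+)/) {
            # a single line is stored as a one-line range
            push @{ $changes{$file} }, [$1, $1];
        }
    }
    return \%changes;
}

# True if $line falls inside any stored range for $file
# (numeric comparison, both ends included). (Hypothetical helper.)
sub line_changed {
    my ($changes, $file, $line) = @_;
    for my $range (@{ $changes->{$file} || [] }) {
        return 1 if $line >= $range->[0] && $line <= $range->[1];
    }
    return 0;
}

my $changes = parse_changes('you.pl 24-28', 'they.pl 36', 'you.pl 234');
print line_changed($changes, 'you.pl', 25)  ? "changed\n" : "unchanged\n";
print line_changed($changes, 'you.pl', 234) ? "changed\n" : "unchanged\n";
print line_changed($changes, 'they.pl', 40) ? "changed\n" : "unchanged\n";
```

    Storing a single line as a one-line range means the lookup needs only one kind of test.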

    I hope this helps a bit.

    --
    Alper Ersoy

      Thanks Alper, and sorry for not posting my code. Thanks again :)
Re: Reqex from two files and compare
by graff (Chancellor) on Jul 17, 2002 at 04:54 UTC
    aersoy's plan was very much on the mark (++), and while he wanted you to work out the rest, I thought I would suggest one alternative that might simplify things a bit, and allow you to use multiple entries for the same Perl file in the lineChanges.txt listing. Where aersoy had this:
    # is it a range or a single line?
    if ($info =~ /(\d+)-(\d+)/) {
        $changes{$file} = [$1, $2];
    } else {
        $changes{$file} = [$info];
    }
    you could have this instead:
    if ($info =~ /(\d+)-(\d+)/) {
        $changes{$file} .= " ".join(" ",($1 .. $2))." ";
    } else {
        $changes{$file} .= " $info ";
    }
    This way, each value in %changes is a string of space-separated line numbers, including every line that has changed in a given file, and by using ".=", the string just gets longer each time the same file is mentioned again in lineChanges.txt. Now, when you check this against the data.txt listing, instead of aersoy's two-way test, you have a single test for all cases:
    ...
    if (defined $changes{$file}) {
        # skip if this $line isn't mentioned in $changes
        next unless ($changes{$file} =~ / $line /);
    }
    ...
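    Putting the pieces together, a minimal sketch of the string-based approach might look like the following. It uses in-memory arrays with sample lines from the question instead of reading the two files, leaves out the ambiguous last lineChanges.txt entry, and adds a check on the warning field; the names @change_lines, @data_lines and @matches are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Sample data taken from the question; a real script would read the two files.
my @change_lines = ('you.pl 24-28', 'they.pl 36', 'them.pl 44-49');
my @data_lines   = (
    'me.pl~23~note~345~sente',
    'you.pl~25~warning~345~sente',
    'he.pl~21~note~345~sente',
    'she.pl~123~warning~345~sente',
    'they.pl~233~warning~345~sente',
    'them.pl~26~note~345~sente',
);

# Build one space-delimited string of changed line numbers per file.
my %changes;
for my $entry (@change_lines) {
    my ($file, $info) = split /\s+/, $entry, 2;
    if ($info =~ /(\d+)-(\d+)/) {
        $changes{$file} .= " " . join(" ", $1 .. $2) . " ";
    } else {
        $changes{$file} .= " $info ";
    }
}

# Keep only warnings whose line number appears in the string for that file.
my @matches;
for my $record (@data_lines) {
    my ($file, $line, $type) = split /~/, $record, 4;
    next unless defined $changes{$file};
    next unless $type eq 'warning';
    # the surrounding spaces stop 25 from matching inside 125 or 250
    next unless $changes{$file} =~ / \Q$line\E /;
    push @matches, $record;
}

print "$_\n" for @matches;
```

    Expanding every range into a string is simple and fine for short ranges; for very large ranges, storing the range ends and comparing numerically would use less memory.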
      Nice one ;) Thanks