dr_jgbn has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perlmonks,
I cannot seem to solve this simplistic piece of code...
I have 2 files; 1 is a list of numbers e.g. 85.11 etc. The second is a list of names and numbers, tab-delimited.
e.g. John Doe 85.4
John Dear 84.2
etc.
What I wish to do is take each number from file 1 and search against file 2. Any number in file 2 within plus or minus 0.5 of the number in file 1 will cause the name and number of that match to be printed. For example, grab 85.11; it is within 0.5 of 85.4 but not of 84.2, so only John Doe is printed.
Here is my feeble attempt thus far.

while (<FH1>) {
    @num = <FH1>;
}
close FH1;
foreach $num (@num) {
    chomp $num;
    while (<FILE2>) {
        @marks = split(/\t/, $_);
        $x = $marks[0];
        $y = $marks[1];
        if ($num - $y < 1) {
            print OUT "$x\n";
        }
    }
}
Thanks for your help,
dr_jgbn

Replies are listed 'Best First'.
Re: simple search and print
by borisz (Canon) on Feb 24, 2004 at 23:41 UTC
    #!/usr/bin/perl
    open my $fh, "<file1" or die $!;
    chomp ( my @num = <$fh> );
    open my $fh2, "<file2" or die $!;
    while ( defined ( $_ = <$fh2> ) ) {
        chomp;
        my ( $name, $num ) = split /\t/;
        for ( @num ) {
            print "$name\n" if $_ - .5 <= $num && $_ + .5 >= $num;
        }
    }
    Boris
Re: simple search and print
by Limbic~Region (Chancellor) on Feb 24, 2004 at 23:34 UTC
    dr_jgbn,
    For example, grab 85.11, it is within 0.5 of 85.4

    Last time I checked, 85.4 + .5 is 85.9 which is still less than 85.11. I will assume for now that was just a typo.

    Since you do not say how large these files are, I will assume memory is no object. Since you do not say if the tab delimited file has more than two fields, I will assume that the name is in the first field and the second field is the number. Finally, I will assume run-time speed is not critical.

    #!/usr/bin/perl
    use strict;
    use warnings;

    open (NUMBERS, '<', "numbers.txt") or die "Unable to open numbers.txt for reading : $!";
    open (NAMES, '<', "names.txt") or die "Unable to open names.txt for reading : $!";

    my %numbers = map {chomp; $_ => 1} <NUMBERS>;
    my %names = map {chomp; split /\t/ , $_ , 2} <NAMES>;

    for my $name ( keys %names ) {
        my $names_number = $names{$name};
        for my $number ( keys %numbers ) {
            my $delta = abs $number - $names_number;
            if ( $delta <= .5 ) {
                print "$name : $names_number is within range of $number\n";
                last;
            }
        }
    }
    Cheers - L~R

      Uhh, 85.11 is less than 85.9 or even 85.4. However, 85.4 - .5 = 84.9, so 85.11 does fall within +/- .5 of 85.4.

      Decimals not version numbers.
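
To make the arithmetic concrete, here is a minimal sketch of the range test applied to the numbers from the original question. The helper name `within_range` and its optional third argument are assumptions made for illustration; they do not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $candidate lies within $tolerance of $target.
# The tolerance defaults to the 0.5 used in the question.
sub within_range {
    my ($target, $candidate, $tolerance) = @_;
    $tolerance = 0.5 unless defined $tolerance;
    return abs($target - $candidate) <= $tolerance;
}

# Compare 85.11 against the two sample values from the question.
for my $candidate (85.4, 84.2) {
    my $verdict = within_range(85.11, $candidate) ? 'match' : 'no match';
    printf "85.11 vs %.1f: difference %.2f, %s\n",
        $candidate, abs(85.11 - $candidate), $verdict;
}
```

Because these are floating-point values, a difference that lands exactly on the 0.5 boundary may fall on either side of the test.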

      Later

        pzbagel,
        Well, isn't it a good thing that I went ahead and wrote the code anyway? For some reason my brain was not processing the decimal place correctly. It was seeing 4 + 5 = 9 < 11 and so got it wrong. The code should, however, work regardless of my confusion.

        Thanks - L~R

      I'll note that in what you posted above there's no reason to read NAMES into memory - in fact, if the order of names in NAMES is significant, what you do will cause a problem.
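
A small sketch of why the hash can be a problem, assuming a made-up third row that repeats a name (the first two rows are the samples from the question). Besides returning keys in no particular order, a hash keyed on name keeps only the last value seen for a repeated name, while an array of pairs keeps every row in file order.

```perl
use strict;
use warnings;

# Sample rows standing in for names.txt; the third row is invented
# to show what happens when a name repeats.
my @lines = ("John Doe\t85.4", "John Dear\t84.2", "John Doe\t90.0");

# Hash keyed on name: order is lost and the repeated name is overwritten.
my %by_name;
for my $line (@lines) {
    my ($name, $number) = split /\t/, $line, 2;
    $by_name{$name} = $number;
}

# Array of [name, number] pairs: every row survives, in file order.
my @in_order = map { [ split /\t/, $_, 2 ] } @lines;

printf "hash holds %d entries, array holds %d\n",
    scalar(keys %by_name), scalar(@in_order);
print "$_->[0]\t$_->[1]\n" for @in_order;
```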

      When I read the post, my impression was that the order of values in NUMBERS was what the poster wanted preserved, and so would have said:

      #!/usr/bin/perl
      use strict;
      use warnings;

      open (NUMBERS, '<', "numbers.txt") or die "Unable to open numbers.txt for reading : $!";
      open (NAMES, '<', "names.txt") or die "Unable to open names.txt for reading : $!";

      # Keep the names file as a list of [name, value] pairs, in file order.
      my @namevals = map {chomp; [split /\t/ , $_ , 2]} <NAMES>;

      while (<NUMBERS>) {
          s/\s//gs;
          my $current = $_ + 0;
          # Inside the map, $_ is one [name, value] pair, so the value is $_->[1].
          map {print $_->[0]," matched $current\n" if (abs($_->[1] - $current) <= 0.5);} @namevals;
      }
      If runtime is an issue, @namevals could always be split into a hash of lists keyed on the floor() of each value. Then, when comparing, you'd only check the two lists whose values can lie within range of the target.
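
The floor() bucketing idea might look roughly like the sketch below. It uses inline sample data rather than the files; the third row, the `%bucket` hash, and the helper name `matches_for` are assumptions for illustration. Since the window from target - 0.5 to target + 0.5 is exactly one unit wide, it can touch at most two integer buckets.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Sample [name, value] pairs; the first two come from the question,
# the third is invented to show a bucket that is never visited.
my @namevals = ( [ 'John Doe', 85.4 ], [ 'John Dear', 84.2 ], [ 'Jane Roe', 91.0 ] );

# Group the pairs by the floor of their value.
my %bucket;
push @{ $bucket{ floor($_->[1]) } }, $_ for @namevals;

# Hypothetical helper: return the pairs within +/- 0.5 of $target,
# scanning only the buckets that the window can overlap.
sub matches_for {
    my ($target) = @_;
    my ($low, $high) = ( floor($target - 0.5), floor($target + 0.5) );
    my @keys = $low == $high ? ($low) : ($low, $high);
    return grep { abs($_->[1] - $target) <= 0.5 }
           map  { @{ $bucket{$_} || [] } } @keys;
}

for my $target (85.11) {
    print "$_->[0] ($_->[1]) matched $target\n" for matches_for($target);
}
```

Whether this pays off depends on the data: if most values fall into the same one or two buckets, the scan is hardly shorter than the plain loop.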