steph_bow has asked for the wisdom of the Perl Monks concerning the following question:

Dear Wise and Serene Monks,

Could you please fix my code ?

I have made a code (inspired by your answers to a previous question (searching for strings) which I tried to adapt to a new case).

However, it does not seem to work.

Here is my start file :

VIR19;VIR20;VIR;B744; VIR19;VIR207;VIR;B744; VIR19;VIR21;VIR;B744; VIR19;AF34B;VIR;B744; VIR19;AF34C;VIR;B744; VIR19;IB580F;VIR;B744; VIR19;IB581F;VIR;B744; VIR19;AL34BC;VIR;B744; VIR19;AL34AC;VIR;B744; VIR19;ADH00A;VIR;B744; VIR19;ADH01A;VIR;B744; AL34BC;IB580F;VIR;B744; AL34BC;IB581F;VIR;B744; AL34BC;AL34BC;VIR;B744; AL34BC;AL34AC;VIR;B744; AL34BC;ADH00A;VIR;B744; AL34BC;ADH01A;VIR;B744;

The aim of my code is to detect the elements of the second column whose names are close to the names of the elements of the first columns (but they must be different)

I would like to have as an output file :

AL34BC;AL34AC;VIR;B744; VIR19;VIR20;VIR;B744;

But not for example the lines :

AL34BC;AL34BC;VIR;B744; (because they are the same name) VIR19;VIR21;VIR;B744; (because the names are too far)

Here are the boundaries :

1)there are no negative numbers

2)when a string ends by Z or 99, there is no need to look for AA, or 100, ...

3)the length of the string of the couple looked for must be the same (it excludes cases such as the couple ABF99 / ABF100)

Here are the different possibilities of couple (what I define by "close")

VIR19, VIR20 (the names differ only by theit last numbers and the difference must be 1)

AF34B,AF34C ... (the pattern is two letters then two numbers and a letter at the end of the string : the difference must be 1 in the alphabet order)

ADH00A,ADH01A ... (the pattern is three letters then two numbers then a letter : the difference lays on the numbers and must be 1)

IB580F,IB581F ... (the pattern is three numbers and a letter at the end of the string the difference lays on the numbers and must be 1)

AL34BC and AL34AC (the pattern is two numbers and two letters at the end of the string : the difference lays on the first on the two letters and this difference must be 1 in the order of the alphabet)

Here is my code

#!/usr/bin/perl use strict; use warnings; use diagnostics; use Cwd; use Fcntl qw(SEEK_SET); my $Current_Dir = getcwd; print STDOUT "the current directory is $Current_Dir"; ##################################################################### my $file = "$ARGV[0]"; open INFILE_1, '<', $file or die "Can't open '$file' : $!\n"; ###################################################################### +##### my $outfile = "analysis_file"; open OUTFILE, '>', $outfile or die "Can't open '$outfile' : $!\n"; # the hash tables my %decrement; my %increment; # Synonymous with existance. my $start_of_DATA = tell INFILE_1; while (my $line = <INFILE_1>) { chomp($line); my @ELEMENTS = split(/;/,$line); # suppresses the /n of the current line my $Key; my $Suffix_1; my $Suffix_2; my $Suffix_next; my $item_next; my $item = $ELEMENTS[0]; # for example : VIR19, VIR20, VIR207 (only letters then two or th +ree numbers) # the suffix will be the two numbers if ( $item =~ m{\A ([A-Z]+) (\d\d | \d\d\d) \z}x) { $Key = $1; $Suffix_1 = $2; # to see in which test the data pass print STDOUT "1\n"; ($Suffix_next = $Suffix_1)++; $item_next = $Key . $Suffix_next; $increment{$item} = $item_next; $decrement{$item_next} = $item; } # L'expression if ... elsif ... else permet d'enchaîner une série d'in +structions et évite d'avoir à imbriquer des instructions if. # for example : AF34B,AF34C ... (two letters then two numbers and + a letter at the end of the string) # the suffix will be the two numbers and the last letter # because ++($foo = 'B'); => C elsif ($item =~ m{\A ([A-Z][A-Z]\d\d)([A-Z]) \z}x){ $Key = $1; $Suffix_1 = $2; print STDOUT "2\n"; ($Suffix_next = $Suffix_1)++; $item_next = $Key . $Suffix_next . $Suffix_2; $increment{$item} = $item_next; $decrement{$item_next} = $item; } # for example : ADH00A,ADH01A ... (three letters then two numbers + then a letter) # the suffix will be the two numbers # because ++($foo = 'B'); => C elsif ($item =~ m{\A ([A-Z][A-Z][A-Z])(\d\d)([A-Z]) \z}x){ $Key = $1; $Suffix_1 = $2; $Suffix_2 = $3; print STDOUT "3\n"; ($Suffix_next = $Suffix_1)++; $item_next = $Key . $Suffix_next . $Suffix_2; print STDOUT "$item_next\n"; $increment{$item} = $item_next; $decrement{$item_next} = $item; } # for example : IB580F,IB581F ... (three numbers and a letter at +the end of the string) # the suffix will be the three numbers # because ++($foo = '580'); => 581 elsif ($item =~ m{\A (\w+) (\d\d\d) ([A-Z]) \z}x){ $Key = $1; $Suffix_1 = $2; $Suffix_2 = $3; print STDOUT "4\n"; ($Suffix_next = $Suffix_1)++; $item_next = $Key . $Suffix_next . $Suffix_2; print STDOUT "$item_next\n"; $increment{$item} = $item_next; $decrement{$item_next} = $item; } # for example : AL34BC and AL34AC (two numbers and two letters a +t the end of the string) # the suffix will be the three numbers # because ++($foo = '34B'); => 34C elsif ($item =~ m{\A (\w+) (\d\d[A-Z]) ([A-Z]) \z}x){ $Key = $1; $Suffix_1 = $2; $Suffix_2 = $3; print STDOUT "5\n"; ($Suffix_next = $Suffix_1)++; $item_next = $Key . $Suffix_next . $Suffix_2; print STDOUT "$item_next\n"; $increment{$item} = $item_next; $decrement{$item_next} = $item; } else { die "Invalid data: $_"; } # Reparse DATA seek(INFILE_1, $start_of_DATA, SEEK_SET); while (my $line_DATA = <INFILE_1>) { chomp($line_DATA); my @ELEMENTS_DATA = split(/;/,$line_DATA); my $item_DATA = $ELEMENTS_DATA[1]; print STDOUT "le item de DATA est $item_DATA\n"; print STDOUT "le item est $item\n"; if ($item eq $decrement{ $item_DATA } ) { print OUTFILE "$item;$item_DATA\n"; } elsif ($item eq $increment{$item_DATA } ) { print OUTFILE "$item;$item_DATA\n"; } print OUTFILE "\n"; } #End of the first WHILE } close INFILE_1; close OUTFILE;

UPDATE_1

change of the place of the line of code

my $start_of_DATA = tell INFILE_1;

UPDATE_2 : change of the code in the part :

print STDOUT "le item est $item\n"; if ($item eq $decrement{ $item_DATA } ) { print OUTFILE "$item;$item_DATA\n"; } elsif ($item eq $increment{$item_DATA } ) { print OUTFILE "$item;$item_DATA\n";

UPDATE_3 : used the beacons br instead of the beacons Br to post my request

UPDATE_4 : the messages of error are

use of uninitialized value in string eq at Monks_Treatment_Reg_Slot_List.pl line 157

Replies are listed 'Best First'.
Re: finding the right corresponding element
by toolic (Bishop) on Aug 13, 2007 at 13:49 UTC
    In trying to understand your problem, I first downloaded the code you provided and ran it through perltidy to neaten up the indentation, so as to make it easier to follow your logic.

    It is also helpful to me to execute the code in order to duplicate your problematic output (which you did not provide). However, I gave up after the 1st three compile and runtime errors. It is great that you are using the strictures, but you either are ignoring them, or the code you posted is not the same as what you are running.

    Please post new code which compiles cleanly. If you get some error messages you do not understand, please post those as well. These simple steps will make it much easier for others to help you.


      Dear toolic,

      I have taken into account your remarks : thanks for your advice

      the code I have updatd is exactely the same I just ran, so you can download it now safely, provided the name of the file that contains the data is "file"

      Moreover, I have indicated in my updates the messages of error / warning
        OK. That helps.

        I'm not sure why you are getting the warning on line 157. I suspect the problem may be somewhat related to your use of "tell" and "seek". I do not have much experience with these functions. It looks like you are reading the entire input file multiple times (once per time through your outer "while" loop).

        Is it really necessary to do this? Would it not be simpler and more efficient to slurp the contents of the input file into an array variable once, then use a "for" loop to process that array? Is input file size an issue?

Re: finding the right corresponding element
by dogz007 (Scribe) on Aug 13, 2007 at 16:07 UTC
    I've conjured up a nice short solution for you. It breaks the appropriate columns into chunks based on alpha- or numeric-ness and then compares them.

    #! /usr/bin/perl use strict; while (my $line = <DATA>) { chomp $line; my @line = split ";", $line; next if $line[0] eq $line[1]; next unless length $line[0] == length $line[1]; my @one = breakup($line[0]); my @two = breakup($line[1]); next unless scalar(@one) == scalar(@two); my @match = map { $one[$_] eq ++$two[$_] || ++$one[$_] eq --$two[$_] ? 1 : 0 } (0..$#one); print $line, "\n" if 1 == grep {$_} @match; } sub breakup { my @out; while ($_[0]) { $_[0] =~ s/^([a-zA-z]|[0-9]+)//; push @out, $1; } return @out; } __DATA__ VIR19;VIR20;VIR;B744; VIR19;VIR207;VIR;B744; VIR19;VIR21;VIR;B744; VIR19;AF34B;VIR;B744; VIR19;AF34C;VIR;B744; VIR19;IB580F;VIR;B744; VIR19;IB581F;VIR;B744; VIR19;AL34BC;VIR;B744; VIR19;AL34AC;VIR;B744; VIR19;ADH00A;VIR;B744; VIR19;ADH01A;VIR;B744; AL34BC;IB580F;VIR;B744; AL34BC;IB581F;VIR;B744; AL34BC;AL34BC;VIR;B744; AL34BC;AL34AC;VIR;B744; AL34BC;ADH00A;VIR;B744; AL34BC;ADH01A;VIR;B744;

    Prints:

    VIR19;VIR20;VIR;B744; AL34BC;AL34AC;VIR;B744;
      Proposed code improvement-
      if(defined $holiday_season) { if(defined $weekend) { $midwest_flight_to_northeast++; $price--; } }
Re: finding the right corresponding element
by oxone (Friar) on Aug 13, 2007 at 15:27 UTC
    Hi Steph - Should we be concerned that your profile says you work in air traffic control, and that currently you have bugs in your code for processing flight numbers??

    Perhaps the monks better rally round quick if this is a matter of life and death ;-)

      Thanks a lot for your concern, oxone, you are really nice

      No, my problem is not a matter of life and deah

      I make studies about flight events that happened months ago

      So I try with my collegues to highlight patterns, propose improvements, ...

      I reassure you, it is not a case of emergency

      Thanks a lot, I really appreciate your kindness.