Calebros has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I am a very new coder trying to filter duplicate records out of a .csv file by comparing two values on the current line to the same two values on the previous line. However, my code reports that the variable $prev_l is uninitialized. Could someone tell me why this is happening?

#!/usr/bin/perl
use warnings;
use strict;

while(<>){
    my $hitlength;
    my $prevalence;
    my $prev_l;
    my $prev_p ;
    my $line = $_;
    chomp $line;
    if($line =~ /^[\d]/){
        my @hitline = split(/,/ , $line);
        $hitlength = $hitline [4];
        $prevalence = $hitline [65];
        if($hitlength == $prev_l and $prevalence == $prev_l){
            next;
        } else {
            print "$line\n";
        }
        $prev_l=$hitlength;
        $prev_p=$prevalence;
    }
}

Here are a couple sample lines of data
23750,57495,78362,xxxx,2853,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, +1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 +,1,1,1,1,60,100 23751,57497,78364,xxxx,2853,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, +1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 +,1,1,1,1,60,100 23752,57500,78367,xxxx,3114,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, +1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 +,1,1,1,1,60,100 23753,57502,78369,xxxx,3114,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, +1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 +,1,1,1,1,60,100 23754,57504,78371,xxxx,101,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 +,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, +1,1,1,1,60,100

Replies are listed 'Best First'.
Re: How to Save a variable outside of a loop
by RonW (Parson) on Jul 31, 2015 at 21:33 UTC

    Because you declare your variables inside the loop, they are redeclared (and start out undefined again) each time the loop body executes.

    If you move your declarations before the loop, it should work:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $hitlength;
    my $prevalence;
    my $prev_l;
    my $prev_p ;

    while(<>){
      Thank you very much Ron, you saved me a lot of hair pulling!
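
For anyone who wants to see the difference in isolation, here is a minimal sketch. It uses made-up loop values instead of the CSV data, and the names $prev_inner and $prev_outer are purely illustrative; the only point is where the my sits.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Case 1: "previous" variable declared inside the loop body.
my @inside_seen;
for my $value (1 .. 3) {
    my $prev_inner;    # a fresh variable on every pass, so it is always undef here
    push @inside_seen, defined $prev_inner ? $prev_inner : 'undef';
    $prev_inner = $value;    # discarded when this pass of the loop ends
}

# Case 2: "previous" variable declared once, before the loop.
my @outside_seen;
my $prev_outer;
for my $value (1 .. 3) {
    push @outside_seen, defined $prev_outer ? $prev_outer : 'undef';
    $prev_outer = $value;    # still there on the next pass
}

print "inside:  @inside_seen\n";     # inside:  undef undef undef
print "outside: @outside_seen\n";    # outside: undef 1 2
```

Note that even in the second case the variable is still undef on the first pass, which is the separate "first line" problem raised elsewhere in this thread.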
Re: How to Save a variable outside of a loop
by Myrddin Wyllt (Hermit) on Jul 31, 2015 at 21:41 UTC

    As well as the re-declared variables, it looks like you have a typo on the line:

    if($hitlength == $prev_l and $prevalence == $prev_l){

    This should probably be:

    if($hitlength == $prev_l and $prevalence == $prev_p){
      You are indeed correct, thank you very much!
Re: How to Save a variable outside of a loop
by poj (Abbot) on Jul 31, 2015 at 21:50 UTC
    You also need to skip the comparison for the first line of data.
    #!/usr/bin/perl
    use warnings;
    use strict;

    my $prev_l;
    my $prev_p;
    my $count = 0; ### add

    while(<>){
        my $hitlength;
        my $prevalence;
        my $line = $_;
        chomp $line;
        if ($line =~ /^[\d]/){
            my @hitline = split(/,/ , $line);
            $hitlength = $hitline[4];
            $prevalence = $hitline[65];
            if( ($count++) and ($hitlength == $prev_l) and ($prevalence == $prev_p) ){
                next;
            } else {
                print "$line\n";
            }
            $prev_l = $hitlength;
            $prev_p = $prevalence;
        }
    }
    poj
      You also need to skip the comparison for the first line of data.
      Or you could just give an initial dummy value to the two previous variables, thereby saving some steps within the loop:
      #!/usr/bin/perl
      use warnings;
      use strict;

      my ($prev_l, $prev_p) = ("", "");

      while(<>){
          my $hitlength;
          my $prevalence;
          my $line = $_;
          chomp $line;
          if ($line =~ /^[\d]/){
              my @hitline = split(/,/ , $line);
              $hitlength = $hitline[4];
              $prevalence = $hitline[65];
              if( $hitlength == $prev_l and $prevalence == $prev_p ){
                  next;
              } else {
                  print "$line\n";
              }
              $prev_l = $hitlength;
              $prev_p = $prevalence;
          }
      }

        That'll yield a warning:

        c:\@Work\Perl\monks>perl -wMstrict -le "my $x = ''; print 'equal' if $x == 1; "
        Argument "" isn't numeric in numeric eq (==) at -e line 1.


        Give a man a fish:  <%-(-(-(-<

        Apart from the warning, you have to be certain that, on the first line of data, those two fields don't contain the dummy values you initialized to. Otherwise the first line will be treated as a duplicate and skipped.
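
A small sketch of that pitfall, under stated assumptions: the two-column records below are invented for illustration (they are not the OP's data), the sentinel is a numeric 0 rather than "" so that no warning is emitted, and filter_adjacent is a hypothetical helper rather than anything from the posts above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drops a line when both fields equal the previous
# line's fields. The "previous" values start at whatever the caller passes.
sub filter_adjacent {
    my ($init_l, $init_p, @lines) = @_;
    my ($prev_l, $prev_p) = ($init_l, $init_p);
    my @kept;

    for my $line (@lines) {
        my ($len, $prevalence) = split /,/, $line;

        # With undef as the starting value, the first line is never compared.
        next if defined $prev_l && $len == $prev_l && $prevalence == $prev_p;

        push @kept, $line;
        ($prev_l, $prev_p) = ($len, $prevalence);
    }
    return @kept;
}

# Invented records whose first line happens to equal the dummy values.
my @records = ("0,0", "0,0", "5,7");

my @with_sentinel = filter_adjacent(0, 0, @records);          # first line is lost
my @with_undef    = filter_adjacent(undef, undef, @records);  # first line is kept

print "sentinel: @with_sentinel\n";    # sentinel: 5,7
print "undef:    @with_undef\n";       # undef:    0,0 5,7
```

Starting from undef and testing with defined sidesteps the collision, at the cost of one extra check per line.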


        poj
Re: How to Save a variable outside of a loop
by GrandFather (Saint) on Aug 01, 2015 at 10:13 UTC

    As a general rule, don't declare variables until you have an initial value for them. $prev_l and $prev_p are an exception to that guideline, because the operation of the code requires the two variables to be declared outside the loop.

    Nested blocks generally obfuscate code so try to avoid that. One powerful tool for reducing nesting is to use early exits. Look at the following code and see how next with if as a statement modifier is used to abort processing loop statements wherever there is no need to continue.

    Note too the use of defined to handle the first line correctly.

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $prev_l;
    my $prev_p;

    while (my $line = <DATA>) {
        next if $line !~ /^[\d]/;
        chomp $line;

        my @hitline = split (/,/, $line);
        my $hitLength = $hitline[4];
        my $prevalence = $hitline[65];

        next if defined $prev_l && $hitLength == $prev_l && $prevalence == $prev_p;

        print "$line\n";
        $prev_l = $hitLength;
        $prev_p = $prevalence;
    }

    __DATA__
    23750,57495,78362,xxxx,2853,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    23751,57497,78364,xxxx,2853,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    23752,57500,78367,xxxx,3114,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    23753,57502,78369,xxxx,3114,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    23754,57504,78371,xxxx,101,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100

    Prints:

    23750,57495,78362,xxxx,2853,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    23752,57500,78367,xxxx,3114,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    23754,57504,78371,xxxx,101,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,60,100
    Premature optimization is the root of all job security
Re: How to Save a variable outside of a loop
by CountZero (Bishop) on Aug 01, 2015 at 08:01 UTC
    Or use the state keyword to declare a persistent lexical variable inside the loop that will not reset once it has been declared.

    It is available in Perl 5.10 or later, once enabled with use feature 'state' (or use v5.10).
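
A minimal sketch of what that could look like, assuming short invented records (an id plus one numeric field standing in for the hit length) rather than the OP's 67-column lines; $prev_len and @kept are illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'state';    # enables the state keyword (Perl 5.10 or later)

# Invented records: field 0 is an id, field 1 stands in for the hit length.
my @records = ("1,2853", "2,2853", "3,3114", "4,3114", "5,101");
my @kept;

for my $line (@records) {
    # Declared inside the loop, but created only once, so the value
    # assigned on one pass is still there on the next.
    state $prev_len;

    my $len = (split /,/, $line)[1];
    next if defined $prev_len && $len == $prev_len;

    push @kept, $line;
    $prev_len = $len;
}

print "$_\n" for @kept;
```

One design caveat: a state variable keeps its value for as long as the enclosing code exists, so if this loop lived inside a subroutine that is called for several files, the last value from one call would still be set at the start of the next.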

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: How to Save a variable outside of a loop
by locked_user sundialsvc4 (Abbot) on Aug 01, 2015 at 02:12 UTC

    Just to clarify ... there are, basically, two errors at work here:

    The first error is that variables which need to be declared outside the loop are declared inside it. Local variables within a scope (such as “the scope of a while loop”) will be re-initialized each time. Thus, the values from previous iterations are not retained.

    The second error is a little more subtle, since it is a logic error: during the first iteration of the loop there is, by definition, no “previous record” to compare against. Variables such as $prev_l and $prev_p will therefore be undefined. There are many ways to check for this and to handle it ... as long as you do. The defined() function, for example, will check for undef. So yet another way to handle this case (in the present program) is to enclose the current if-statement within another if that tests whether (say) $prev_l is defined.
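
    A rough sketch of that nested-if arrangement, under stated assumptions: drop_repeats is a hypothetical helper, the column positions are passed in rather than fixed at 4 and 65, and the three-column sample records are invented to keep the lines short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keeps a line unless both chosen columns repeat
# the values seen on the previous line.
sub drop_repeats {
    my ($col_l, $col_p, @lines) = @_;
    my ($prev_l, $prev_p);    # declared outside the loop so they persist
    my @kept;

    for my $line (@lines) {
        my @fields = split /,/, $line;
        my ($cur_l, $cur_p) = @fields[$col_l, $col_p];

        my $is_repeat = 0;
        if (defined $prev_l) {    # undef until the first record has been processed
            if ($cur_l == $prev_l and $cur_p == $prev_p) {
                $is_repeat = 1;
            }
        }

        push @kept, $line unless $is_repeat;
        ($prev_l, $prev_p) = ($cur_l, $cur_p);
    }
    return @kept;
}

# Invented sample: id, hit length, prevalence.
my @sample = (
    "23750,2853,60",
    "23751,2853,60",
    "23752,3114,60",
    "23753,3114,60",
    "23754,101,60",
);

my @result = drop_repeats(1, 2, @sample);
print "$_\n" for @result;
```

    The outer test costs one extra level of nesting compared with folding defined into a single condition; which reads better is largely a matter of taste.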