Scotmonk has asked for the wisdom of the Perl Monks concerning the following question:

I have this code, which reads a data file and compares values between lines of that data. It reads from the top of the file to the bottom.
How do I change it to read from the bottom of the file (last entry) to the top (first entry)?
    while (@lines > 12) {
        # check first 12 lines against the 13th line
        my @vals = split /\s+/, $lines[12];
        my @chk  = split /\s+/, join(' ', @lines[0..11]);
        my %match;
        foreach my $val (@vals) {
            $match{$val}++ if grep { $val eq $_ } @chk;
        }
        my @match   = sort keys %match;
        my @nomatch = grep { not exists $match{$_} } @vals;
        my $match   = @match;
        my $nomatch = @nomatch;
        #my $vals = @vals;
        #my $lines = @lines;
        #print NUMBER "$lines @lines \n\n";
        if ($match == 0) { $zero++ }
        if ($match == 1) { $one++ }
        if ($match == 2) { $two++ }
        if ($match == 3) { $three++ }
        if ($match == 4) { $four++ }
        if ($match == 5) { $five++ }
        if ($match == 6) { $six++ }
        # do whatever you want with the matches and no-matches
        print MATCH "$line: \tmatch = @match\n";
        print NOMATCH "$line: \tnomatch = @nomatch\n";
        $line++;
        # get rid of first line so next loop will check lines 2-13 against the 14th, and so on
        shift @lines;
    }

Replies are listed 'Best First'.
Re: Reversing the action of the while loop
by haukex (Archbishop) on Nov 24, 2019 at 19:48 UTC

    Is this the same question as "Reading the contents of a file from the bottom up"? If so, then it would have been better to post it in a single thread.

    If your file is small enough to be read entirely into memory, then note that you can iterate over an array in both directions:

        use warnings;
        use strict;

        my @array = qw/ abc def ghi /;

        for my $x (@array) {
            print "$x\n";
        }
        for my $x (reverse @array) {
            print "$x\n";
        }

        while (@array) {
            my $x = shift @array;
            print "$x\n";
        }

        @array = qw/ abc def ghi /;
        while (@array) {
            my $x = pop @array;
            print "$x\n";
        }

    See reverse, shift, and pop. However, unless the files are short, reading an entire file into memory is usually less efficient than reading a file line-by-line, and stevieb showed one solution to reading the file backwards in the other thread.
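
For a file too large to hold in memory, one core-Perl way to visit the lines from last to first is to record where each line starts with tell and then seek to those positions in reverse. This is only a minimal sketch and not necessarily what was shown in the other thread; the name each_line_bottom_up and its callback interface are made up for illustration. It still keeps one number per line in memory, but not the line text.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per line of $path, last line first.
# Only the byte offset of each line is kept in memory, not the lines themselves.
sub each_line_bottom_up {
    my ($path, $callback) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";

    # First pass: remember the offset at which every line starts.
    my @offsets;
    my $start = tell $fh;
    while (defined(my $line = <$fh>)) {
        push @offsets, $start;
        $start = tell $fh;
    }

    # Second pass: jump to the offsets in reverse order and re-read each line.
    for my $offset (reverse @offsets) {
        seek($fh, $offset, 0) or die "Cannot seek in $path: $!";
        my $line = <$fh>;
        chomp $line;
        $callback->($line);
    }
    close $fh;
    return scalar @offsets;    # number of lines visited
}
```

It could be called as each_line_bottom_up('data.txt', sub { print "$_[0]\n" }), where data.txt stands in for the real input file.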

    Also, note that your chain of if statements would probably be better written as a hash. Basically, $hash{$match}++ would create a hash where the keys are the digits and the values are the counters. If you wanted the keys to be named after the digits, like your variables, a second hash would help, e.g. my %digits = ( 1=>'one', 2=>'two', ... ); ... $hash{$digits{$match}}++.

    Also, in your code you're using fixed array indices ($lines[12] and @lines[0..11]), but you're shrinking the array with shift, so those indices won't refer to the same array elements on each iteration. You can use $#array to get the index of the last element of the array (-1 if it is empty), and you can say $array[-1] to access the last element of the array, $array[-2] for the second-to-last, etc.
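
The two suggestions above might look like this in practice. It is only a sketch: @match_counts is invented sample data standing in for the per-line value of $match, and the names %count, %named and %digits are illustrative.

```perl
use strict;
use warnings;

# Invented sample data: the number of matches found for six successive lines.
my @match_counts = (0, 2, 1, 2, 0, 2);

# One hash replaces the separate counters $zero, $one, $two, ...
my %count;
$count{$_}++ for @match_counts;

# Optional second hash, if the keys should be words rather than digits.
my %digits = (0 => 'zero', 1 => 'one', 2 => 'two', 3 => 'three',
              4 => 'four', 5 => 'five', 6 => 'six');
my %named;
$named{ $digits{$_} }++ for @match_counts;

for my $n (sort { $a <=> $b } keys %count) {
    print "$digits{$n} ($n matches): $count{$n} line(s)\n";
}

# $#array is the last index; negative indices count from the end of the array.
my @lines = ('first', 'second', 'third');
print "last index: $#lines, last: $lines[-1], second-to-last: $lines[-2]\n";
```

A counter that never occurs simply has no key in %count, so a report loop only prints the match counts that were actually seen.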

    If you want help with your algorithm, please show some sample input and the expected output for that input, along with a runnable piece of code, in a single node. See also How do I post a question effectively? and Short, Self-Contained, Correct Example.

      Thank you both for that and the advice. I am a complete newbie to Perl programming.

      I have statistical work to do, which I had tried to do with Excel, but Perl seems to give me more flexibility and simplicity. Also, I have up to 10 million samples, which Excel can't handle in one go.

      I am a neuroscientist working on the random generation of values, in which we are trying to mimic a model from nature. The more data samples I have, the more accurate my calculations are.
      I can generate the random elements fine, but processing them is tricky while trying to learn Perl at the same time.
      I have the 7th edition camel book and, as it says in there, as a beginner I am struggling with the efficiency of the written code.

      Also, as you have recognised, part of my problem is asking a question in a way that others understand.

      The last two questions that I asked are similar, but I was thinking I might try the code in two different ways. Maybe better just to tell you what I am trying to do in a new post, as you suggest.
        Although mostly irrelevant to the Perl-oriented topic here, it is worth noting that Excel does have the ability to include database queries, the so-called Power Query or Get & Transform. So there are ways to handle database-sized things directly in Excel, such as a dataset of 10 million rows.
Re: Reversing the action of the while loop
by GrandFather (Saint) on Nov 24, 2019 at 19:54 UTC

    What is the bigger problem you are trying to solve? Most likely you can do it by reading the file forwards and changing your test to suit.

    Alternatively, if the file is small (less than say a few hundred megabytes) you could simply read lines into an array then run through the array backwards.
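
As a rough illustration of running through the array backwards without shift or pop, the sketch below walks an index down from the end and compares each line with the lines that follow it in the file. Whether that is the comparison wanted "in reverse" is an assumption, and the name compare_bottom_up, its arguments and its return structure are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: starting near the bottom of @$lines and moving up,
# compare each line's values with the values on the $window lines after it.
sub compare_bottom_up {
    my ($lines, $window) = @_;
    my @report;
    for (my $i = $#$lines - $window; $i >= 0; $i--) {
        # Collect every value that appears on the following $window lines.
        my %seen;
        for my $later (@{$lines}[ $i + 1 .. $i + $window ]) {
            $seen{$_} = 1 for split ' ', $later;
        }
        my @vals    = split ' ', $lines->[$i];
        my @match   = grep {  $seen{$_} } @vals;
        my @nomatch = grep { !$seen{$_} } @vals;
        push @report, { line => $i + 1, match => \@match, nomatch => \@nomatch };
    }
    return @report;
}

my @lines = ('1 2 6 4 5', '6 7 8 9 10', '1 2 11 12 13', '6 14 15 16 17');
for my $r (compare_bottom_up(\@lines, 2)) {
    print "$r->{line}: match = @{ $r->{match} }; nomatch = @{ $r->{nomatch} }\n";
}
```

Because the array is never modified, the index arithmetic stays valid on every iteration, and the same loop can be turned around to run forwards by changing only the loop header and the slice.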

    The more you tell us about the big picture problem you are trying to solve the more we are likely to be able to help.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      The problem, I feel, is complex, but perhaps you guys are more practiced in dealing with computers, so it might be easier.

      The difficult part of this is trying to make sense to you guys.
      I have lines of data elements in a .vim file (this can vary between 5 and 25 data elements per line).
      Each element is distinct from the others and is regarded as a different sample.

      I have been reading one line of data and comparing it with the previous x lines of data, looking for matches in value
      (which I have code for and can provide below).

      I now have to be more specific: I have to read the individual data elements from each line and, eliminating any duplicates, keep adding new values until I have x values.

      so for example:

      my input data would look something like this

      1 2 6 4 5
      6 7 8 9 10
      1 2 11 12 13
      6 14 15 16 17

      If I chose to read 13 elements of data, I would read
      1 2 6 4 5 6 7 8 9 10 1 2 11

      Now I need to remove duplicate values, in this case one of the 1s, one of the 2s, and one of the 6s (but values could be triplicated or more)

      so that would result in

      1 2 4 5 6 7 8 9 10 11

      This final line would then be compared with the next complete unread line in the file, with matches printed to one file and nonmatches printed to another as per the code below.

      (so the match file would read)
      6
      (and the nonmatch file would read)
      14 15 16 17

      Now there is actually another step to what I do on paper that I would like the Perl script to perform.

      Initially I had chosen 13 values (and this can vary between 5 and up to 80).

      If duplicates are removed, I don't have the 13 values that I originally planned for,

      so on paper I continue to read forward until I have my required number of values filled in (without duplicates),

      even if that means reading into more lines of data and moving forward the line that I will eventually compare with.

      Does this make any sense at all? :)

      Thank you
          my $line = 4;
          while (@lines > 3) {
              # check first 3 lines against the 4th line
              my @vals = split /\s+/, $lines[3];
              my @chk  = split /\s+/, join(' ', @lines[0..2]);
              my %match;
              foreach my $val (@vals) {
                  $match{$val}++ if grep { $val eq $_ } @chk;
              }
              my @match   = sort keys %match;
              my @nomatch = grep { not exists $match{$_} } @vals;
              my $match   = @match;
              my $nomatch = @nomatch;
              # do whatever you want with the matches and no-matches
              print MATCH "$line: \tmatch = @match\n";
              print NOMATCH "$line: \tnomatch = @nomatch\n";
              $line++;
              # get rid of first line so next loop will check lines 2-4 against the 5th, and so on
              shift @lines;
          }

        Here's my guess at something like what you want. Instead of printing @match and @nomatch at the end, just write their contents to the proper files, and change the input to read from your input file.

            #!/usr/bin/perl
            use strict; # https://perlmonks.org/?node_id=11109150
            use warnings;

            my $x = 9;
            my %uniques;
            my @match;
            my @nomatch;

            while( <DATA> )
              {
              print;
              for my $element ( split )
                {
                if( keys %uniques < $x )
                  {
                  $uniques{ $element }++;
                  }
                else
                  {
                  if( $uniques{ $element } )
                    {
                    push @match, $element;
                    }
                  else
                    {
                    push @nomatch, $element;
                    }
                  }
                }
              }

            print "\nMATCH:\n@match\n\nNOMATCH:\n@nomatch\n";

            __DATA__
            1 2 6 4 5
            6 7 8 9 10
            1 2 11 12 13
            6 14 15 16 17

        Outputs:

            1 2 6 4 5
            6 7 8 9 10
            1 2 11 12 13
            6 14 15 16 17

            MATCH:
            1 2 6

            NOMATCH:
            11 12 13 14 15 16 17

        Hi Scotmonk,

        This is a great use-case for using an ordered hash (like a hash but with key insertion order preserved). Fortunately, such a module exists. For this demonstration, I'm more interested in the hash keys, not the hash values (i.e. hash key-driven implementation).

            # https://www.perlmonks.org/?node_id=11109163
            use strict;
            use warnings;
            use feature 'say';

            use Hash::Ordered;

            tie my %elems,   'Hash::Ordered';
            tie my %match,   'Hash::Ordered';
            tie my %nomatch, 'Hash::Ordered';

            my $skip_duplicates = 0;   # set to 1 to skip duplicates
            my $num_elements    = 13;  # number of elements to read
            my $num_read        = 0;   # number of elements read

            my @lines = <DATA>;
            chomp @lines;

            # read elements
            while ( @lines && $num_read < $num_elements ) {
                my $line = shift @lines;
                foreach my $elem ( split / /, $line ) {
                    if ( $skip_duplicates ) {
                        $num_read++ unless exists $elems{ $elem };
                    } else {
                        $num_read++;
                    }
                    $elems{ $elem } = undef;
                    last if $num_read == $num_elements;
                }
            }

            say "data elements";
            say join(' ', keys %elems);

            # matched, not matched
            if ( @lines ) {
                foreach my $elem ( split / /, shift @lines ) {
                    ( exists $elems{ $elem } )
                        ? ( $match{ $elem } = undef )
                        : ( $nomatch{ $elem } = undef );
                }
                say "match";
                say join(' ', keys %match);
                say "nomatch";
                say join(' ', keys %nomatch);
            } else {
                say "no more lines";
            }

            __DATA__
            1 2 6 4 5
            6 7 8 9 10
            1 2 11 12 13
            6 14 15 16 17
            2 18 19 20 21

        Output: $skip_duplicates = 0

            data elements
            1 2 6 4 5 7 8 9 10 11
            match
            6
            nomatch
            14 15 16 17

        Output: $skip_duplicates = 1

            data elements
            1 2 6 4 5 7 8 9 10 11 12 13 14
            match
            2
            nomatch
            18 19 20 21
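
Where installing Hash::Ordered is not an option, the same insertion-order bookkeeping can be approximated in core Perl with a plain hash for the "seen" test and an array for the order. The sketch below covers only the case where duplicates do not count towards the total; the name collect_and_compare and its return values are made up for illustration, and it is not a drop-in replacement for the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: read values from @$lines until $want distinct values
# have been collected, then classify the values on the next complete line.
sub collect_and_compare {
    my ($lines, $want) = @_;
    my @rows = @$lines;      # work on a copy so the caller's array is untouched
    my (%seen, @unique);     # %seen answers "already collected?", @unique keeps order

    while (@rows && @unique < $want) {
        for my $value (split ' ', shift @rows) {
            next if $seen{$value}++;    # a duplicate does not count
            push @unique, $value;
            last if @unique == $want;   # rest of a partly read line is dropped
        }
    }

    my @next    = @rows ? split(' ', shift @rows) : ();
    my @match   = grep {  $seen{$_} } @next;
    my @nomatch = grep { !$seen{$_} } @next;
    return (\@unique, \@match, \@nomatch);
}

my @sample = ('1 2 6 4 5', '6 7 8 9 10', '1 2 11 12 13',
              '6 14 15 16 17', '2 18 19 20 21');
my ($unique, $match, $nomatch) = collect_and_compare(\@sample, 13);
print "data elements: @$unique\n";
print "match: @$match\n";
print "nomatch: @$nomatch\n";
```

With the sample data and 13 values requested, this collects the same elements and reports the same match and nomatch values as the $skip_duplicates = 1 output above.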

        Regards, Mario