In the original posting of the problem, Corion was trying to tell the original author (who I am assuming was you, posting anon by mistake) to read this excellent post: "I know what I mean. Why don't you?", which is about the importance of posting a *self-contained* example of the problem. I appreciate that you provided a more detailed explanation of the problem this time; that extra info is what allowed me to give your problem any attention at all. However, if you had provided the *data* that the code is intended to process, I could have finished this in 1/3 of the time. If you had further provided the data as part of the Perl code, so that other monks could just copy your code into a file and run `perl pm_588610.pl`, then ten other monks would have solved your problem and posted answers twice this good over 30 hours ago!

I have solved your problem, but the following advice is *much* more important than that solution. Place these lines at the top of all your Perl programs (or right below the #!/usr/bin/perl line, if you have one):

use strict; use warnings; #use Data::Dumper; $Data::Dumper::Useqq = 1; $| = 1; #print Dumper( \%some_var );

Now, to your problem; there are several design bugs that need addressing. Everything below is based on my *guess* as to your data format. If I guessed wrong, then much of my example code and re-written code is wrong, but the info on the design bugs is probably still valid.

  1. In this code:
    my @tmp; if ($match){ push @tmp, @line[@sorted_cols]; print "@tmp\n"; } push @arraynew,[@tmp];
    you are pushing an empty anon array onto @arraynew when the the match has failed! To fix, move the last line up into the if block:
    if ($match){ my @tmp = @line[@sorted_cols]; print "@tmp\n"; push @arraynew, [@tmp]; }
  2. Each time you loop through a condition, your code expects @{$arrayref} to contain scalar text lines with embedded tabs; the first time through, @{$arrayref} contains the expected data, but then you push array references onto @arraynew, and make $arrayref a reference to @arraynew at the end of each loop, making your data invalid for all future passes. Further, even if your code was expecting sometimes-string-sometimes-array, you are changing the order of the fields, so the wrong fields will be tested against on the next loop, and the final field order would depend on how many conditions were looped through! To fix, push the original line instead, and do the field reshuffling after you are done filtering the lines. Change this (already changed to this state in the step above):
    if ($match){ my @tmp = @line[@sorted_cols]; print "@tmp\n"; push @arraynew,[@tmp]; }
    to this:
    if ($match){ push @arraynew, $_; print "$_\n"; }
    , and at the bottom, add:
    my @final_result = map { [ (split "\t")[@sorted_cols] ] } @{ $arrayref };
  3. You declare my @arraynew *before* the start of your main loop, then alias $arrayref to it. Since $arrayref is the inner loop array, and you append to @arraynew inside the inner loop, you will be constantly appending a new line to the inner loop array. Your program will run forever, unless you have only a single condition in %conwithposition, or you have no data lines that pass the first condition. To fix this, just move the line down; Change this:
    my @arraynew; foreach my $key_pos ( keys %conwithposition ) {
    to this:
    foreach my $key_pos ( keys %conwithposition ) { my @arraynew;
  4. This *may* not be a bug, but your code ref $inputval_ref is checking for *scalarref* vs arrayref. While that is a legitimate design, it is uncommon; more common would be *scalar* vs arrayref. It is counter-intuitive, but ref() of a scalar value is *not* 'SCALAR'.
    $ perl -e 'use Data::Dumper; my $x = "foo"; my $x_ref = \$x; print Dum +per ref($x), ref($x_ref);' $VAR1 = ''; $VAR2 = 'SCALAR';
    If you are using scalars, and not scalarrefs, in %conwithposition, you can fix the problem by changing this:
    if (ref $inputval_ref eq 'SCALAR') { return 1 if ($valuesrc eq $$inputval_ref);
    to this:
    if (ref $inputval_ref eq '') { return 1 if ($valuesrc eq $inputval_ref);
  5. As tinita noted in the original thread, this is quite hard to read:
    my ($op, $arg1, $arg2) = (@{$conwithposition{$key_pos}}[1],@{$conwithp +osition{$key_pos}}[2],$line[@{$conwithposition{$key_pos}}[0]]);
    If we take her her excellent code and refactor it further, we can write it as:
    my ( $arg2_pos, $op, $arg1 ) = @{ $conwithposition{$key_pos} }; my $arg2 = $line[ $arg2_pos ];
    Since $conwithposition{$key_pos} does not vary within the inner loop, we can hoist the first line up one level, improving clarity and efficiency.

Below are two working, tested programs (Well, working with *my* data). The first is your original code, with bugs fixed as I described above, my mock data added, and undeclared vars declared with my(). The second is my own code, written the way I think of the problem. Since your outermost goal is to filter @array down to the lines meeting some criterion, I used the Perl function grep, which is intended for exactly that filtering task. My code also takes advantage of the newfound roominess in is_match() to implement the missing $operator functionality.

#!/usr/bin/perl use strict; use warnings; use Data::Dumper; $Data::Dumper::Useqq = 1; $| = 1; my @sorted_cols = ( 2, 0, 1, 3 ); my %conwithposition = ( TITLE => [ 0, 'eq', [ 'Mr.', 'Mrs.', 'Ms.' ] ], HANDLE => [ 2, 'eq', 'BeginR' ], ); # TITLE, FIRST_NAME, HANDLE, LAST_NAME my @lines = ( "Mr.\tNice\tBeginR\tPerson", "Senator\tNice\tBeginR\tPerson", "Mr.\tBruce\tUtil\tGray", ); my @array = @lines; my $arrayref =\@array; foreach my $key_pos(keys(%conwithposition)) { my @arraynew; #print "KEYLOOP::(@{$conwithposition{$key_pos}}[1],@{$conwithposit +ion{$key_pos}}[2])\n";#,$line[@{$conwithposition{$key_pos}}[0]])\n"; my $i=0; #print "ARR:@arraynew\n"; file:for (@$arrayref) { my $match = 0; my @line = split /\t/; #print "Line:::@$arrayref\n"; my ($op, $arg1, $arg2) = (@{$conwithposition{$key_pos}}[1],@{$ +conwithposition{$key_pos}}[2],$line[@{$conwithposition{$key_pos}}[0]] +); $match = is_match($op,$arg2, $arg1); if ($match){ push @arraynew, $_; } $i++; } $arrayref = \@arraynew; } my @final_result = map { [ (split "\t")[@sorted_cols] ] } @{ $arrayre +f }; print Dumper \@final_result; sub is_match{ my ($operator,$valuesrc,$inputval_ref)=@_; my $i; if (ref $inputval_ref eq ''){ return 1 if ($valuesrc eq $inputval_ref); } if (ref ($inputval_ref) eq 'ARRAY'){ my @valuetomatch=@$inputval_ref; for $i(0 .. $#valuetomatch){ return 1 if ($valuesrc eq $valuetomatch[$i]); } } return 0; }

#!/usr/bin/perl use strict; use warnings; use Data::Dumper; $Data::Dumper::Useqq = 1; $| = 1; my @sorted_cols = ( 2, 0, 1, 3 ); my %conwithposition = ( TITLE => [ 0, 'ne', [ 'Lord', 'Senator' ] ], HANDLE => [ 2, 'eq', [ 'BeginR' ] ], ); # TITLE, FIRST_NAME, HANDLE, LAST_NAME my @lines = ( "Mr.\tNice\tBeginR\tPerson", "Senator\tNice\tBeginR\tPerson", "Mr.\tBruce\tUtil\tGray", ); # The %conwithposition hash does not *have* to be a hash, # and would be easier to work with if it was an array. # The "values to match" variable can be a scalar # or an arrayref; not only to we have to determine which # it is and handle it differently every time it is used, # we also cannot directly match against an array; # it would be better in hash form. # Transform from a HoA # where the type of last element varies between scalar and arrayref # $field_name => [ $field_position, $operator, $value or \@values ] # to an AoA # where the last element is always a hash. # [ $field_name, $field_position, $operator, \%values_to_match ] my @conditions_with_position = map { my $field_name = $_; my ( $field_position, $operator, $value_to_match ) = @{ $conwithposition{$_} }; my $type = ref $value_to_match; my @values = ( $type eq '' ) ? ( $value_to_match ) : ( $type eq 'SCALAR' ) ? ( ${ $value_to_match } ) : ( $type eq 'ARRAY' ) ? ( @{ $value_to_match } ) : die "Unexpected type '$type'" ; my %val_hash = map { $_ => 1 } @values; [ $field_name, $field_position, $operator, \%val_hash ]; } keys %conwithposition; my @checked_lines_AoA = map { [ @{$_}[@sorted_cols] ] } grep { matches_all_fields( $_, \@conditions_with_position ) } map { [ split "\t", $_ ] } @lines; print Dumper \@checked_lines_AoA; sub matches_all_fields { my ( $line_aref, $conditions_aref ) = @_; foreach my $condition( @{ $conditions_aref } ) { my ($field_name, $field_pos, $op, $vals_href) = @{$condition}; my $field = $line_aref->[$field_pos]; return if not matches_one_field( $field, $op, $vals_href ); } return 1; } sub matches_one_field { my ( $line_field, $operator, $vals_to_match_href ) = @_; my $match = $vals_to_match_href->{$line_field}; my $op_match = ( $operator eq 'eq' ) ? $match : ( $operator eq 'ne' ) ? !$match : die "Unknown op '$operator'" ; return $op_match; }

If my guess as to your data is so far off that it makes my code useless to you, please run this code in your program:

print Dumper \@sorted_cols, \%conwithposition, \@array;
, trim the data down to the essentials, and post it. I will make a quick pass at revising my code.


In reply to Re: Handling Two-Dim Arrays by Util
in thread Handling Two-Dim Arrays by beginr

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.