dkhalfe has asked for the wisdom of the Perl Monks concerning the following question:

First off, I don't have any code to support this; I am asking for an explanation of how to go about this procedure. I have a tab-delimited file containing the following data:

column  relationship  value  filter_or_append  order
a       <=            0.3    filter            1
b       =             abc    append            3
c       <=            0.3    filter            2

Basically, at the start of the program I need to read in this filter file, sort the rows by the order column from least to greatest, and somehow store each value in a variable (to be used later in the program). The program will then read in an input file, which will be processed based on the criteria defined in the filter file.

Ex.: the program first reads in the filter file. For column a (order #1) the user wants the program to filter that specific column for values less than or equal to 0.3. Then the program moves on to order #2 (column c, same test). Then it moves on to order #3, but for this column (b) it searches for anything that is eq 'abc' and appends it to the output file.

I have written the code to filter through any specified column, but I do not know how to store the values from the filter file in such a way that they can be used in the program. Thank you in advance for any help. If I need to clear anything up, just let me know.

Replies are listed 'Best First'.
Re: Read in a file containing the criteria for a program when it runs
by brx (Pilgrim) on Jul 18, 2012 at 16:40 UTC
    #!/usr/bin/env perl
    use strict;
    use warnings;

    my @filter;
    my $line = <DATA>;
    #open (my $ffile, '<','filterfile') or die "cannot open filterfile: $!";
    #my $line = <$ffile>;
    #while ($line = <$ffile>) {
    while ($line = <DATA>) {
        chomp $line;
        my ($col,$rel,$val,$foa,$ord) = split /\s+/,$line;
        $filter[$ord] = [ $col,$rel,$val,$foa ];
    }
    for my $ord (1 .. 3) {
        print "$ord:\n",
            "\tcol: "   ,$filter[$ord]->[0],"\n",
            "\trsh: "   ,$filter[$ord]->[1],"\n",
            "\tvalue: " ,$filter[$ord]->[2],"\n",
            "\tf_o_a: " ,$filter[$ord]->[3],"\n\n";
    }
    __DATA__
    column relationship value filter_or_append order
    a <= 0.3 filter 1
    b = abc append 3
    c <= 0.3 filter 2

    Each element of @filter is a reference to an anonymous array which contains values: [...].

    Produces:

    1:
        col: a
        rsh: <=
        value: 0.3
        f_o_a: filter

    2:
        col: c
        rsh: <=
        value: 0.3
        f_o_a: filter

    3:
        col: b
        rsh: =
        value: abc
        f_o_a: append
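Once the rules sit in @filter in order, one possible way to apply them is a dispatch table that maps each relationship string to a comparison sub. This is only a sketch: the %compare table, the passes() helper and the sample row are assumptions made for illustration, not part of the code above, and only the two relationships that appear in the sample filter file are handled.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Rules in the same layout as @filter above:
# [ column, relationship, value, filter_or_append ], indexed by order (index 0 unused).
my @filter = (
    undef,
    [ 'a', '<=', '0.3', 'filter' ],
    [ 'c', '<=', '0.3', 'filter' ],
    [ 'b', '=',  'abc', 'append' ],
);

# Hypothetical dispatch table: relationship string => comparison sub.
my %compare = (
    '<=' => sub { $_[0] <= $_[1] },    # numeric comparison
    '='  => sub { $_[0] eq $_[1] },    # string equality
);

# Hypothetical helper: does one input row (column => value) satisfy one rule?
sub passes {
    my ($row, $rule) = @_;
    my ($col, $rel, $val) = @$rule;
    my $test = $compare{$rel} or die "unknown relationship '$rel'";
    return $test->( $row->{$col}, $val ) ? 1 : 0;
}

my %row = ( a => 0.1, b => 'abc', c => 0.5 );    # made-up sample input row
my @results;
for my $ord (1 .. $#filter) {
    my $rule = $filter[$ord] or next;            # skip gaps in the order sequence
    push @results, passes(\%row, $rule);
    printf "%d: %s %s %s (%s) -> %s\n",
        $ord, @$rule, $results[-1] ? 'match' : 'no match';
}
```

What the program does on a match (drop the row, keep it, append it to the output file) is left out here, since that depends on how filter and append are meant to behave.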

      Awesome!! Quick question: did you mean to comment out the three lines under my $line = <DATA>? If so, why? Thanks again. Much appreciated.

        In this example I use a __DATA__ section and read it with <DATA>. This way, everything is included in one file.

        But you will probably use an external file. To do that, just delete the lines my $line = <DATA>; and while ($line = <DATA>) { (and the __DATA__ section), then uncomment the three lines.
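For reference, the external-file variant could look roughly like the sketch below. The read_filter_file() helper and the temporary file are assumptions made so the example runs on its own; in a real program the path would point at the actual filter file. It splits on tabs because the file is described as tab-delimited.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a filter file and return the rules indexed by order.
sub read_filter_file {
    my ($path) = @_;
    open my $ffile, '<', $path or die "cannot open $path: $!";
    my $header = <$ffile>;                # discard the header line
    my @filter;
    while (my $line = <$ffile>) {
        chomp $line;
        next unless $line =~ /\S/;        # ignore blank lines
        my ($col, $rel, $val, $foa, $ord) = split /\t/, $line;
        $filter[$ord] = [ $col, $rel, $val, $foa ];
    }
    close $ffile;
    return @filter;
}

# For the demo only: write the sample data to a temporary file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} join("\t", @$_), "\n" for
    [qw(column relationship value filter_or_append order)],
    [qw(a <= 0.3 filter 1)],
    [qw(b = abc append 3)],
    [qw(c <= 0.3 filter 2)];
close $tmp;

my @filter = read_filter_file($path);
print "$_: @{ $filter[$_] }\n" for 1 .. $#filter;
```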

      Excellent choice of data structure if $ord is always a small integer and there are few (if any) gaps in the sequence.
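If the order values were large or had big gaps, an array indexed by $ord would be mostly empty. One alternative, sketched here under that assumption, is a hash keyed by order whose keys are sorted numerically when the rules are walked. The order values 10, 200 and 5000 are made up for the illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Made-up rules with large, gappy order values (order is the last field).
my @lines = (
    "a\t<=\t0.3\tfilter\t10",
    "b\t=\tabc\tappend\t5000",
    "c\t<=\t0.3\tfilter\t200",
);

my %filter;    # order => [ column, relationship, value, filter_or_append ]
for my $line (@lines) {
    my ($col, $rel, $val, $foa, $ord) = split /\t/, $line;
    $filter{$ord} = [ $col, $rel, $val, $foa ];
}

# Walk the rules from the lowest order to the highest.
my @sequence = sort { $a <=> $b } keys %filter;
for my $ord (@sequence) {
    print "$ord: @{ $filter{$ord} }\n";
}
```

The numeric sort block matters: a plain sort would compare the keys as strings and put 5000 before 200 only by accident of their first characters.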

Re: Read in a file containing the criteria for a program when it runs
by aitap (Curate) on Jul 18, 2012 at 16:45 UTC
    You may want to use array of arrays (AoA) to parse and store this file. For example,
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper;

    my @filter;
    <>;             # read one line from the file/stdin
    while (<>) {    # read other lines
        chomp;      # remove "\n" from the end of the line
        # create an array of the line by splitting it by <TAB>, make a
        # reference of it and push the reference to the @filter array
        push @filter,[(split /\t/,$_)];
    }
    # sort the array by 5th element of embedded arrays
    @filter = sort { $a->[4] <=> $b->[4] } @filter;
    print Dumper \@filter;
    (EDIT: add chomp)
    Sorry if my advice was wrong.

      This method is also great. I know that I can print any specific chunk of data, e.g. print Dumper \@filter[0] to get the first chunk of data in the array. My question is: can I (and if I can, how can I) pull out the individual elements in the array? Ex.:

      $VAR1 = \[
                  '1000_Genomes',
                  '<=',
                  '0.03',
                  'filter ',
                  '1'
                ];
      is the output I get when I print Dumper \@filter[0]. I need to be able to use each individual element (1000_Genomes, <=, 0.03, etc.) separately throughout my program. Should I store these values in variables or what? Thanks for your assistance!

        You can access elements by their numbers (indices). For example, print $filter[0]->[0],"\n" should print 1000_Genomes.

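        The -> dereferences the element: the contents of the @filter array are not real arrays, but references to arrays. (Between two adjacent subscripts the arrow is optional, so $filter[0][0] means the same thing.)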
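On the question of whether to store the values in variables, two common options are sketched below: unpack a row into named variables at the point of use, or turn each row into a hash so the fields can be accessed by name. The variable names and hash keys are assumptions (the keys are borrowed from the header line of the filter file), and the single sample row uses the values quoted in this thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# One array reference per rule, as produced by the split (sample values).
my @filter = (
    [ '1000_Genomes', '<=', '0.03', 'filter', 1 ],
);

# Option 1: unpack a row into named variables where it is used.
my ($column, $relationship, $value, $action, $order) = @{ $filter[0] };
print "column=$column relationship=$relationship value=$value\n";

# Option 2: convert each row to a hash so fields are accessed by name.
my @names = qw(column relationship value filter_or_append order);
my @rules;
for my $row (@filter) {
    my %rule;
    @rule{@names} = @$row;    # hash slice: pair each name with its value
    push @rules, \%rule;
}
print "$rules[0]{column} $rules[0]{relationship} $rules[0]{value}\n";
```

The array of hashes costs a little more memory but keeps the rest of the program from depending on the column positions.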
