perldummie has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks! I am extremely new to Perl and need to write a script to automate the following. This is my input:
Column1 Column2 Column3
2870 S11 1
574 S11 2
317 S11 3
31 S11 4
1 S11 6
1 S11 7
2925 S12 1
8 S12 5
1 S12 6
1 S12 9
I am trying to write a perl script that prints out the lowest and highest values of Column3. But since S12 has value 1 appear more often than S11 in column3 based on the value in column1, I need the script to give me as a result the line of S12 as the lowest column3 value (2925 S12 1 ). Is there an easy way to do this? Please keep the script as simple as possible so that I can actually follow what you are doing and learn from it. Thanks for your help!

Replies are listed 'Best First'.
Re: max and min values in 3-column file
by ikegami (Patriarch) on Feb 06, 2011 at 02:38 UTC

    I need the script to give me as a result the line of S12 as the lowest column3 value

    I presume you actually meant you want the later line in the event of ties. (Upd: No, I don't think that's what you meant. I don't understand what you do want, so I can't offer another solution. )

    $_ = <>;
    my @fields = split;
    my $lowest = my $highest = $fields[2];
    while (<>) {
        my @fields = split;
        if ($fields[2] <= $lowest) {
            $lowest = $fields[2];
        }
        if ($fields[2] >= $highest) {
            $highest = $fields[2];
        }
    }
    print("$lowest\n");
    print("$highest\n");
Re: max and min values in 3-column file
by jdporter (Paladin) on Feb 06, 2011 at 04:17 UTC
    since S12 has value 1 appear more often than S11 in column3 based on the value in column1

    How do you get that? As far as I can tell, in your example data, 1 occurs exactly once in column 3 for both S11 and S12, and exactly twice in column 1 for both S11 and S12. So what is the discriminator really? And what did you mean by "based on"? That's exceedingly vague.

    What is the sound of Windows? Is it not the sound of a wall upon which people have smashed their heads... all the way through?
Re: max and min values in 3-column file
by BrowserUk (Patriarch) on Feb 06, 2011 at 06:45 UTC

    My best guess as to your meaning is that column 1 is the frequency of the value in column 3.

    Here's one way that might be coded:

    #! perl -slw
    use strict;

    my @data;
    push @data, [ split ] while <DATA>;

    @data = sort {
        $a->[ 0 ] * $a->[ 2 ] <=> $b->[ 0 ] * $b->[ 2 ]
    } @data;

    print "Lowest: ". join "\t", @{ $data[ 0 ] };
    print "Highest:", join "\t", @{ $data[-1 ] };

    __DATA__
    2870 S11 1
    574 S11 2
    317 S11 3
    31 S11 4
    1 S11 6
    1 S11 7
    2925 S12 1
    8 S12 5
    1 S12 6
    1 S12 9

    And the output is:

    c:\test>junk37
    Lowest: 1 S11 6
    Highest:2925 S12 1

    As both groups of data have 1 x 6 values, either (or both) are candidates for the 'lowest', so you'll have to decide which through some further qualification to the sort: e.g., you might consider that in the event of a tie S11 is 'lower' than S12. Or, you might decide to display both (all) equally low values.
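The tie-break idea above can be sketched as a secondary sort key. This is only an illustration: the rule "S11 sorts before S12 on a tie" is an assumption, and the helper name by_weight_then_group is made up for this example rather than taken from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows from the thread: [ frequency, group, value ].
my @data = map { [ split ] } (
    '2870 S11 1', '574 S11 2', '317 S11 3', '31 S11 4', '1 S11 6',
    '1 S11 7',    '2925 S12 1', '8 S12 5',  '1 S12 6',  '1 S12 9',
);

# Compare by frequency * value first; if that is a tie, fall back to
# the group name (assumed tie-break rule: S11 sorts before S12).
sub by_weight_then_group {
    my ( $x, $y ) = @_;
    return $x->[0] * $x->[2] <=> $y->[0] * $y->[2]
        || $x->[1] cmp $y->[1];
}

my @sorted = sort { by_weight_then_group( $a, $b ) } @data;

print join( "\t", 'Lowest:',  @{ $sorted[0] } ),  "\n";
print join( "\t", 'Highest:', @{ $sorted[-1] } ), "\n";
```

With the sample data, the two rows whose product is 6 end up next to each other at the front of the list, with the S11 row first. Swapping the operands of cmp would prefer S12 instead.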

    Of course, if the set of values is large, especially if it is larger than memory, sorting is an inefficient way of finding the largest and smallest. The main point is that (assuming I understood you correctly) you need to multiply column 1 by column 3 when doing your comparisons.
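For data too large to sort in memory, a single pass that remembers only the current lowest and highest rows is one alternative. The following is a minimal sketch under the same guess (compare column 1 times column 3); the sub name weighted_extremes and the rule that the first row seen wins a tie are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the rows with the smallest and largest frequency * value,
# reading one line at a time from the given filehandle.
sub weighted_extremes {
    my ($fh) = @_;
    my ( $lowest, $highest );
    while ( my $line = <$fh> ) {
        my @fields = split ' ', $line;

        # Skip the header, blank lines and anything else that is not
        # "number text number".
        next unless @fields == 3
            && $fields[0] =~ /^\d+$/
            && $fields[2] =~ /^\d+$/;

        my $weight = $fields[0] * $fields[2];

        # Strict comparisons keep the first row seen when there is a tie.
        $lowest  = [ $weight, @fields ] if !$lowest  || $weight < $lowest->[0];
        $highest = [ $weight, @fields ] if !$highest || $weight > $highest->[0];
    }
    return ( $lowest, $highest );
}

# Demo on the thread's sample data, read through an in-memory filehandle.
my $sample = <<'END';
Column1 Column2 Column3
2870 S11 1
574 S11 2
317 S11 3
31 S11 4
1 S11 6
1 S11 7
2925 S12 1
8 S12 5
1 S12 6
1 S12 9
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my ( $low, $high ) = weighted_extremes($fh);
close $fh;

# Element 0 of each result is the computed weight; 1..3 are the columns.
print join( "\t", 'Lowest:',  @{$low}[ 1 .. 3 ] ),  "\n";
print join( "\t", 'Highest:', @{$high}[ 1 .. 3 ] ), "\n";
```

Only two rows are held in memory at any time, so the size of the input does not matter. Changing `<` to `<=` would make the last tied row win instead of the first.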

    Feel free to ask questions about this code. It is easier to answer specific questions than guess what you might not know.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: max and min values in 3-column file
by jwkrahn (Abbot) on Feb 06, 2011 at 07:31 UTC

    Another way to do this by reading the file a line at a time:

    $ echo "Column1 Column2 Column3
    2870 S11 1
    574 S11 2
    317 S11 3
    31 S11 4
    1 S11 6
    1 S11 7
    2925 S12 1
    8 S12 5
    1 S12 6
    1 S12 9
    " | perl -e'
    use warnings;
    use strict;

    my ( @lowest, @highest );

    while ( <> ) {
        next if /^\D/;
        my @fields = split;
        unless ( @lowest ) {
            @lowest = @highest = @fields;
            next;
        }
        if ( $lowest[ -1 ] >= $fields[ -1 ] && $lowest[ 0 ] < $fields[ 0 ] ) {
            @lowest = @fields;
        }
        if ( $highest[ -1 ] < $fields[ -1 ] ) {
            @highest = @fields;
        }
    }

    print join( "\t", "Lowest:", @lowest ), "\n";
    print join( "\t", "Highest:", @highest ), "\n";
    '
    Lowest: 2925 S12 1
    Highest: 1 S12 9
      Thanks for all of your replies. Yes, column 1 shows the frequency of the value in column 3. Could you please show me how to load my .txt file into your script, so that I can actually try it? I tried to do it the way I did with my last script, but somehow it doesn't work. And as mentioned, I am such a beginner in Perl that I don't yet know any other ways to load .txt files... Thanks so much to all of you for trying to help me!
        If you see a loop like
        while (<>){ ... }
        then it means that the script reads from the files named on the command line or, if none are given, from STDIN (standard input). You now need to find out how to compose a command line that passes your file to the Perl script, which is a nice little exercise... :)
        One of the solutions posted here used a DATA block at the end of the script: the __DATA__ marker tells the Perl interpreter that the rest of the file is data rather than Perl code, and the loop reads from that block line by line (through the DATA filehandle) instead of from STDIN.
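A third option is to open the file by name inside the script. The sketch below is only an illustration: read_rows is a made-up helper, and a temporary file stands in for your own text file so that the example runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whitespace-separated file and return one array reference per
# data line. Lines that do not start with a digit (the header, blank
# lines) are skipped.
sub read_rows {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    my @rows;
    while ( my $line = <$fh> ) {
        next if $line =~ /^\D/;
        push @rows, [ split ' ', $line ];
    }
    close $fh;
    return @rows;
}

# Demo only: create a small temporary file in place of a real input file.
my ( $tmp_fh, $tmp_path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "Column1 Column2 Column3\n2870 S11 1\n2925 S12 1\n";
close $tmp_fh;

my @rows = read_rows($tmp_path);
print join( "\t", @$_ ), "\n" for @rows;
```

In a real script you would pass your own file name to read_rows, for example one taken from @ARGV. For the scripts above that loop over `<>`, no change is needed: running them as `perl script.pl input.txt` or `perl script.pl < input.txt` should both work (script.pl and input.txt are placeholder names).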