Bonjour Monks.

I am working with some files containing a header followed by some columns of actual data, usually 7 or so data columns in total. I'm interested in finding out 2 things about this data:

- the highest and lowest data values in columns 2 and 3

- the sequential step, i.e. the amount the value changes by between these data values (which is always constant).

Here's an excerpt of the data:

#CLIENT NAME
#PROJECT NAME
#TYPE
#UNIT
#FORMAT
#DATE
#FURTHER INFO
#AS NECESSARY
#CAN BE ENTERED HERE
#CREATED BY
#
ZZ 3961 4081 0 1520 9876543 123456
ZZ 3961 4081 64 1520 9876543 123456
ZZ 3961 4081 128 1520 9876543 123456
ZZ 3961 4081 192 1520 9876543 123456
ZZ 3961 4081 256 1520 9876543 123456
ZZ 3981 4121 320 1550 9876543 123456
ZZ 3981 4121 384 1619 9876543 123456
ZZ 3981 4121 448 1769 9876543 123456
ZZ 3981 4121 512 1964 9876543 123456
ZZ 3981 4121 576 2201 9876543 123456
ZZ 3981 4121 640 2424 9876543 123456
ZZ 3981 4121 704 2639 9876543 123456
ZZ 3981 4121 768 2859 9876543 123456
ZZ 4001 4161 832 3033 9876543 123456
ZZ 4001 4161 896 3045 9876543 123456
ZZ 4001 4161 960 2909 9876543 123456
ZZ 4001 4161 1024 2732 9876543 123456
ZZ 4001 4161 1088 2654 9876543 123456
ZZ 4001 4161 1152 2657 9876543 123456
ZZ 4001 4161 1216 2655 9876543 123456

In this example I would want the results to show the following:

For File ABC.dat

The Max Value Column 2 = 4001

The Min Value Column 2 = 3961

The Min Value Column 3 = 4081

The Max Value Column 3 = 4161

The Step in Column 2 = 20

The Step in Column 3 = 40

Can this be done?!

As of now I am using a very convoluted and, I think, inefficient method. I am creating a new file with the header removed, then sorting the data on a "per column" basis and exporting this to another new file, then printing the first and last lines of this newest file to show the lowest and highest numbers for that column, e.g.

awk '{print $2}' ABC_noheader.dat | sort -g > Column2.dat

followed by

awk 'NR==1;END{print}' Column2.dat

Not ideal, but it does the trick eventually... though I'm sure you will agree that there has to be a better way to do this. At the moment I'm just too dumb to know how!

Cheers. VDB V


In reply to Find highest and lowest numerical values for columns in a file by vdb
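
One possible approach, sketched here with the same awk already used above rather than as a definitive answer: read the file once, skip the header, and track the minimum, maximum, and step for columns 2 and 3 as the rows go by, with no temporary files. The function name colstats is made up for illustration. The sketch assumes that header lines start with "#", that fields are whitespace-separated, and that the step is the smallest non-zero change between consecutive rows (which matches the sample, where the value repeats for several rows before changing).

```shell
# colstats FILE: one-pass min, max and step for columns 2 and 3.
# Hypothetical helper name; assumes header lines begin with "#".
colstats() {
    awk -v name="$1" '
        # Skip header lines and any line too short to hold column 3.
        /^#/ || NF < 3 { next }
        {
            for (c = 2; c <= 3; c++) {
                v = $c + 0                      # force numeric comparison
                if (!(c in min) || v < min[c]) min[c] = v
                if (!(c in max) || v > max[c]) max[c] = v
                # Step: smallest non-zero absolute change between rows.
                if ((c in prev) && v != prev[c]) {
                    d = v - prev[c]
                    if (d < 0) d = -d
                    if (!(c in step) || d < step[c]) step[c] = d
                }
                prev[c] = v
            }
        }
        END {
            printf "For File %s\n", name
            for (c = 2; c <= 3; c++) {
                if (!(c in step)) step[c] = 0   # value never changed
                printf "The Min Value Column %d = %s\n", c, min[c]
                printf "The Max Value Column %d = %s\n", c, max[c]
                printf "The Step in Column %d = %s\n", c, step[c]
            }
        }
    ' "$1"
}
```

It would be called as colstats ABC.dat on the original file, header included. Because nothing is sorted, the step is taken from the order in which the rows appear; if the values in a column were not monotonic, sorting that column first (as in the original method) would still be needed for the step to be meaningful.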
