From the responses, I see that I did not explain the problem clearly. Let me clarify: I need to sort by descending amplitude, save the largest entry, remove any entries within +/- 1 second of it, then repeat with the next largest entry left in the list, and so on.
- honyok
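
A minimal sketch of the procedure described above, assuming each record has already been reduced to a time in seconds and an amplitude. The sample data and the name keep_largest are hypothetical, not taken from any script in this thread.

```perl
use strict;
use warnings;

# Hypothetical records: [time in seconds, amplitude]. The real files
# would need their time stamps parsed into seconds first.
my @events = (
    [10.0, 3.0], [10.4, 5.0], [11.2, 4.0], [12.5, 2.0], [20.0, 1.0],
);

# Greedy filter: visit events from largest to smallest amplitude and
# keep one only if no already-kept event lies within $window seconds.
sub keep_largest {
    my ($window, @records) = @_;
    my @kept;
    for my $rec (sort { $b->[1] <=> $a->[1] } @records) {
        next if grep { abs($_->[0] - $rec->[0]) <= $window } @kept;
        push @kept, $rec;
    }
    return @kept;
}

my @kept = keep_largest(1, @events);
printf "%.1f s  amplitude %.1f\n", @$_ for @kept;
```

Because each candidate is compared against the entries already kept rather than against fixed time slots, two peaks closer together than the window are never both retained. Checking every kept entry makes this quadratic in the worst case, which should be acceptable for a few thousand data points.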

Re^3: Compare fields in a file
by GrandFather (Saint) on Feb 10, 2009 at 23:10 UTC

    How about you take one of the plethora of solutions you have been given for "I'd like to keep only the largest magnitude within each second" and alter it to solve your actual problem? If you can't make it work, show us the output you get and the output you want.

    For future reference, providing a little sample data, your best attempt at coding the solution, your attempts' output, and the output you desire in your initial node actually saves everyone (especially you) a lot of time. An indication of why you want to perform a particular trick often helps us provide a better answer too.


    Perl's payment curve coincides with its learning curve.
Re^3: Compare fields in a file
by CountZero (Bishop) on Feb 10, 2009 at 22:24 UTC
    Easy, in my script, add
    sort { $b->[2] <=> $a->[2] }
    between the first map and the grep.

    The output will now be sorted by descending amplitude and you will have only one entry per second.
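
The script being modified is not reproduced in this thread, so the following is only a sketch of what such a map/sort/grep chain might look like. The input lines, the field positions, and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical input lines: x,y,z,date-time,magnitude
my @lines = (
    '1,2,3,10-02-2009 12:00:01,0.7',
    '1,2,3,10-02-2009 12:00:01,2.5',
    '4,5,6,10-02-2009 12:00:02,1.1',
    '7,8,9,10-02-2009 12:00:02,3.9',
);

my %seen;
my @largest_per_second =
    map  { $_->[0] }                    # back to the original line
    grep { !$seen{ $_->[1] }++ }        # first (= largest) entry per second
    sort { $b->[2] <=> $a->[2] }        # descending amplitude
    map  { my ($date, $magnitude) = (split /,/)[3, 4];
           [ $_, $date, $magnitude ] }  # [line, key, magnitude]
    @lines;

print "$_\n" for @largest_per_second;
```

The chain reads from bottom to top: each line is decorated with its key and magnitude, sorted so the largest magnitudes come first, and then only the first entry seen for each key survives the grep.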

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      Thanks gentlefolk. Great ideas!
      Elegant. I think I see how your code tests for each second,
      $date =~ m/(\d{2})-(\d{2})-(\d{4}) (\d{2}:\d{2}:\d{2})/;
      but my goal is to filter based on any arbitrary time increment (+/-1 s, +/-5 s, +/-300 ms, ...).

      Re GrandFather's comments: The ultimate point of this exercise is to avoid MS Excel monkeying with my time stamps. I have files with thousands of data points, each with many numerical attributes (x, y, z, date, time, magnitude, ...). I would like to adapt a solution to sort and/or filter based on any or all of the attributes, with time being the most difficult. Anything done outside of Excel saves time and aggravation.
      - honyok
        That can certainly be done. All you have to do is replace the last map block with something like:
        map { my ($date, $magnitude) = (split /,/)[3,4]; [$_, normalize($date), $magnitude] }
        Then you add a subroutine normalize which transforms your date-time string into a value (string or numerical) which has the same value for each "slot". For instance:
        sub normalize {
            my $value = shift;
            my ($hours, $minutes, $seconds) = $value =~ m/(\d{2}):(\d{2}):(\d{2}\.\d{3})$/;
            # convert the whole time to milliseconds, then divide into 300 ms slots
            return int(1000 * ($seconds + 60 * $minutes + 3600 * $hours) / 300);
        }
        As we now get a numeric result and not a string, we also have to replace the cmp in the sort block with <=> for the sort to work correctly.

        With a little effort you can further parametrize the subroutine so you can define the "slot" in a more flexible way without having to rewrite the sub every time.
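
One possible way to parametrize the slot, sketched under the assumption that time stamps end in HH:MM:SS.mmm as in the regex above. The name make_normalizer and the sample time stamp are hypothetical.

```perl
use strict;
use warnings;

# Build a normalize() for a given slot width in milliseconds.
# make_normalizer is a hypothetical helper name, not part of the original script.
sub make_normalizer {
    my ($slot_ms) = @_;
    return sub {
        my ($value) = @_;
        my ($h, $m, $s, $ms) = $value =~ m/(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/
            or return;    # unparseable time stamp
        # Integer arithmetic in milliseconds avoids floating-point surprises.
        my $total_ms = (($h * 60 + $m) * 60 + $s) * 1000 + $ms;
        return int($total_ms / $slot_ms);
    };
}

my $by_300ms = make_normalizer(300);
my $by_5s    = make_normalizer(5_000);

print $by_300ms->('10-02-2009 12:00:01.250'), "\n";
print $by_5s->('10-02-2009 12:00:01.250'), "\n";
```

Returning a closure means the slot width is chosen once, and the resulting code reference can be called in the map block in place of normalize($date).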

        CountZero
