coding1227 has asked for the wisdom of the Perl Monks concerning the following question:

Dear PerlMonks, I have a big data file that I'd like to sort using Perl. Do you guys have any thoughts on how to do this? Below I'm including a sample of what the data looks like. Any ideas/examples would be great... thank you! =)

Original Data:

SATID 02 VAL1 32 VAL2 275 SIGNAL 43 SATID 24 VAL1 10 VAL2 098 SIGNAL 12 SATID 41 VAL1 87 VAL2 180 SIGNAL 15
SATID 24 VAL1 41 VAL2 103 SIGNAL 56 SATID 02 VAL1 41 VAL2 154 SIGNAL 31 SATID 41 VAL1 93 VAL2 124 SIGNAL 21
SATID 41 VAL1 23 VAL2 132 SIGNAL 23 SATID 24 VAL1 32 VAL2 034 SIGNAL 32 SATID 02 VAL1 23 VAL2 145 SIGNAL 31
SATID 41 VAL1 63 VAL2 305 SIGNAL 62 SATID 33 VAL1 31 VAL2 174 SIGNAL 44 SATID 02 VAL1 45 VAL2 205 SIGNAL 34

Desired Output:

SATID 02 VAL1 32 VAL2 275 SIGNAL 43 SATID 24 VAL1 10 VAL2 098 SIGNAL 12 SATID 41 VAL1 87 VAL2 180 SIGNAL 15
SATID 02 VAL1 41 VAL2 154 SIGNAL 31 SATID 24 VAL1 41 VAL2 103 SIGNAL 56 SATID 41 VAL1 93 VAL2 124 SIGNAL 21
SATID 02 VAL1 23 VAL2 145 SIGNAL 31 SATID 24 VAL1 32 VAL2 034 SIGNAL 32 SATID 41 VAL1 23 VAL2 132 SIGNAL 23
SATID 02 VAL1 45 VAL2 205 SIGNAL 34                                     SATID 41 VAL1 63 VAL2 305 SIGNAL 62 SATID 33 VAL1 31 VAL2 174 SIGNAL 44

Replies are listed 'Best First'.
Re: Sorting Data?
by tybalt89 (Monsignor) on Mar 31, 2017 at 02:43 UTC

    Like this? Your "Desired Output" is unclear.

    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1186568
    use strict;
    use warnings;

    while (<DATA>) {
        my %items = reverse /(SATID (\d\d).{27})/g;
        print "@items{ sort keys %items }\n";
    }

    __DATA__
    SATID 02 VAL1 32 VAL2 275 SIGNAL 43 SATID 24 VAL1 10 VAL2 098 SIGNAL 12 SATID 41 VAL1 87 VAL2 180 SIGNAL 15
    SATID 24 VAL1 41 VAL2 103 SIGNAL 56 SATID 02 VAL1 41 VAL2 154 SIGNAL 31 SATID 41 VAL1 93 VAL2 124 SIGNAL 21
    SATID 41 VAL1 23 VAL2 132 SIGNAL 23 SATID 24 VAL1 32 VAL2 034 SIGNAL 32 SATID 02 VAL1 23 VAL2 145 SIGNAL 31
    SATID 41 VAL1 63 VAL2 305 SIGNAL 62 SATID 33 VAL1 31 VAL2 174 SIGNAL 44 SATID 02 VAL1 45 VAL2 205 SIGNAL 34
      Thanks for the starting point. I tried the code, but it doesn't seem to do what I was hoping for. Basically, the desired output consists of lines that have columns of data for each satellite (e.g., SATID). The data for each satellite is contained between "SATID ... ... SIGNAL XX". The data for each new satellite will have a completely new set of rows assigned to it. When a SATID is repeated on the line below, it should be located immediately below the matching SATID found in the previous line.
      In this way, I'm trying to keep the data for each satellite organized. Hope this makes sense =)

        Like this - each SATID has its own column in sorted order.

        #!/usr/bin/perl -l
        # http://perlmonks.org/?node_id=1186568
        use strict;
        use warnings;

        my @secondpass;
        my %keys;
        while (<DATA>) {
            push @secondpass, $_;
            $keys{$_}++ for /SATID (\d\d)/g;
        }
        my @keyorder = sort keys %keys;
        for (@secondpass) {
            my %items = reverse /(SATID (\d\d).{27})/g;
            print join ' ', map { $items{$_} // ' ' x 35 } @keyorder;
        }

        __DATA__
        SATID 02 VAL1 32 VAL2 275 SIGNAL 43 SATID 24 VAL1 10 VAL2 098 SIGNAL 12 SATID 41 VAL1 87 VAL2 180 SIGNAL 15
        SATID 24 VAL1 41 VAL2 103 SIGNAL 56 SATID 02 VAL1 41 VAL2 154 SIGNAL 31 SATID 41 VAL1 93 VAL2 124 SIGNAL 21
        SATID 41 VAL1 23 VAL2 132 SIGNAL 23 SATID 24 VAL1 32 VAL2 034 SIGNAL 32 SATID 02 VAL1 23 VAL2 145 SIGNAL 31
        SATID 41 VAL1 63 VAL2 305 SIGNAL 62 SATID 33 VAL1 31 VAL2 174 SIGNAL 44 SATID 02 VAL1 45 VAL2 205 SIGNAL 34
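
Both scripts above build their per-line lookup with `my %items = reverse /(SATID (\d\d).{27})/g;`. As a small illustration of what that idiom does, here is a sketch that spells out the intermediate steps on a single line. The sample line is taken from the posted data; the names `$line`, `@pairs`, and `@sorted` are made up for this example, and the pattern assumes every record is exactly 35 characters wide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One sample line: three fixed-width 35-character records separated by spaces.
my $line = 'SATID 24 VAL1 41 VAL2 103 SIGNAL 56 '
         . 'SATID 02 VAL1 41 VAL2 154 SIGNAL 31 '
         . 'SATID 41 VAL1 93 VAL2 124 SIGNAL 21';

# In list context, a /g match with two capture groups returns
# (record, id, record, id, ...).
my @pairs = $line =~ /(SATID (\d\d).{27})/g;

# reverse flips the whole list into (id, record, id, record, ...),
# which is the key/value order a hash assignment expects.
my %items = reverse @pairs;

# A hash slice over the sorted keys yields the records in SATID order.
my @sorted = @items{ sort keys %items };
print "$_\n" for @sorted;
```

Because `reverse` flips the whole list, the last record on the line is assigned first. If a SATID were to occur twice on one line, the earlier record would overwrite the later one.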
Re: Sorting Data?
by salva (Canon) on Mar 31, 2017 at 07:09 UTC
    Most operating systems provide a sort utility that should be able to do that faster than perl (at least, faster if you parametrize it correctly to use the available RAM and CPU).
Re: Sorting Data?
by LanX (Saint) on Mar 31, 2017 at 02:24 UTC
Re: Sorting Data?
by madtoperl (Hermit) on Mar 31, 2017 at 10:48 UTC
Re: Sorting Data?
by 1nickt (Canon) on Mar 31, 2017 at 02:38 UTC

    (the posting page is reformatting this incorrectly for me)

    Hi, please edit your post and place your data inside <code></code> tags, so we can at least see where the lines end!


    The way forward always starts with a minimal test.
      Thanks for the suggestion =) I just modified the original post so that you can see the desired output structure.

        Why is there a SATID 33 after a SATID 41?
        What exactly are you sorting by?
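
One possible reading of the desired output is that the columns follow the order in which each SATID first appears in the file (02, 24, 41, then 33) rather than sorted order. The following is a minimal sketch under that assumption, not a confirmed answer to the question above. It reuses the fixed-width record pattern from the earlier replies; the names `@lines`, `@order`, `%seen`, and `@rows` are made up for this example, and the data is held in an array instead of being read from a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: every record is exactly 35 characters wide, as in the sample data.
my @lines = (
    'SATID 02 VAL1 32 VAL2 275 SIGNAL 43 SATID 24 VAL1 10 VAL2 098 SIGNAL 12 SATID 41 VAL1 87 VAL2 180 SIGNAL 15',
    'SATID 24 VAL1 41 VAL2 103 SIGNAL 56 SATID 02 VAL1 41 VAL2 154 SIGNAL 31 SATID 41 VAL1 93 VAL2 124 SIGNAL 21',
    'SATID 41 VAL1 23 VAL2 132 SIGNAL 23 SATID 24 VAL1 32 VAL2 034 SIGNAL 32 SATID 02 VAL1 23 VAL2 145 SIGNAL 31',
    'SATID 41 VAL1 63 VAL2 305 SIGNAL 62 SATID 33 VAL1 31 VAL2 174 SIGNAL 44 SATID 02 VAL1 45 VAL2 205 SIGNAL 34',
);

# First pass: remember each SATID in order of first appearance.
my ( @order, %seen );
for my $line (@lines) {
    for my $id ( $line =~ /SATID (\d\d)/g ) {
        push @order, $id unless $seen{$id}++;
    }
}

# Second pass: place each record in its SATID's column, padding the gaps.
my @rows;
for my $line (@lines) {
    my %items = reverse( $line =~ /(SATID (\d\d).{27})/g );
    my $row = join ' ', map { $items{$_} // ' ' x 35 } @order;
    $row =~ s/\s+$//;    # drop trailing padding from empty last columns
    push @rows, $row;
}
print "$_\n" for @rows;
```

With the sample data, SATID 33 ends up in a fourth column after SATID 41, and the last row leaves the SATID 24 column blank. Replacing the first pass with `sort keys %seen` would give the sorted column order instead.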