in reply to schwartzian transform problem - Solved

An alternative to the ST (Schwartzian Transform) is a GRT (Guttman Rosler Transform).

In a do block, set the input record separator $/ to the empty string (paragraph mode) and read from a filehandle (in this script a HEREDOC). As the data contains no blank lines, it all arrives as one string rather than line by line. Split that string into records at points that are not at the start of the string (a guard against an empty first record) and that are followed by the ">>>" which starts each record.

Each record passes into a map where the digits preceding the % sign are captured and packed as a 32-bit network-order value. A bitwise NOT (~) is applied to the packed value, as we want descending numerical order, and the result is concatenated with the whole record packed as a string. These strings are then passed to a simple lexical sort and on into a second map, which recovers the record by skipping the first four bytes, i.e. the number used to sort. The script ...

use strict;
use warnings;

open my $fh, q{<}, \ <<__EOF__ or die qq{open: < HEREDOC: $!\n};
>>> prd1703
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 687G 3.2T 18% /workspace/data
>>> prd1701
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 887G 3.0T 13% /workspace/data
>>> prd1702
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 746G 3.1T 23% /workspace/data
__EOF__

print for
    map  { unpack q{x4a*}, $_ }
    sort
    map  { m{(\d+)(?=%)} && ( ~ pack( q{N}, $1 ) . pack( q{a*}, $_ ) ) }
    do   {
        local $/ = q{};
        split m{(?<!\A)(?=>>>)}, <$fh>;
    };

close $fh or die qq{close: < HEREDOC: $!\n};

The output ...

>>> prd1702
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 746G 3.1T 23% /workspace/data
>>> prd1703
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 687G 3.2T 18% /workspace/data
>>> prd1701
Filesystem Size Used Avail Use% Mounted on
/workspace 3.9T 887G 3.0T 13% /workspace/data
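
To show why the complemented key gives descending order, here is a minimal sketch using only the three percentages from the data above. The variable names (@pct, @keys, @sorted, $packed) are illustrative and not part of the script, and it assumes the default string behaviour of ~ (no "use feature 'bitwise'").

```perl
use strict;
use warnings;

# Percentages taken from the sample records above.
my @pct = ( 18, 13, 23 );

# pack 'N' yields a fixed-width, 4-byte big-endian string, so comparing
# the strings byte by byte agrees with comparing the numbers.
# Applying ~ to a string flips every bit, which reverses that order.
my @keys = map { ~ pack( q{N}, $_ ) } @pct;

# A plain lexical sort on the complemented keys is therefore a
# descending numeric sort; complement again and unpack to get the values.
my @sorted = map { unpack q{N}, ~ $_ } sort @keys;

print qq{@sorted\n};    # 23 18 13

# A key prefixed to a payload is stripped again by skipping four bytes.
my $packed = ( ~ pack( q{N}, 23 ) ) . q{>>> prd1702};
print unpack( q{x4a*}, $packed ), qq{\n};    # >>> prd1702
```

Because the key has a fixed width of four bytes, the sort never has to look at the payload unless two keys are equal, and no comparison sub is needed.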

I hope this is of interest.

Cheers,

JohnGG

Replies are listed 'Best First'.
Re^2: schwartzian transform problem - Solved
by Cristoforo (Curate) on Feb 28, 2025 at 19:14 UTC
    Hi JohnGG, I'm trying to follow your solution using the GRT sort. I wonder why you have m{(\d+)(?=%)} where I might have used m{(\d+)%} without the positive lookahead for '%'?

      There's no difference in the result: $1 is the same either way, and the only things that would differ ($& and the other match variables) aren't used. /(\d+)%/ should be faster, though.
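
As a quick check of that point, the sketch below runs both patterns against one data line from the thread. The variable names are illustrative, and it only shows that $1 is the same while $& differs; it makes no claim about relative speed.

```perl
use strict;
use warnings;

# One data line from the sample records in the thread.
my $line = q{/workspace 3.9T 687G 3.2T 18% /workspace/data};

# With the lookahead the % is asserted but not consumed.
$line =~ m{(\d+)(?=%)} or die qq{no match\n};
my ( $cap_ahead, $whole_ahead ) = ( $1, $& );

# Without it the % becomes part of the overall match.
$line =~ m{(\d+)%} or die qq{no match\n};
my ( $cap_plain, $whole_plain ) = ( $1, $& );

print qq{lookahead: \$1=$cap_ahead \$&=$whole_ahead\n};    # $1=18 $&=18
print qq{plain:     \$1=$cap_plain \$&=$whole_plain\n};    # $1=18 $&=18%
```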